Moving a Project From Subversion to Git
I’ve been interested in git ever since I heard that Linus Torvalds was ditching BitKeeper and writing his own replacement to handle version control for the Linux kernel. I’ve taken a look at it a few different times as its profile grew in the Rails/Perl/etc. communities.
Every time, there was something that kept me from adopting it: the state of Windows support. And, after this past week, I think that my focus on the msysGit version was part of the problem. I had been reluctant to use the Cygwin version, mostly because I didn’t want to *have* to install all of Cygwin, and the pile of files that comes with it, on every workstation where I want git support.
The thing is, though, that when I took a good look last week, I already had Cygwin running on every single Windows machine I’ve got, INCLUDING virtual machines. Never one to cling to a supposition after evidence invalidates it, I gave git on Cygwin a shot.
Because I still have lots of Subversion repositories and am likely to continue using Subversion on lots of projects, I insisted on getting the git/Subversion integration working. Also, I use a Dreamhost account to host my repositories and any project-related forums, project tracking, etc., as a sort of private SourceForge/CodePlex/Google Code without any licensing issues to deal with. I wanted to add git support to that existing project space.
It turned out the way most projects of this sort do. The stuff I thought would be a pain (like getting git running on the Dreamhost account) turned out to be documented and straightforward, while stuff like getting the “git svn” command to work (which should just work off the Cygwin install) thwarted me for far too long.
However, after I worked out the kinks, I got it working and pointed it at some of my Subversion repositories on several of my workstations. After cloning those repositories, what I’m most surprised by (even though I was told to expect it) is how git manages to copy ALL of the version history for the entire repository into a single .git directory (instead of hundreds or thousands of .svn directories) and still end up with something SMALLER than the single-version checkout of Subversion for the same source tree.
As an example, one of my repositories had 440 revisions across 10 branches. A Subversion checkout of the current release branch is 2.51GB in 43,655 files across 23,361 folders.
After moving it to git, the checked-out copy, which now includes the ENTIRE 440-revision archive of project history, is 1.51GB in 15,356 files across 2,233 folders.
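A minimal sketch of how numbers like these could be gathered for any working copy. The function name `tree_stats` is hypothetical, and nothing here is confirmed as the way the figures above were measured. It counts everything under the directory, including the .svn or .git metadata, and the directory count includes the top-level directory itself.

```shell
# Print size, file count, and directory count for a tree (default: current directory).
# tree_stats is a hypothetical helper name, not something from git or Subversion.
tree_stats() {
    dir="${1:-.}"
    # Total on-disk size, human readable; .svn or .git metadata is included.
    size=$(du -sh "$dir" | cut -f1)
    # Count regular files and directories; the directory count includes "$dir" itself.
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    dirs=$(find "$dir" -type d | wc -l | tr -d ' ')
    echo "size=$size files=$files dirs=$dirs"
}
```

Running `tree_stats` on the Subversion checkout and again on the git clone gives a side-by-side comparison. For the git side alone, `git count-objects -vH` reports how much space the object store takes.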
As for the hiccups, they boiled down to a few things:
- Make sure to install all of the “git” related packages in Cygwin.
- Install the Subversion libraries for Perl (they’re not anywhere near the “git” packages in the Cygwin installer). This took care of the complaints about the SVN CPAN Perl library.
- Be prepared to have to run “rebaseall” on the Cygwin installation. This took care of a LOT of Cygwin complaints about various DLLs.
- Running “git svn clone http://example.com/repos/” can be agonizingly slow on large repositories. It’s not an issue if you’re working off of a tiny project, but that 1.5GB project took all night and then some. Ongoing work is quite fast, so I’m hoping they work the kinks out of this going forward or that they’ll ditch the tagline “The fast version control system”.
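The first two hiccups can be checked up front before starting a long clone. The sketch below is an assumption about how such a check might look, not a documented procedure: `check_git_svn_prereqs` is a hypothetical name, and it relies on `SVN::Core` being the Perl module that the Subversion bindings install.

```shell
# Report whether the pieces "git svn" depends on are present.
# check_git_svn_prereqs is a hypothetical helper name.
check_git_svn_prereqs() {
    # git itself must be on the PATH.
    if command -v git >/dev/null 2>&1; then
        echo "git: ok"
    else
        echo "git: missing"
    fi
    # The Subversion Perl bindings provide the SVN::Core module.
    if perl -MSVN::Core -e 1 >/dev/null 2>&1; then
        echo "SVN::Core: ok"
    else
        echo "SVN::Core: missing"
    fi
    # "git svn" is a separate subcommand and may come from its own package.
    if git svn --version >/dev/null 2>&1; then
        echo "git svn: ok"
    else
        echo "git svn: missing"
    fi
}
```

Once all three lines report ok, the clone from the list above can be started. Adding `--stdlayout`, as in `git svn clone --stdlayout http://example.com/repos/`, tells git svn to map trunk, branches, and tags, but that assumes the repository follows the conventional layout. The rebaseall step is left out of the script on purpose: it generally has to be run with all other Cygwin processes stopped, so it does not lend itself to being run from inside a normal shell session.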
Given how efficient it is with storage, I’m interested in experimenting with running it in the background on my development directories and having a FileSystemWatcher commit whenever things change. I’d like to see how long it takes before things get out of control.
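A rough sketch of that experiment, with one substitution stated plainly: instead of a .NET FileSystemWatcher reacting to change events, this polls the working tree on a timer. The function names, the commit message format, and the 30-second interval are all arbitrary assumptions.

```shell
# Commit everything in the given working tree if anything changed.
# autocommit_once and autocommit_loop are hypothetical helper names.
autocommit_once() {
    repo="${1:-.}"
    # "git status --porcelain" prints nothing when the tree is clean.
    if [ -n "$(git -C "$repo" status --porcelain)" ]; then
        # Stage additions, modifications, and deletions, then commit them.
        git -C "$repo" add -A
        git -C "$repo" commit -q -m "autocommit $(date '+%Y-%m-%d %H:%M:%S')"
    fi
}

# Poll forever; the 30-second interval is an arbitrary choice.
autocommit_loop() {
    while true; do
        autocommit_once "$1"
        sleep 30
    done
}
```

Starting `autocommit_loop /path/to/project &` in an already initialized repository would leave it running in the background. Polling is cruder than a watcher, since it can fold several saves into one commit, but it keeps the sketch free of platform-specific file notification APIs.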
I’m also hoping that the GUI tools catch up. That’s not so much because I want them for myself (I prefer the command line for SVN as well), but because TortoiseSVN has been key in getting SVN adopted and flushing SourceSafe on several of my projects.




