25 Jun 2008 apenwarr   » (Master)

2008-06-24: Weird things about git, #1: premature optimization

<!-- start of entry 200806/24 --> Weird things about git, #1: premature optimization

I've been hanging out on the git mailing list for the last little while, and a few things have been striking me as weird. There are hundreds of people on that list, and lots of them are active contributors. Why? It's just a stupid version control system after all.

But there must be some reason. I've thought of a few. Here's one for today:

The git developers are completely obsessed with performance. Apparently nobody there has ever heard that "premature optimization is the root of all evil." (Incidentally, the ACM has a recent article explaining why that statement is a fallacy.)

It's enlightening just to watch the git developers optimize every possible part of their system: syscall overhead, memory allocation, path trees, file size, network turnarounds, etc. And even while they optimize the heck out of everything, they still complain about how inefficient it all still is. Which it is, of course, if you follow the discussions.

Also interesting is that large parts of git are written in sh and perl and it's still fast, because while they obsess about performance, they also know which parts actually matter to performance. It helps to be a Linux kernel developer sometimes, I guess.

There's no doubt that git's optimization is premature: it started out way faster than svn, and it gets even faster with each release. Is that really necessary? Of course not. Everyone was doing just fine with things like svn. But life is just so much better when programs go unnecessarily fast. It's a very strange sensation; pain you didn't realize you were having just goes away. I've been doing a lot of Windows development lately, and let me tell you, Windows developers have a whole lot of pain that they don't realize they have, because it's not overt. Windows development is mostly fine, really. But what kills it is all the little random delays, the crashes, the giant memory-leaky .Net-based IDEs that make you point-and-click fifty things sequentially once a week because you don't have a way to write a script to do it for you. Just give me a text editor and 'make' any day, thanks.

What was my point again? Oh right, unnecessary optimization. While I've been writing this article, I've been waiting for svn to finish committing 20000 tiny files to a fresh svn repository. It's still not done. (Watching a strace of this is hilarious. Among other things, it's creating an XML file for every single one.)

...

And for comparison, git just did the same thing in 23 seconds. (About half that time was spent running my shell script to create 20000 files.)

Who cares? How often do you create a repo with 20000 tiny files in it anyway? Almost never, of course. It's not an important optimization goal.

...

svn finally finished. See? That wasn't so bad. <!-- end of entry 200806/24 -->

Syndicated 2008-06-25 04:02:54 (Updated 2008-06-25 05:05:41) from apenwarr - Business is Programming

Latest blog entries     Older blog entries

New Advogato Features

FOAF updates: Trust rankings are now exported, making the data available to other users and websites. An external FOAF URI has been added, allowing users to link to an additional FOAF file.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!