19 Mar 2004 crhodes   » (Master)

News from the trenches: more speed!

No-one doubts the utility of regression test suites to verify that a given change doesn't cause more problems than it fixes (well, most people don't, but the cmucl team still seems to be in the stone age on this issue). Regression tests, and conformance tests in addition (thank you pfdietz), allow developers to manage the complexity explosion involved in interactions between areas of functionality.

While one of the guiding principles of sbcl is maintainability (considered at least by me as more important than speed), it is nonetheless important to know the performance impact of changes, even when such changes are motivated by correctness issues. As with correctness regressions, the complexity of a large system is such that an apparently innocuous change can have a large effect on the performance characteristics of a priori unrelated areas.

It isn't quite so important to know the performance delta of a change before checking it in, but it is important to be able to find and fix regressions before too long. The problem until now has been that there was no systematic collection of data, so it was quite hard to deal with performance issues objectively.
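To make "find regressions before too long" concrete, here is a minimal sketch in Python of what systematically collected timings allow. Everything in it is an assumption for illustration: the data layout (a chronological list of version/timings pairs), the function name find_regressions, and the 10% threshold are hypothetical and do not describe any real tracking setup.

```python
def find_regressions(timings, threshold=1.10):
    """Flag benchmarks that got slower between consecutive versions.

    timings:   chronological list of (version, {benchmark: seconds});
               this layout is assumed for the sketch.
    threshold: new/old ratio above which a slowdown is reported
               (1.10, i.e. 10%, is an arbitrary choice).
    Returns a list of (benchmark, old_version, new_version, ratio).
    Timings are assumed to be strictly positive.
    """
    regressions = []
    # Walk over each pair of adjacent versions.
    for (old_ver, old), (new_ver, new) in zip(timings, timings[1:]):
        # Only compare benchmarks that were run on both versions.
        for bench in sorted(old.keys() & new.keys()):
            ratio = new[bench] / old[bench]
            if ratio > threshold:
                regressions.append((bench, old_ver, new_ver, ratio))
    return regressions
```

Comparing only adjacent versions keeps the sketch simple; with noisy timings one would probably want repeated runs and a less naive test than a fixed ratio.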

Now, with (other) developers having reasonably fast computers, it is possible to keep track of these things, given a decent benchmark suite. I'm not sure that Eric Marsden's cl-bench suite is “decent”, but it's the best we've got. So thank you to Andreas Fuchs, who is not only tracking CVS HEAD on Eric's benchmarks, but has also collected very interesting historical data (SBCL releases on that latter series of graphs are approximately monthly).

What's doubly interesting about all that historical data is that not only can we identify regressions, as I've indicated above, but we can also use statistical methods to estimate similarities in the codepaths covered by the individual benchmarks. Hierarchical clustering (on which more later, I hope), even on fairly naïve distance measures, reveals some interesting similarities. Graph layout in that plot is courtesy of McCLIM's graph formatting protocol and a couple of home-rolled methods (which I have since improved to give a clearer dendrogram). Cool beans.
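For readers unfamiliar with the technique, here is a rough sketch in Python of hierarchical clustering over per-benchmark timing histories. It is not the code behind the plot: the distance measure (Euclidean distance between mean-normalised series), the single-linkage merge rule, and all names are assumptions chosen to keep the example short.

```python
import math

def normalise(series):
    """Scale a timing series by its mean so only its shape matters."""
    mean = sum(series) / len(series)
    return [t / mean for t in series]

def distance(a, b):
    """Euclidean distance between two normalised series (a naive choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster(timings):
    """Agglomerative single-linkage clustering.

    timings: {benchmark_name: [seconds for each release, ...]},
             all series of equal length (assumed layout).
    Returns the merges as (members_a, members_b, distance), in the
    order they happen; this is the information a dendrogram draws.
    """
    norm = {name: normalise(s) for name, s in timings.items()}
    clusters = [(name,) for name in sorted(norm)]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters; single linkage takes the
        # smallest distance between any two of their members.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(norm[a], norm[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

Benchmarks whose timings rise and fall together across releases merge early, which is the sense in which they plausibly exercise similar codepaths.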
