13 Oct 2005 bwh   » (Master)

Testing, Open Source, and NFSv4

I gave a presentation today at the Pacific Northwest Software Quality Conference on my work with organising testing efforts for the Linux NFSv4 community.

PNSQC is a heavily industry-focused conference (i.e., nearly all Windows), but it was quite interesting to see that they devoted one of their five tracks to Open Source.

Anyway, one of the questions that came up during my talk was really thought-provoking. One of the earlier talks had made the point that the dark side of fixing a bug in software is that it "leaves scars". What they meant is that as a proprietary application is maintained, it accumulates #ifdefs and other hacky fixes. In that particular talk, their point was that the best way to make good, long-lasting software is to not introduce defects in the first place, by adopting development methodologies, coding practices, and testing approaches that generate better code from the start. For example, writing tests before writing the code, or shifting metrics so they don't inadvertently make high bug counts desirable (e.g. ranking testers on how many bugs they find).

Anyway, in my presentation I was contrasting the proprietary model, where users don't see the program until it is polished and "finished", with open source, where your first release is generally buggy and incomplete - but it works, and over time, given an adequate number of actively involved users, it gets better and better.

So someone asked why Open Source doesn't suffer from the scarring phenomenon that they're accustomed to with proprietary software. Why is it that instead of turning into the proverbial "big ball of mud", an open source project seems to just get _better_ with age? It doesn't scar to the degree that proprietary software does, and the bigger and older it gets, the better it seems to be. Apache, MySQL, Linux, gcc... all the granddaddies of open source are considered some of the best quality software in open source (and indeed in the whole software industry, according to some studies), yet they're ancient compared with their proprietary fellows.

This is a pretty profound observation. I'd love to see someone study this in more detail and see how true it is. A lot of the data on open source quality are snapshots in time, comparing an open source app with the corresponding proprietary one, so they don't show how quality changes as a project ages.

Regardless of *why* this happens, if the phenomenon is as true as it appears, it has some pretty intriguing implications. It means that as we go forward, the scales are only going to tilt further and further into Open Source's favor. Proprietary-only software companies will have to work harder and harder to stay ahead.

It also suggests that there is an advantage to getting into Open Source sooner. If an application gets better the longer it is available as OSS, then the sooner you open source it, the more time you'll have to accumulate the benefits. Assuming you have a good, active community around it, and that you have a solid architecture and so forth, this could ensure your application will have a long and successful life, and outcompete others in the long term.

Of course, this still leaves that question. Why doesn't OSS suffer from "scarring"? I think there are probably several factors that play into this:

First, "maintenance" in open source software isn't really treated that much differently than regular development. If closing a bug requires a major refactoring of the codebase, then rather than simply slapping on a hacky workaround and forgetting it, the maintainer, if he has the time and fortitude, *does* that major refactoring. It may destabilize the software for a while, but it closes the bug (and often cleans up the code significantly).

There is also a "redo" effect. I suspect there are some gifted programmers who do everything perfectly on the first try, but most of us get it right maybe after the third or fifth try. With a proprietary application, you may only be allocated enough time to do it once, or maybe twice. In open source, there's really no limit to the number of times that something can be redone, and in fact you see people redoing things a LOT. This is bad in the sense of leading to churn, but I imagine it's one way to slough off a LOT of scar tissue.

A third reason is the people. In proprietary software, the testers, maintainers, and developers tend to ebb and flow. The developer who created the highly successful whiz-bang algorithm is reassigned to some new business-critical job (or hired by some other company), and the maintenance of his code is handed over to someone else (possibly someone less experienced). This new person may not have the depth of visceral understanding of the code that the original developer did. In the prototypical Open Source project, however, no one gets "reassigned". You might get bored or burnt out, but it's quite common for you to keep tabs on the program for years to come, answering questions as needed or even helping out with architecture issues in your code.

Another effect that I suspect helps Open Source avoid "scarring" is what I think of as the coding parallel to Wikipedia's "copyediting". In Wikipedia, there are folks who, rather than writing a lot of good articles, just sort of wander around and obsessively clean up other people's work, fixing typos, correcting grammar, and so forth. I think this phenomenon probably also happens with large Open Source software projects. People with an obsession about security flaws habitually look through code for buffer overflows or injection points. People who are anal about coding standards go through and fix tabs and braces. These are lots of little things that individually don't seem that important, but together can have a non-trivial impact on the software quality.

At my talk, several of the audience members shared their own ideas for the reasons. One who'd had a lot of experience with gcc pointed out that some projects have extremely rigorous review processes that keep out bad code from the start. Another pointed out that since the code is public, you have a lot more motivation to make it good than you would in closed software; after all, WHO KNOWS who might look at it, so you'd better do it as well as you can.
