Software Quality: Why Windows 2000 has 63,000 "bugs" and Linux* doesn't
Posted 29 Mar 2000 at 03:39 UTC by ajv 
In Redmond, every two developers have a dedicated tester whose sole job
is to ruin the programmer's day by doing nasty things to their code. In
the OSS world, there are far fewer programmers per project, with few
developers masquerading as testers (aka code busters). OSS software
quality methodology, tools, and thus the levels of automated or manual
testing are simply non-existent, and yet the software seems to hang
together. Why is that?
I'm currently helping pnm2ppa to become a stable, fast and
internationalized print processor for HP printers. pnm2ppa has a small
itinerant band of four lead developers. At the moment, we're down to
just two. HP had a small horde of developers (at least five for the
driver portion alone that I can determine from the documentation we
scrounged from HP's search engines).
Our project suffers from the lack of an automated test suite. Before
I moved development of pnm2ppa to SourceForge, we had no bug tracking
system and no directed way to fix issues, and no CVS tree (so no ability
to regress new bugs). I tried looking at DejaGNU but it is suffering
from a lack of attention and is difficult to integrate into our
(admittedly crufty) code base.
The OSS community is starting to look at this important issue with
projects like Software Carpentry, but I will make a bet with anyone that
SQA tools will not feature prominently in the competition winners, nor
will they be developed to the rarefied heights available in the
commercial world. It's just not sexy enough.
For the reasons that the Mozilla project can track 6504 current bugs
(as of the time of writing), and Microsoft know of 23,000+ reasons why
they shouldn't have shipped on Feb 17, no distribution I'm aware of has
any idea how many bugs are in the kernel, let alone the 1000+ packages
they give you on the same install CD. See Bad Software for reasons why this
is a bad idea. When you sell something, the consumer should expect and
should demand that what they've bought actually works. Knowingly
shipping stuff with bugs should be illegal, and every customer should be
entitled to free bug fixes (no service contracts, and no identity
farming) for as long as the customer uses the product (i.e. not just
until buggy version+1 comes out). This should be law, not UCITA. I really
don't know why the average person on the street accepts current software
quality.
It is truly amazing to me that Linux and all the packages that make
your average distribution are so stable. The stability is certainly not
due to testing. I'm still stumped as to the reason beyond the most
obvious one: things that are used (dogfood) are fixed. But what about
the stuff that's not used, but you may one day need to work? The usual
cry is: fix it yourself. But in the case of many orphaned projects,
there's no one to submit a patch to, let alone get it fixed for the next
person who needs the same bug fixed.
The nub of this article is not that your average Linux distribution
has bugs (which is not news to any of us), it is that OSS projects need
heavy duty SQA tools now. They
need developers like you and me to understand the SQA process, write
tests in our projects as we go along and make it easy for ourselves and
our users to run automated regression tests on a regular basis. Then we
might have an idea about the number of bugs and their severity that are
actually in your average distribution. I'm certain that it's higher than
the 63,000 issues Microsoft found in Windows 2000.
* To me "Linux" means the kernel. To the person on the street,
"Linux" means a distribution, which is the kernel plus a couple of
thousand packages.
Lose 2000, posted 29 Mar 2000 at 06:26 UTC by ole »
(Journeyer)
Someone suggested that Microsoft counted the Lose 2000 bugs in an unsigned short variable.
I can't help but refute a few points in this article. I'm going to use
the Debian distribution as the source of most of my numbers, since I'm
familiar with it.
In the OSS world, there are far fewer programmers per project, with few
developers masquerading as testers (aka code busters).
Microsoft employs around 32 thousand people (source). If we (very
charitably, I think; it's a big company and surely has sizable
management, support, marketing, and sales teams) assume that 1 out of 4
is involved in software development, that's 8 thousand. Taking your
statement that "In Redmond, every two developers have a dedicated tester
whose sole job is to ruin the programmer's day" as a literal fact,
although you provided no supporting evidence, 1/3 of those are testers,
leaving 5333 programmers.
Now granted, there are fewer than 5333 programmers working on, say, the
Debian project. In fact, there are fewer by a factor of ten: between 300
and 500.
And granted, there are fewer than 5333 programmers working on the Linux
kernel; the credits file lists a mere 284.
And granted, the average free software project out there is much
smaller, and your example of a team of 2 to 4 is probably not
uncommon.
So Debian, with its 2700 pieces of free software, uses work from, say,
3 * 2700 = 8100 people. Add in 284 and 300 from the kernel and the folks
who work on Debian itself, and we hit a lower bound estimate of 8684
programmers who are working on projects that are part of the Debian
distribution.
I suspect the actual number is far, far higher than that, by at least
one order of magnitude, but we've already easily topped Microsoft's 5333
programmers.
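For anyone who wants to check the back-of-envelope sums above, here is a minimal sketch in Python. Every input is one of the rough guesses quoted in this comment (the 1-in-4 fraction, the one-tester-per-two-developers ratio, 3 people per package), not measured data; the variable names are purely illustrative, and the results are rounded down to whole people.

```python
# Back-of-envelope head counts, using only the guesses quoted in the comment.
# None of these inputs are measured; change them to see how the totals move.

MS_EMPLOYEES = 32_000        # "around 32 thousand people"
SOFTWARE_SHARE = 4           # assume 1 employee in 4 works on software
TEAM_SIZE = 3                # two developers plus one dedicated tester

ms_software_staff = MS_EMPLOYEES // SOFTWARE_SHARE       # 8000
ms_testers = ms_software_staff * 1 // TEAM_SIZE          # one third, rounded down
ms_programmers = ms_software_staff * 2 // TEAM_SIZE      # two thirds, rounded down

DEBIAN_PACKAGES = 2700       # pieces of free software in Debian
PEOPLE_PER_PACKAGE = 3       # assumed average team size per package
KERNEL_CREDITS = 284         # names in the kernel credits file
DEBIAN_DEVELOPERS = 300      # low end of the Debian developer estimate

free_software_programmers = (
    DEBIAN_PACKAGES * PEOPLE_PER_PACKAGE + KERNEL_CREDITS + DEBIAN_DEVELOPERS
)

print("Microsoft programmers (estimate):", ms_programmers)
print("Microsoft testers (estimate):", ms_testers)
print("Free software programmers (lower bound):", free_software_programmers)
```

The point of writing it out is only that the comparison hinges on the assumed 3 people per package; halve that and the two totals come out much closer.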
Fewer programmers per project doesn't matter, if the projects are small
and well-defined, as projects in the free software world tend to be. Two
thousand teams of 2 to 4 people working on two thousand individual
projects scales much better than 5 thousand people working on 3 or so
huge and ill-defined projects (IE, NT, Word, etc). See The Mythical
Man-Month.
with few developers masquerading as testers (aka code busters)
In Redmond, every two developers have a dedicated tester whose sole
job is to ruin the programmer's day by doing nasty things to their
code
This seems to show a profound lack of knowledge of the way the free
software community works. I am a tester for about 300 individual free
software projects. These range from the Linux kernel, to Debian, to
Mozilla, to gcc. I use these things every day, subjecting them to the
worst torture I can, and report any and all bugs I find to their
authors. I'm not alone. Nearly every one of those 8684++ programmers I
calculated above is also testing a variety of other free software
projects in their day to day lives. So are uncountable regular users.
That's a whole ton more people than the 2666 testers you postulate who
work at Microsoft. Moreover, the people who are testing free software
are testing it in production environments, where they throw an
unbelievable variety of things at it.
For the reasons that the Mozilla project can track 6504 current bugs (as
of the time of writing), and Microsoft know of 23,000+ reasons why they
shouldn't have shipped on Feb 17, no distribution I'm aware of has any
idea how many bugs are in the kernel, let alone the 1000+ packages they
give you on the same install CD.
And what reasons are those, exactly? You lost me.
Anyway, to put this in perspective, the Debian project is currently
aware of 13 thousand bugs in that distribution. But that's just the
Debian project -- they don't keep track of every kernel bug because
that's the job of the Linux kernel project, just as they don't keep
track of all bugs in GNOME, because that's the GNOME project's job. The
GNOME folks are aware of 7.5 thousand bugs in GNOME. I could probably
continue this and hunt up enough bug lists until we passed Microsoft's
6504 bug count; I'm already 1/3 of the way there.
When you sell something, the consumer should expect and should
demand that what they've bought actually works. Knowingly shipping
stuff with bugs should be illegal
It is a truism that all software has bugs. In reality, computer
scientists have techniques that can be used to prove that a program is
bug-free. Unfortunately, these techniques do not scale at all to the size
and complexity of current software. While I wouldn't mind having the
entire software industry declared illegal, it seems a little unlikely.
It is truly amazing to me that Linux and all the packages that make
your average distribution are so stable. The stability is certainly not
due to testing.
To the contrary, Eric Raymond holds that "with enough eyes, all bugs
are shallow", and the free software community would tend to
disagree with your statement and hold that testing, openness, and
adherence to good design principles are exactly why so much free software
has attained such high quality.
SQA tools will not
feature prominently in the competition winners, nor will they be
developed to the rarified heights available in the commercial world.
It's just not sexy enough.
C compilers are not sexy. C libraries are not sexy. Editors aren't
sexy. MTAs and DNS servers aren't sexy. Kernels are arguably sexy or
boring, depending on the fetish of the person you ask. And yet free
software has produced leading products in all these fields. I hate to
quote Eric again, but free software development happens to scratch an
itch, not because it's sexy.
But in the case of many orphaned projects, there's no one to submit
a patch to, let alone get it fixed for the next person who needs the
same bug fixed.
It seems that the distributions have by necessity stepped into this
gap. Patches go to them, and if a project is orphaned, they stay there. I
do wish there was more cross-distribution communication of patches to
orphaned projects.
I cannot disagree with the concluding
paragraph of your article. I write regression tests -- and so should all
free software authors. But I find you have very few actual facts to
stand on in the rest of it.
6504 -- whoops, posted 29 Mar 2000 at 06:35 UTC by joey »
(Master)
I somehow confused the article's bug counts for Mozilla and Microsoft.
Oops! My point still stands, I'm just 1/10th of the way there now. :-)
Not disagreeing, posted 29 Mar 2000 at 07:07 UTC by ajv »
(Master)
Joey,
I did mention in my article that we are all users of dogfood and this
does work towards the testing of the products we develop (ESR's shallow
bug principle), but it's not the same as doing formal testing before
shipping it out the door. The thing that still amazes me is the quality
of the stuff that gets plonked on to your average distribution. The
quality is simply not due to formal testing as it does not occur. It's
due to something else, and I think you missed my point, or maybe I
didn't express myself properly. (Probably the latter).
The development process in Redmond is well documented elsewhere
(Showstopper describes the process well, I feel) and by various MS types
that I meet during the course of the work I do for a living. At one
stage, there were competing teams to produce the next iteration of
certain products. That no longer happens, but the high developer-tester
ratio remains. I do not know exact developer numbers at Redmond, but I
believe the 5300 figure you quote is too low; Windows
2000 alone would thus have more than half of the total number of
estimated developers. But the number issue is tangential and not the
point of my article.
I was aiming to show that no one really knows the bug count for an
entire distribution because no one is testing properly. So anyone who
says that Linux is better because we don't have 63,000 bugs is
simply wrong, because no one knows what the bug count is. Linux
anecdotally is more stable, and seems to be of higher quality, but why?
It's not formal testing that got it there. If there's something that
we're doing right intrinsically, we should try to enshrine that. But the
software engineer in me says we're doing the wrong thing - according to
all my software engineering texts and best practice I've come across
during my security code reviews at various companies, how OSS software is
developed should never work, and we should have the buggiest and least
stable environment. But it's not. That's what I'm getting at.
I understand that you believe you're testing out 230+ packages every
day. To me, that's not testing. That's using the dogfood and hoping for
the best. We all do this. I run development kernels when I can; I was a
beta tester of Win2K for more than three years. Of course, users find
bugs that developer-inspired tests can't find because the developers
didn't think of doing X with package Y (which demonstrates product
depth, which is good). To me formal testing is doing "make test" and
seeing how many things work or fail, and comparing that to previous
runs. And fixing the problems that crop up. If a user or developer spots
a new bug, a new test is developed alongside the fix so that the bug is
not re-introduced later. The problem for pnm2ppa is that a full suite of
tests would include about 10 pages being printed on three varieties of
printer, two of which I do not own. Many of us are in the same boat.
Automated and easy testing for users would alleviate this problem - we
have many users willing to test our dogfood - if only they knew
everything that would help us track down those last few bugs.
When I wrote the sentence that starts "To me formal testing ...", I
should have proofread it: formal testing can fill several really boring
books, and make test is not the entire process.
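As a rough illustration of the "compare that to previous runs" step described above, here is a minimal sketch in Python. It assumes each run has already been reduced to a mapping from test name to pass/fail; the function name `compare_runs` and the test names in the example are hypothetical and not part of pnm2ppa or of any existing tool.

```python
from typing import Dict, List


def compare_runs(previous: Dict[str, bool],
                 current: Dict[str, bool]) -> Dict[str, List[str]]:
    """Classify each test by how its result changed between two runs."""
    report: Dict[str, List[str]] = {
        "regressions": [],    # passed before, fails now
        "fixed": [],          # failed before, passes now
        "still_failing": [],  # failed before, fails now
        "new_tests": [],      # not present in the previous run
        "missing": [],        # present before, not run this time
    }
    for name, passed in sorted(current.items()):
        if name not in previous:
            report["new_tests"].append(name)
        elif previous[name] and not passed:
            report["regressions"].append(name)
        elif not previous[name] and passed:
            report["fixed"].append(name)
        elif not passed:
            report["still_failing"].append(name)
    for name in sorted(previous):
        if name not in current:
            report["missing"].append(name)
    return report


if __name__ == "__main__":
    # Hypothetical results from two consecutive runs of a test suite.
    last_week = {"parse_header": True, "color_page": False, "margins": True}
    today = {"parse_header": True, "color_page": True, "margins": False,
             "blank_page": True}
    for category, names in compare_runs(last_week, today).items():
        print(f"{category}: {', '.join(names) or '-'}")
```

The "new_tests" bucket is where the test written alongside a bug fix would first show up; from then on, any later failure of it lands in "regressions".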
I think I better understand now what you're trying to say.
I still stand by my numbers that there's no way Microsoft can have
significantly more programmers than I calculated. And yes, I don't think
that having half their programmers working on Windows is unreasonable --
with the merging of IE and Windows, they have effectively 2 main
products, Windows and Office, that likely get the lion's share of the
programmers.
Anyone who says a Linux distribution doesn't have 63,000 bugs is simply
wrong, period. It doesn't matter that we don't have a centralized bug
tracking system to make it easy to point to a list of those bugs -- but
I do believe that the people responsible for the various pieces of
software in the distribution know about significant numbers of bugs in
their code. I see no reason to believe that they are less aware of the
bugs in their code than the Microsoft programmer is aware of bugs in the
equivalent portion of code they contributed to Windows. The difference
is that keeping track of bugs is done in a distributed fashion here -- so
distributed that we can't get a full overview of it.
I think that the software engineering texts you refer to are going to be
rewritten pretty soon.
As for the regression testing, I understand that there's a lot of theory
behind it, things I do not know about, and am not qualified to
discuss. However, I can't believe that others in the free software
community don't understand this stuff -- they're a smart and well
educated bunch. I see what look to be comprehensive regression tests in
e.g. Perl (about 2.5 thousand individual tests, in fact). People
do run these tests.
In regards to your specific example of being unable to test everything
because you don't own the necessary hardware, I think there are definite
parallels with the Linux kernel, which no one person is able to fully
test for similar reasons. The model that seems to work well for the
Linux kernel is simply throwing a *lot* of people at the testing, and
assuming that statistically this will result in almost every possible
thing being tested.
Is there some way a formal SQA process can do this better?
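To put a rough number on the "statistically almost everything gets tested" idea, here is a toy model in Python. It assumes, unrealistically, that every tester exercises a fixed number of configurations picked uniformly at random and independently of everyone else; real usage is heavily skewed towards the popular cases, so treat this as an optimistic sketch. The function name and parameters are illustrative only.

```python
def expected_untested(configs: int, testers: int, per_tester: int = 1) -> float:
    """Expected number of configurations that no tester ever exercises.

    Toy model: each tester independently picks `per_tester` distinct
    configurations uniformly at random out of `configs`.
    """
    if configs <= 0 or testers < 0 or not 0 <= per_tester <= configs:
        raise ValueError("need configs > 0, testers >= 0, 0 <= per_tester <= configs")
    # Chance that one particular configuration is missed by one tester,
    # raised to the number of independent testers.
    miss_probability = (1 - per_tester / configs) ** testers
    return configs * miss_probability


if __name__ == "__main__":
    # Hypothetical figures: 500 hardware/software combinations,
    # each volunteer tries 5 of them.
    for testers in (10, 100, 1000):
        left = expected_untested(configs=500, testers=testers, per_tester=5)
        print(f"{testers:5d} testers -> about {left:.1f} combinations untested")
```

Under these assumptions the untested remainder shrinks geometrically with the number of testers, which is the optimistic reading of the kernel model. Once usage is skewed, the rarely used combinations stay untested far longer, which is the gap a directed test plan is meant to cover.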
dogfood?, posted 29 Mar 2000 at 08:52 UTC by fatjim »
(Journeyer)
ajv, would you mind explaining what you mean by 'dogfood'? Probably I'm
just being dense, but I don't quite understand.
dog food, posted 29 Mar 2000 at 09:11 UTC by ajv »
(Master)
Dog food ... as in "eat your own dog food". You use what you produce.
Our work is "dog food." A kernel hacker who uses development kernels
is eating their own dog food. Anyone who has developed code and then
uses it is doing the same.
See ESR's Jargon file for a relevant reference. Having helped both MS and
Mozilla (I was porting Mozilla to Rhapsody, and then Apple stopped
helping me by killing Rhapsody/Intel, so I stopped helping Apple) in
the past, it's a term that sticks with me.
For example, there's a real old stable(ish) predecessor of pnm2ppa that
just does B&W called pbm2ppa. I've never used it - I've always used
pnm2ppa - which is my dog food as I'm responsible for bits of it (the
buggy bits!).
Part of my article is the non-rhetorical question of what part of the
OSS process produces so many high quality projects with no obvious SQA
process? We must find that and distill it into every project, both OSS
and commercial. It is something that is being done right.
However, things that can help produce even better software are within
our grasp. In the same way we've come to expect autoconfigure, tarballs
that end up with a new directory, make clean, etc, I think there should
be a standard addition to the make files everywhere: make test. As you
point out, some programs like Perl (and, that I can think of, Mozilla and
xtest, which is getting old now) have test suites. These are the
exceptions, and not the rule.
Other things that would help: more tools like DejaGnu and an
appreciation that a lot of app development these days is going to be GUI
and I18N focussed and not ASCII text based, and is not linear, but event
driven.
SourceForge did more for SQA than any single tool improvement in the
last 12 months: it provided free bug tracking and CVS to thousands of
projects. For most projects the size of pnm2ppa (less than 30 files),
this is pretty much all they need. Some additional SQA help would be
welcome - they are populating their documentation area, so I might use
the feedback I get from this to feed their site with useful stuff. I'm
not just having a moan here. I hope. :-)
A web site somewhere for SQA that discusses code quality (not just how
the code looks, like the GNU coding standards) would help too. This site
may already exist, and I'm just whining. So I'm going to stop. I was sort
of hoping for a discussion, and I'd like to see what happens.
Open Source Q&A, posted 29 Mar 2000 at 15:02 UTC by jennv »
(Journeyer)
This is just a short response:
Many - perhaps most - Open Source programmers also program
professionally.
Many - or most - of them use formal methods at work, or have used formal
methods.
Many live and breathe these skills, know them subconsciously.
I suspect that these programmers, at least, write code which complies
with formal methodology, even when they're not actually using formal
methods.
Oh, it might not be completely documented - especially those bits of
documentation which are just recording that the programmer followed
procedure XYZZY...
But all the important parts of the formal stuff probably got done.
Jenn V.
If there's something that we're doing right intrinsically, we should
try to enshrine that. But the software engineer in me says we're doing
the wrong thing - according to all my software engineering texts and best
practice I've come across during my security code reviews at various
companies, how OSS software is developed should never work, and we should
have the buggiest and least stable environment. But it's not. That's
what I'm getting at.
We are doing a lot of things right intrinsically.
Below are some of the (to me) most obvious factors, in no prioritized
order. They probably aren't the complete picture, but they at least
give a sketch.
'We' below refers to the Open Source developers in the established Open
Source community.
- The developers are usually not doing development to a deadline or
to solve a business problem.
Most Open Source development is done out of love. This results in
programmer satisfaction being the most important aspect of the
development, and most developers like writing good software, and do not
like having bugs in their software. This tends to influence development
priorities.
- The most recent code is actually in production and
pseudo-production use.
Open Source (when developed by the bazaar model) is associated with
frequent releases, both of stable and development branches. This results
in rapid feedback about changes, and in pressure to keep the code
functional - a project that is on its way into a swamp full of bugs will
get branched from the last stable version.
- We make small stuff.
Just about all open source projects effectively build small stuff.
We might build large stuff by building a lot of small stuff after each
other, but each piece is small stuff, because we do not have the
infrastructure to try to build large stuff up front.
- We automatically try for the simplest solution to a problem.
An Open Source codebase tends to move towards maximum usability. If a
simple change can make the system more usable (for a very generic
definition of user), that change tends to be done, unless there are
compelling reasons not to. This is in clear opposition to commercially
sold software, where the driving force is to make the code more
sellable, not more usable.
In the move towards maximum usability, we have a lot of programmers
looking at most of our software. If one of them sees a simple way to
improve the software for her own use, she will do the change. This
automatically selects simple solutions, because the simple solutions will
be available before the complex solution.
- We work with patches.
In most other environments, a programmer will just work directly against
the source code, and somehow commit the code for the
production/development version once he's done his changes. Any change
handling is usually done by a version control system.
Open Source developers, on the other hand, tend to get submissions as
patches, and thus tend to have a conscious view of the changes
themselves (on a line by line basis) rather than just the end product.
This creates a very effective review and feedback loop, even for the
programmer that is working on a codebase alone.
- We know how to read source code.
Most Open Source projects do not have a bunch of UML diagrams, or an
available architect to describe in detail how things work and where to
find them each time a neophyte wonders about something. This forces
the neophyte to learn to read code well enough to find out many of these
things for himself; this again gives feedback towards being good at
understanding code, with the side effect of becoming better at spotting
when code doesn't work and at reading patches.
- We stick with a common set of basic tools.
diff. patch. Your basic text editor. Your basic shell. The C
library. The Unix system interface. Root as the basic owner of all
rights. make. Devices in /dev. X windows. The basic Unix commands.
The languages we use.
All of these have a history going back over 10 years; many of them have
stayed more or less constant for 15 to 30 years. With this, we have a
common, simple backdrop to do our development in. We do not have to
re-learn our tools every two to three years. This leads to a second
backdrop: A lot of people with vast experience in the tools and in
development, controlling part of the culture of open source.
- We practice program/system evolution.
A really bad open source system dies from lack of users. A really bad
commercial system has more programmers and more money pushed into it if
that can result in more income later.
- We are 'Best Of Breed' programmers
This might seem a bit snotty, but Open Source attracts developers that
see development as fun, not just as a way of making a buck. This is an
effective filter for getting people that are really interested in
development, and thus tend to be good at it.
The rather harsh environment of the Internet also tends to force some of
the qualities good developers should have - e.g., participation in
discussion forums tends to force people to distinguish between what they
know and what their opinions are.
- Through the peer review, we end up with rapid feedback on how we
write bad code.
All developers have some bad habits. Open Source programmers usually
start with submitting patches to existing projects; this forces them
through a layer of peer review, which picks away some of the bad habits.
Out comes a better programmer, about to start writing more Open Source
software.
- We have access to a set of good programmers to discuss ideas with
that outstrips any commercial entity in the world.
As argued above, Open Source programmers tend to be above average. For
just about any significant project, a lot of them are available to get
feedback on design issues, and to production test any runnable code.
There is far more qualified programmer time available than for most
commercial projects.
As an example: Some time ago, I started a low-key project to create a
new version control system (to replace CVS and other horrors.) As I
didn't (and don't) want to hype it until runnable code was (is)
available, I only contacted a few friends of mine to tell them about this,
inviting them to join the design/review team. Within 24 hours, I had
over 100 years of programming experience on the team, collected within
approx 7 people. Getting allocated time from a comparably sized group
with over 100 years of accumulated experience would be close to
impossible in a commercial setting.
- We have a culture of mass communication.
The Open Source community routinely uses mass communication tools.
Usenet, mailing lists, Slashdot, IRC, FTP, CVS, diffs. All of these are
designed or have been tweaked to function as communication tools between
large groups of people. This allows much easier use of many pairs of
eyeballs than the corresponding tools that commercial developers tend to
use.
- For codebases that have seen widespread use, nobody can force bad
changes on the codebase.
This isn't quite true, but for bad changes to be accepted (even from a
maintainer), they have to be heavily outweighed by good changes. If
not, the codebase will be branched, either locally or globally.
- Authors tend to stick around.
In a commercial setting, a developer is no longer available when she
changes jobs. In an Open Source setting, the original developer of a
piece of code is usually available for quite a few years after the
initial development, and if she disappears, somebody else will usually
learn the code and start acting as a new "Source Of All Knowledge" (for
the tricky parts) - unless there is a group of people already doing that
through one of our mass communication channels.
- Development tends to go slower.
Open Source developers usually push fewer hours per day into the
codebases than commercial developers. This results in more time to think
about each change, and more time to get feedback. If a better way to do
a change is proposed, the developer will often take the time to do it
"The Right Way", even if the original way of doing it is adequate.
- Open Source is better because Open Source is better.
This is self-referential, but I believe it is true: The quality of open
source tends to attract developers that like quality, thus getting a
feedback loop where the developers that use and develop Open Source like
to make more quality software.
Feedback is very welcome; I'd like to develop this into something that
could be posted as a front page article.
Eivind.
I'm not convinced we need formal SQA. It seems to me that the Monte
Carlo method we use now catches at least as many bugs as the formal
processes used in Redmond, costs a lot less, and generates an end
product of comparable quality.
Part of this may be the fact that free software isn't tied to a release
schedule set by marketing. When we released GIMP 1.0, we decided that
the release wouldn't go out the door with known bugs. We made a point
to close every known bug before release, and we did. There were, of
course, bugs, but they were pretty obscure -- stuff that might very well
not have been caught even with a formal SQA system. And there were in
fact developers who were doing exhaustive testing of various
components. Commercial products often roll with known bugs because the
marketing department has decreed that the product will be released on
<date>, bugs or no bugs.
What I guess I question is the unspoken assumption in this article that
formal SQA methods will ensure a higher quality product than the current
"many eyes" method in common use.
Some brief comments on Eivind's post.
Two primary causes of bugs creeping into code, and a reason for not getting them out again:
First, features added for business or marketing reasons, not technical reasons.
When I was first a technical product manager,
the president of the 40-person company where I worked came to me days before a release and requested a feature.
I replied that it would be very good for version two, and would he please make a note of it. After he left, several people
mentioned to me that I was the first person ever to refuse to add a feature for him. The software shipped on time, and
I won't say it was bug-free, but at least we didn't have last-minute changes in functionality.
The pressure to add features can be very strong, and it's harder to resist than it sounds.
It's the business environment of the software.
The second cause is the technical environment, where there can be uncontrolled or (worse) undocumented
interactions between components. This affects Microsoft Windows a lot, and Unix (including Linux) relatively little.
In Windows, programs, libraries and even data files interact through the registry as well as through memory access.
Unix programs tend to be independent, so that if one crashes, you can restart it and watch it crash again as many times
as you like without affecting other programs particularly. That's not always true, but it tends to be.
My notebook crashed yesterday when I tried to uninstall QuickTime so that I could look at PNG files in
Netscape again; what are the chances that looking at a png file in IE5 will now bring up netscape? A bug in IE can affect
Windows Explorer; an IE plugin can affect your ability to list files - this is lunacy.
Third... In a business environment hostile to software robustness, you need a supportive technical environment.
In a hostile technical environment, you need experienced bush-fighters. That brings me to the third difficulty:
the tendency to hire junior programmers for testing. I can't actually say whether Microsoft does that, although I'd
bet my socks on it that they do, because it's standard industry practice. And it would work fine if the business and technical
environments were supportive: in that case, the rôle of the testing environment would be less critical.
Of course, any testing at all is a Good Thing, and all too often ignored. But if a new feature is added two days before
shipping, you're clearly not going to have a regression test, and you're not going to have much of a test plan for it.
This problem affects everyone writing software, but especially in a commercial product environment.
Henry Spencer had a .signature years ago saying that Sun used a 32-bit architecture so they could have 32-bit
bug numbers. 63,000 issues is a pretty small number. But the number of bugs is irrelevant, frankly. What matters is
how often the software falls over, in its deployment environment, or where it's used (I hate jargon too, honest!).
Sorry, I said I'd be brief, and like George Bernard Shaw I don't have time to write something shorter.
I agree with a central point of the original article: free software
isn't tested as thoroughly as commercial software. I think the reasons
are fairly simple - testing isn't much fun, it's a lot of work, and no one
knows how to do it well. Happily, open source software has its own ways
of dealing with bugs, and the code we produce is good. I think the main
reason free software is good software is that it's a "labour of love" -
we write free software because we want to, and therefore we're inclined
to make it beautiful, good, strong.
That being said, there are ways to preserve the fun of free
software
while also doing some serious testing. The
Extreme
Programming community has developed an interesting engineering
methodology based around incremental coding and creating large test
suites. Some of their ideas seem to require an awful lot of discipline,
but there's a lot of fun in it, too.
Numbers, posted 29 Mar 2000 at 19:46 UTC by neo »
(Master)
The GNOME folks are aware of 7.5 thousand bugs in GNOME.
I might sound a little picky here, but I want to clarify this. About
8000 bugs have been reported to the GNOME bug-tracking system so far.
That doesn't mean that there are 8000 open bugs. I'm not willing to
download the whole list and count
them, but from the experience I have with The GIMP (which uses the
gnome bug-tracker), I know that a lot of bugs are closed shortly after
being reported. This is either because they are bogus or simple
misunderstandings or because developers care about them and commit the
necessary fixes.
Extrapolating from the ratio of fixed/open bugs in The GIMP to the whole
GNOME project, the number of open bugs has to be corrected to somewhere
below 2000. Taking into account that the bug-tracker is used by a large
number of applications, 63,000 bugs in Win2K looks like an awful lot of
bugs to me.
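To make the extrapolation above concrete, here is a small sketch in Python. The function names are made up for illustration, and the GIMP ratio itself is not stated in the post; the only figures used are the "about 8000 reported" and "below 2000 open" from the text, which together imply an open fraction of at most a quarter.

```python
def estimate_open_bugs(total_reported, open_fraction):
    """Scale a tracker-wide report count by one component's open fraction."""
    if not 0.0 <= open_fraction <= 1.0:
        raise ValueError("open_fraction must be between 0 and 1")
    return round(total_reported * open_fraction)


def implied_open_fraction(total_reported, open_estimate):
    """Work backwards: which open fraction does an estimate imply?"""
    return open_estimate / total_reported


# About 8000 reports and an estimate below 2000 open bugs means the
# assumed open fraction is at most 2000 / 8000.
upper_bound = implied_open_fraction(8000, 2000)
```

The weak point is the assumption that one component's fixed/open ratio carries over to every application sharing the tracker.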
Re: Numbers, posted 30 Mar 2000 at 02:29 UTC by joey »
(Master)
Re: Numbers, posted 30 Mar 2000 at 02:33 UTC by joey »
(Master)
(Oy, why does Advogato allow postings with no body?)
You're right, the correct number is 4400. Seems GNOME has closed an
impressive number of bugs in the past month, which I accidentally
counted.
In a previous comment (I wish
that Advogato would allow me to
specify HREF="#9"), ajv wrote:
I think there should be a
standard addition to the make files everywhere: make test.
and mentions the GNU Coding Standards a few lines below.
Well, as a matter of fact the GNU Coding Standards
do mention the standard addition to the makefiles, except that it is
called make check and not make test (one reason for
using "check" might be that "test" is a valid target in some
makefiles). Here is an
excerpt from the section Standard Targets for
Users in the chapter titled "Makefile Conventions":
`check'
Perform self-tests (if any). The user must build the
program before running the tests, but need not install the program; you
should write the self-tests so that
they work when the program is built but not installed.
And a few lines below:
`installcheck'
Perform installation tests (if any). The user must
build and install the program before running the tests. You should not
assume that `$(bindir)' is in the
search path.
So the GNU Coding Standards have been recommending automated tests
for a long time. The standard utilities automake and
autoconf support the insertion of these targets in the
makefiles. And a large number of GNU tools do something when you run
make check in the source tree. Maybe some of these
tests are
limited in scope but many tools are able to perform some self-tests on
their basic functionality.
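As a rough illustration of what a check target might run, here is a minimal sketch of a self-test helper in Python. It is not taken from any GNU package: run_self_test and exit_status are hypothetical names, and the only idea borrowed from the excerpt is that the program is looked up in the build directory rather than on the search path, so the tests work before installation.

```python
import subprocess
import sys
from pathlib import Path


def run_self_test(program, args, expected_stdout, build_dir="."):
    """Run a built-but-not-installed program and compare its output.

    The program is resolved relative to build_dir, never through the
    search path, so the test works before 'make install' has been run.
    """
    exe = Path(build_dir) / program
    if not exe.exists():
        return False, f"{exe} not found: build the program first"
    result = subprocess.run(
        [str(exe), *args], capture_output=True, text=True, timeout=30
    )
    if result.returncode != 0:
        return False, f"exit status {result.returncode}"
    if result.stdout != expected_stdout:
        return False, f"unexpected output: {result.stdout!r}"
    return True, "ok"


def exit_status(results):
    """Print failures to stderr; return 0 only if every test passed."""
    failures = [message for ok, message in results if not ok]
    for message in failures:
        print(f"FAIL: {message}", file=sys.stderr)
    return 1 if failures else 0
```

A check rule in the makefile would invoke a script built on these helpers and rely on its exit status, since make treats a non-zero status as a failed target.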
Summary, posted 31 Mar 2000 at 07:57 UTC by ajv »
(Master)
Okay, I learnt a lot, and it helped shape my view that we still need
better testing tools. Some of the replies were really top notch - I
appreciate joey's, eivind's, kelly's, Raphael's and Ankh's replies.
Kelly and I will have to agree to disagree. I think anyone can learn the
SQA basics, and then apply them throughout the development process.
Here are the reasons that OSS code is better quality than commercial,
closed code:
- OSS development is rarely led by marketing, focus groups, etc
- OSS development is rarely guided by fake or arbitrary deadlines
- OSS development is done by people who care about their code because
they do it for the joy of doing it, not because they are paid (but some
are)
- OSS code is open to peer review, and crap code is quickly beaten out
of the project, particularly in the big ticket projects.
I should have read the coding standard that I cited more completely
instead of getting stuck at the "thou shalt" indentation bits. make
check is already a standard. I'll add make check to pnm2ppa
and I'll try and convince the reiserfs guys to do the same - I don't
think they'll object somehow.
I got an e-mail from a MS Win2k program manager, and it turns out that I
had *under-reported* the developer:tester ratio on Win2K. It's actually
1:1, and according to the program manager, this is the first project
with that ratio. Hopefully, he can post here to clarify; I'm sure we can
learn something from that.
Finally, I think that people might have taken my headline
grabbing
subject line and gone too far down that track. The problem is not the
number of bugs, but the innate development process that causes the
severity of the bugs in OSS projects to be so much lower than in
commercial code bases. Hopefully, people can take this on board, and use
it in their projects.
Thanks to everyone, now back to your regularly scheduled programming
session.
I'm going to disagree with this assertion. I've seen a lot of bad code
in quite high-profile projects.
However, bad habits tend to be knocked out. Even if we don't get
all the code (even for the major projects) seriously reviewed, open
source programmers that work as part of a team get large parts of their
code reviewed, and thus get negative feedback on bad habits.
Eivind.
Please note that I did not say that I don't think we need SQA tools. I
said that I am not convinced that we need them. You simply
haven't sold me on the proposition that formal tools would do a better
job than our current system. The extreme lack of quality in Microsoft's
products, if anything, argues that formal SQA methods are not likely to
be useful in free software products. If the best Microsoft can do with
formal SQA and all of its staff and financial resources is Windows, then
those methods would be utterly hopeless in free software development,
which has a fraction of the staff and even less money.
I remain open, but skeptical, to the possibility that you (or someone
else) will prove me wrong and show that formal SQA methods will result
in a better product than the current peer-review process, ceteris
paribus. But, to be honest, I don't think you can impose formal SQA
methods on free software development without either providing
substantially more developers (who are going to come from where?) or
substantially limiting productivity. We already have enough of a
problem with unfinished projects.
When I started writing dribble, I saw a need for some sort of tool to
find the most weirdo bugs. Since my aim is fairly narrow (a library with
a limited number of calls, not that much semantics, etc...), I went for
a stochastic method. I even found a showstopper or three that way...
For now, I try to exercise as much of the API as I can, in a more or less
random fashion (I can select the seed, so I can easily re-run a test
"after the fact") and I log every operation done, so I can check the end
state with what it *should* be, given the sequence of operations.
Now, most people are more ambitious than I am, so I guess a "simple"
test tool won't be of much use and my experience is that writing good
test tools is harder than writing the system they test.
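For anyone who wants to try the same approach, here is a minimal sketch in Python of seeded random testing with an operation log. It is not the dribble test code: BoundedStack is a hypothetical stand-in for the library under test, and random_run and expected_state are made-up names. The sketch only shows the three ingredients described above: a selectable seed, a log of every operation, and a check of the end state against what the log says it should be.

```python
import random


class BoundedStack:
    """Hypothetical library under test (stands in for the real API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, value):
        if len(self.items) >= self.capacity:
            return False
        self.items.append(value)
        return True

    def pop(self):
        return self.items.pop() if self.items else None

    def clear(self):
        self.items = []


def random_run(seed, steps, capacity=4):
    """Exercise the API in random order, logging every operation."""
    rng = random.Random(seed)  # private generator: same seed, same run
    stack = BoundedStack(capacity)
    log = []
    for _ in range(steps):
        op = rng.choice(["push", "pop", "clear"])
        if op == "push":
            value = rng.randrange(100)
            stack.push(value)
            log.append((op, value))
        else:
            getattr(stack, op)()
            log.append((op, None))
    return stack.items, log


def expected_state(log, capacity=4):
    """Replay the log against a plain list to get the state that should result."""
    model = []
    for op, value in log:
        if op == "push" and len(model) < capacity:
            model.append(value)
        elif op == "pop" and model:
            model.pop()
        elif op == "clear":
            model = []
    return model
```

Comparing the first value returned by random_run(seed, steps) with expected_state(log) flags a mismatch, and re-running with the same seed reproduces the exact sequence that triggered it. The catch is the one already mentioned: the model in expected_state has to be right too, which is easy for a stack and much harder for a real system.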
ajv says:
When you sell something, the
consumer should expect and should demand that what they've
bought actually works. Knowingly
shipping stuff with bugs should be illegal, and every customer
should be entitled to free bug fixes (no
service contracts, and no identity farming) for as long as the
customer uses the product (ie not when
buggy version+1 comes out). This should be law, not
UCITA.
While I agree with the sentiment, the implementation would be
problematic. Rather than making it against the law to ship software
with known bugs, it would be better to make it a law that all known bugs
must be clearly identified and explained in a document shipped with the
product and listed on a website specifically for that product and its
bugs. And all bugs that are found after ship date must also be listed
on that web site as they are found.
I agree whole-heartedly that customers should be entitled to free bug
fixes. Especially for any product they've spent their hard-earned cash
on.