The current production kernel is 2.2.18, and for a long time the
development kernels were the 2.3 series. The 2.4.0-test1 kernel came
out earlier in the year; it not only brings many new features to the
Linux system, it is also a major rearchitecting of the kernel.
I've been working with the 2.4.0-test and 2.4.0-prerelease kernels for
most of this year and found that in general they work pretty well. I
think I can safely say that they will be fine when used by someone who
is a programmer or competent to administer their own Linux system. This
is not to say that the kernel is trouble-free yet, but it is good enough
to be worth using by anyone likely to be reading Advogato.
The problem, and the reason I'm posting this, is that once 2.4.0 is
released it is likely to be rushed into production use on a lot of
end-user systems, many with configurations that have not been adequately
tested. This will happen in part because users will download the sources
and build the kernel themselves to get features or fixes they need, and
in part because many of the distributions will rush to include it so
they can be perceived as "competitive", either with each other or with
non-free operating systems. I'm hoping that more widespread testing now
will head off the problems this could cause.
But the people doing the most active work with the 2.4.0 kernel are the
kernel developers themselves, or those few like me who are just working
to test it. I don't think there's a tremendous number of people taking
the trouble to test it, and even those who spend the most time at it
(the kernel developers) often have limited resources for trying
different configurations. (I have heard of distributions that
prematurely shipped systems with prerelease kernels - something I
consider irresponsible.)
A lot of people will have their very first experience with Linux by
purchasing a
$29 CD distribution "just to check it out". For many of them, the
brand-new
2.4.0 kernel will be what they get, and it's very important that they
have a positive
experience with it. Every bug found by an Advogato reader is a bug
that's not found by
a couple of thousand novice Linux users who might not come back for
more.
It's very important to have the kernel tested on a wide variety of
configurations and under the load of a lot of different applications.
For comparison with what's done in the commercial world: I used to work
at Apple, at one time as a QA engineer and at another time as an OS
engineer doing system debugging. I tested MacTCP, Apple's older TCP/IP
stack, and for that I had about three dozen machines in a lab and worked
there full-time for about a year doing nothing but testing, writing test
plans and writing test tools - all that to QA what was then (1990)
considered an unimportant component of the system by most of the
company.
I don't know how many QA staff Apple had when I was an OS engineer in
the mid-90's, but I would guess it numbered 500 or more, all working
full-time, year round, to test a system that had far fewer hardware
configurations in question than the Linux kernel is expected to
support. Note that Apple maintains extremely tight control over its
hardware, where Linux is expected to run on just about anything from
Internet appliances to ancient 386 boxes to mainframes.
The kernel has special needs for testing that require it to be done by a
wide variety of people for several reasons:
- It is distributed as highly configurable source code, so it needs to
be tried out with lots of different options to try to find combinations
that stimulate bugs
- It supports a number of different instruction set architectures, and
different CPU grades for a given architecture, and even mainframe
processors (the S/390) - no one owns all those different machines
- It supports a very large number of hardware devices, which need to
be actually installed to do anything interesting. There may be
conflicts between different devices that can only be found out by
widespread testing
- Being an interface between user applications and the hardware, the
kernel needs to be tested by running lots of different user-mode
programs on it, so that a lot of combinations of system calls and other
loads on the system get tested - that's why you should test your
application on the new kernel
- Failure of the kernel in a production system usually has a worse
impact than failure of a user mode program
If the kernel is flaky, the obvious risk is that your machine crashes,
the filesystem gets corrupted, and users lose data and the use of their
machine, either until they reboot or until the problem is resolved.
What could be worse is a buggy kernel that doesn't crash but causes an
otherwise reliable program to function incorrectly - this kind of bug
is insidious and can be maddening to track down.
There are a few things you'll need to know to get working with your new
kernel.
Usually you want to report bugs to the linux-kernel mailing list at
linux-kernel@vger.kernel.org.
Note the new mailserver - vger.rutgers.edu apparently had a meltdown.
You probably don't want to actually subscribe to the linux-kernel list
because of the volume of mail. I suggest reading the list off of an
archive, of which there are many. I like
this archive. You
can find other archives at Google.
It is of course good form to read the
linux-kernel mailing list FAQ.
Once you're connected to an archive server the files to look for will be
in pub/linux/kernel/v2.4
You'll only need to download the whole kernel source once; after that
you can download and apply the much smaller patches when they come out.
You don't have to try to keep up with all the patches - contribute at a
pace that's appropriate for you.
If you download the .bz2 files (which are smaller), use bunzip2 to
unpack them, then tar -xvpf to extract them. If you download the
.tar.gz files, use tar -xvzpf to uncompress and extract them in one
step. In both cases the f option has to come last in the group, because
tar takes the word that follows it as the archive name. (I download the
bz2 files to save time, then recompress them with gzip to save space on
my machine and use -xvzpf whenever I need to extract them.)
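Here's a minimal sketch of the two unpacking routes, run against a tiny
stand-in archive built in a scratch directory. The names demo.tar.gz,
from_gz and from_bz2 are placeholders I made up for the example, not
real kernel files, and the bzip2 half is skipped if bzip2 isn't
installed.

```shell
#!/bin/sh
# Build a small stand-in for a kernel tarball so the commands can be
# tried without downloading anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p src/linux
echo "# placeholder" > src/linux/Makefile
tar -czf demo.tar.gz -C src linux

# .tar.gz route: uncompress and extract in one step.
# x = extract, v = list files, z = gzip, p = keep permissions,
# f = archive name follows (so f goes last in the group).
mkdir from_gz
tar -xvzpf demo.tar.gz -C from_gz

# .tar.bz2 route: uncompress first, then extract the plain .tar file.
if command -v bzip2 >/dev/null 2>&1 && command -v bunzip2 >/dev/null 2>&1; then
    gunzip -c demo.tar.gz | bzip2 > demo.tar.bz2
    bunzip2 demo.tar.bz2          # replaces demo.tar.bz2 with demo.tar
    mkdir from_bz2
    tar -xvpf demo.tar -C from_bz2
fi
```

With a real kernel archive you would leave out the -C option and run the
extraction in the directory where you want the "linux" tree to appear.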
When you untar the sources, a directory called "linux" will be created.
I won't go into how to configure and build the kernel; the Kernel
Newbies website has the best information on that.
The one big gotcha I found was that to change from running a 2.2
kernel to a 2.4 kernel I needed a new set of modutils, the programs that
manage the kernel modules. Without them you'll get a lot of undefined
symbols in your modules and your modules won't load right (the new
modutils seem to work OK with old kernels). You'll find the new modutils
on your local mirror server in
pub/linux/utils/kernel/modutils/v2.4
If you want to contribute the most, try to download and apply the
patches as they come out. If you have a specific problem and someone
posts a patch for it on the mailing list, you can grab the patch out of
the email and apply it. Otherwise you can get the compilations that are
distributed by Linus or Alan Cox, which contain all of the submitted
patches that they've approved. Note that a submitted fix doesn't always
make it into the next compilation; when that happens you need to
politely remind the kernel developers that the problem remains, and
maybe resubmit the patch.
If you've got a patch named patch-2.4-prerelease-ac5 then you
apply it to the 2.4.0-prerelease kernel sources by cd'ing into the linux
directory (kernel source top level) and executing:
patch -p1 < patch-2.4-prerelease-ac5
Note that by default patch reads the patch from standard input rather
than from a file named on the command line - don't forget to redirect
with < (GNU patch will also accept -i followed by the patch file name).
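A cautious way to do this is to rehearse the patch before letting it
touch anything. The sketch below assumes GNU patch, whose --dry-run
option reports what would happen without modifying files. The tiny
linux.orig and linux.new trees and demo.patch are made up so the example
can run anywhere; they stand in for the kernel sources and a real patch
file.

```shell
#!/bin/sh
# Scratch "source tree" plus a one-line patch, standing in for the
# kernel sources and a real patch file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p linux.orig linux.new
echo "VERSION = old" > linux.orig/Makefile
echo "VERSION = new" > linux.new/Makefile
# diff exits with status 1 when the trees differ; that is expected here.
diff -urN linux.orig linux.new > demo.patch
cp -r linux.orig linux

# Apply from the top level of the tree. -p1 strips the leading
# directory name (linux.orig/ or linux.new/) from the paths in the patch.
cd linux || exit 1
if patch -p1 --dry-run < ../demo.patch; then
    patch -p1 < ../demo.patch     # rehearsal was clean, apply for real
else
    echo "patch does not apply cleanly; tree left untouched" >&2
fi
cat Makefile
```

If the rehearsal reports rejected hunks, the usual cause is that the
patch was made against a different base version than the tree you have.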
Linus' patch compilations will be in
pub/linux/kernel/testing. Alan Cox's will be in
pub/linux/kernel/people/alan/2.4.0test/. Generally
Linus' stuff is more official and stable while Alan's patches are often
the first try at something or an experiment with a fix.
A few more helpful tidbits:
After you configure your kernel a file named .config will be created in
the linux directory. This holds all the configuration options you just
selected. It is helpful to make a directory somewhere and save copies
of your .configs with names that reflect the kernel version and the most
significant options you've set in that build. You can then use the
saved files to recover earlier kernels for testing at a later date, and
if the kernel developers need it you can send the .config file for a
kernel that had some bug you're reporting.
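One way to keep those copies organized is a small helper like the one
sketched here. The function name save_config, the kernel-configs
directory and the tag format are all my own invention for the example;
use whatever naming scheme you'll still understand in six months.

```shell
#!/bin/sh
# save_config TREE ARCHIVE TAG
# Copies TREE/.config to ARCHIVE/config-TAG, creating ARCHIVE if needed.
# (Hypothetical helper; the naming scheme is only a suggestion.)
save_config() {
    tree=$1
    archive=$2
    tag=$3
    mkdir -p "$archive" || return 1
    cp "$tree/.config" "$archive/config-$tag"
}

# Demonstration against a scratch directory instead of a real kernel tree.
demo=$(mktemp -d)
mkdir -p "$demo/linux"
echo "CONFIG_SCSI=y" > "$demo/linux/.config"
save_config "$demo/linux" "$demo/kernel-configs" "2.4.0-test12-scsi-cardbus"
ls "$demo/kernel-configs"
```

To rebuild an earlier kernel, copy the saved file back to .config in the
linux directory and run "make oldconfig".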
If you patch your kernel sources, the fastest way to configure them
anew is to start from an old .config file and give the command "make
oldconfig". You'll be prompted only for new items that weren't
mentioned in the old file. It's probably best to run through the whole
configuration manually when you first create a 2.4 kernel though, as
there is a lot of new stuff.
If you have the X Window System working on your machine, the most
pleasant way to configure your kernel is "make xconfig" (saying this
probably marks me
as not being a true hacker...). This is also the quickest if
you want to just change a few options here and there (it's a GUI
configuration tool) or for browsing the config options help. Other
possibilities are "make config", "make menuconfig" (for curses-based
editing in a terminal), or just manually editing the .config file (not
generally recommended because of dependencies between the options).
Finally, if you're working on an Intel-architecture machine and are
trying out new kernels frequently, it is very convenient to install GNU
GRUB. It is a much more full-featured bootloader than LILO. Chief among
its advantages is that it understands various filesystem formats
natively. LILO needs to be reinstalled every time a new kernel binary is
put in place; once GRUB is installed you only need to edit its menu.lst
file if you want to add a totally new kernel name to the boot menu - and
if you forget, you can boot the kernel by name from the GRUB command
line.
Because GRUB boots the kernel by name rather than by physical disk
sector, replacing an old kernel with a new one at the same pathname
doesn't require you to do anything at all to GRUB. LILO uses a sector
list, and a new kernel with the same path may be in a different physical
location on the disk, which is why you have to reinstall LILO whenever
you put a new kernel in place.
Note that GRUB is not yet at 1.0; it works great for
me but I suggest starting by making a GRUB floppy for testing before you
install it in your boot sector.
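For a rough idea of what a menu.lst with a known-good kernel and a test
kernel side by side might look like, here is a sketch that writes a
sample to a scratch file (not to your real menu.lst). Every device,
partition and path in it is an assumption for illustration; (hd0,0),
/dev/hda1 and the vmlinuz file names must be changed to match your own
disk layout, and you should check the entry syntax against the
documentation for your GRUB version.

```shell
#!/bin/sh
# Write a sample boot menu to a scratch file for inspection.
# All devices and paths below are assumed values, not recommendations.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Known-good production kernel, listed first
title  Linux 2.2.18
root   (hd0,0)
kernel /boot/vmlinuz-2.2.18 root=/dev/hda1 ro

# Kernel under test
title  Linux 2.4.0-test
root   (hd0,0)
kernel /boot/vmlinuz-2.4.0-test root=/dev/hda1 ro
EOF
cat "$sample"
```

Keeping the old kernel in the menu means a test kernel that won't boot
costs you one reboot rather than a rescue floppy.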
I've got a few things to add, mostly in response to email that has been
sent to me by folks who've read the article above. Eventually I'll
rewrite and organize everything a little better and post it in the
articles section (nothing there so far) of the Linux Quality Database
(so far just a proposal - wanna help?).
The feedback indicates this is getting read by a lot of people who
aren't
programmers but do want to help test the kernel, so in some of what
follows
I give background on some things that should be pretty familiar to most
Advogato
members.
How Good is the 2.4.0 Kernel Right Now? Should I Feel Safe to Test It?
One fellow wrote in to recommend that I should say that the new kernel
works "very well" or at least "well". He felt that my statement that it
worked "pretty well" would discourage a lot of people who might
otherwise usefully test it.
It is my own experience that I have very little problem with the new
kernel, and very likely you won't either. But I hesitate to make any
strong claim about how well it's working. If it works at all for you,
very likely it will work flawlessly and you'll have the added benefit of
whizzy new features and performance enhancements.
But observing the traffic on the linux-kernel mailing list, some people
have significant trouble. If you test it, the likely benefit is that
you'll have a nice new toy to play with, but you must accept some risk:
at the very least that your machine won't boot, and at worst that it
will scrag your filesystem or lose data you've created in a program. So
it really should only be tested by people who are prepared to accept the
possibility of having to fix their machine or recover their data.
Let me contrast this, however, with the condition of Windows 2000 when
it was beta tested. I needed to write some Java meant to run on NT for
a consulting job and my client thought it would be fun if I used the
Windows 2000 Beta. I would suggest "living hell" is a better way to
characterize my experience. I had no end of trouble, and it wreaked
lots of havoc
with my work - for example, I could not use ethernet and DNS via PPP at
the same
time (even though I ran Windows 2000 server) and had to disable ethernet
and reboot before checking my email.
I understand that The Win2K Problem shipped with 64,000 documented bugs
of
which 25,000 were considered "serious" by Microsoft, and the opinion was
widely held among the industry press and IT managers that one should not
install it until a few service packs had been released - but Microsoft
shipped it anyway. (To be fair, all those bugs were counted among the
entire system and not just the kernel).
I've been running the 2.4.0-test kernels on the machines I use for my
daily work since test1 was released. I've had no problems that
prevented me from doing useful work. The one serious bug I found was
that my Adaptec APA1480 Cardbus SCSI host bus adapter wouldn't function,
and
that was resolved very early on by working with the mailing list - so
now I can burn CD's with a SCSI CD burner off
my laptop. The only
problem I've got now is that my machine doesn't power itself off when I
shut down.
So you be the judge.
Besides Building the Kernel, What Steps Do the Users of a Given Distribution Need to Take to Run the New Kernel?
As far as I know, the only thing that is absolutely required is to
install the new modutils package as mentioned above. The modutils are
user programs that manage kernel modules, generally device drivers that
may be loaded into or removed from the kernel at runtime. The module
format has changed in 2.4, so that's why you need the new version.
All of your existing user-mode programs, applications and libraries
should continue to work without the need to update their source or even
recompile. Binary compatibility with user programs that ran on old
kernel versions is a basic requirement for the system.
I have seen reports that some existing app would crash when run under
the new kernel. This isn't an error on the user's part, usually, but a
bug in the kernel, and should be reported to the mailing list.
There are some new kernel features that require user programs to take
advantage of them. You don't need them to run the new kernel on your
old system. I don't know what they all are, but they are mentioned in
the kernel config help - if you select the help when examining a
configuration option, sometimes the help will refer you to other
documentation or to a website that will tell you about the new software
you need.
I know of one feature that is probably too radical for most casual users
to want to mess with on an existing distribution install. This is the
DevFS filesystem. With DevFS, the /dev directory is initially empty; a
special file is created when a driver loads (either at boot time or when
its module is loaded) and disappears when the module is unloaded.
This is a vast architectural improvement but you probably don't want to
just slap it into an existing distro that expects its /dev files to stay
put, and there are some issues about managing these files that need to
be dealt with (like how to set the default permissions on one of these
dynamic files). Anybody but the
Linux From Scratch people will probably want to wait for a distro
that supports that as an integrated whole.
Monkeywrenching the Virtual Machine
I'd like to say a few additional words about why it is so important that
the quality of the kernel be so high, not just for Linux but for any
operating system. One could argue that it's just as critical that the
system libraries be error free, because an error in a library could
affect any program that uses it, but really the kernel is a special
case.
This is because of the non-local effects of having the virtual machine
break down.
Reliably functioning computer programs, both kernels and user-mode
programs, are virtual machines, of which the parts are the data
structures and the algorithms which operate on them. We have stacks,
queues, lists, subroutines, interrupts (both hardware interrupts in the
kernel and software interrupts, such as signals, in user programs),
threads, locks and so on.
Our programming languages, libraries and kernels give us a wide array of
machine parts and then we assemble these into very elaborate machines
that, if rendered as physical mechanisms, would put the finest sportscar
to shame - as long as the programs are written correctly.
The problem is if you've got certain kinds of bugs in your program, such
as heap corruption, buffer overflows, race conditions, failure to
protect a critical region, then all hell breaks loose. It's as if the
Army pulled a Howitzer up to your nice sportscar and put a shell through
the engine - but then it kept running. Programs don't explode when
they're damaged, they're happy to continue running along, executing each
instruction in sequence, but they're likely not doing what you want.
Consider yourself lucky if you get a segmentation violation - at least then
you find out right away something is wrong, rather than an hour later
after you've saved your work to disk into a file that turns out to be
corrupt.
I discussed this in a letter entitled Algorithms have
unclear boundaries that I originally wrote to the patent office and
also submitted to the Forum on
Risks to the Public in Computers and Related Systems. (I recommend
that anyone who uses computers read Risks - years of following
the
Risks Forum is what made me such a freak about software quality).
I once followed a discussion of programming assertions on the Usenet
News.
Assertions are tests included in debug builds of programs that test that
a condition
that must be true actually is true. If the condition is found to
be false
then the program is halted immediately so the programmer can check out
what's wrong. Assertions speed software development by catching your
mistakes quicker, doing
some testing automatically for you every time you run the program.
One common practice is to test that an impossible condition is not true,
for example,
if a variable is allowed to hold one of three values then you assert
that it does not contain
a fourth. But one participant in the discussion argued vehemently that
if he could
prove, through the logical flow of the program code as written, that an
impossible
condition could never occur, it was a waste of time to include
assertions that tested
for impossibilities.
I feel that he was wrong though, and it's likely he spends a lot of
extra time needlessly debugging his programs that he could save by using
more assertions. His
argument only holds while the virtual machine is intact. When the
virtual machine breaks down, impossible conditions start coming fast and
hard, and peppering your code with assertions will warn you right away
this is happening. It's impossible to know ahead of time which
impossible conditions to test for; in practice you test for them
wherever it's convenient.
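The discussion above is about compiled programs, where something like
C's assert macro is the usual tool, but the idea fits in a few lines of
shell too. This is only a sketch: the function name assert_state and the
three state names are made up for the example.

```shell
#!/bin/sh
# assert_state VALUE
# Halts immediately if VALUE is not one of the three legal states.
# (Hypothetical helper; the state names are invented for the example.)
assert_state() {
    case $1 in
        idle|running|stopped)
            return 0 ;;
        *)
            echo "assertion failed: impossible state '$1'" >&2
            exit 1 ;;
    esac
}

state=running
assert_state "$state"             # legal value: passes silently
echo "state is valid: $state"

# Run the failing case in a subshell so the demonstration keeps going.
( assert_state corrupted ) 2>/dev/null
echo "subshell exit status: $?"
```

The second call is the "impossible" fourth value. The program logic may
prove it can never occur, but the check costs almost nothing and it
fires the moment something else has scribbled on the variable.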
Now how does this long theoretical discussion apply to the kernel?
Normal user mode programs on modern operating systems like Linux run in
protected memory, in which the program has the perception it
possesses the
entire memory space of the whole machine and it is impossible for one
program to use a memory access to affect another. The protected memory
is managed by the kernel and
enforced by the memory management unit, a component of modern
microprocessors.
If the virtual machine of one user mode program breaks down, it may act
erratically or be terminated by the system, but it is unlikely that it
will harm any other programs.
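You can watch this containment from the command line. In the sketch
below a child shell sends itself SIGSEGV, the signal the kernel delivers
to a program that makes an invalid memory access. This only imitates a
crash (no bad pointer is actually involved), but the way the parent
sees the result is the same.

```shell
#!/bin/sh
# The child shell kills itself with SIGSEGV; the parent just collects
# the exit status and carries on.
sh -c 'kill -s SEGV $$' 2>/dev/null
status=$?

# By shell convention, a status above 128 means the child was killed
# by signal number (status - 128).
if [ "$status" -gt 128 ]; then
    echo "child was terminated by signal $((status - 128))"
fi
echo "parent shell is still running"
```

The child is gone, the parent is untouched, and nothing else on the
machine noticed.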
Besides keeping the system more reliable for users and protecting user
data, protected memory makes life easier for programmers because an
error in your program will at worst terminate the application. You find
out right away something is wrong, if you're using a debugger you get
helpful information on what the problem is, and your program doesn't
crash the machine so you don't have to wait to reboot to continue your
work.
Don't take protected memory for granted - there are lots of systems that
still don't have it. The classic Mac OS doesn't, and I've spent much
time in my career waiting for a Mac to restart because of some silly
pointer bug. The BSD/Mach-based Mac OS X that is currently in beta
testing will be Apple's first publicly released, widely used
protected memory OS (there was also A/UX, an early Mac Unix, but it
wasn't
meant for widespread consumption).
User mode programs on Linux can affect each other, but they do it
through carefully managed channels of communication that are directed by
the kernel. Most familiar are TCP/IP networking and files on the hard
drive, but there's also Unix domain sockets, pipes and signals.
Programs can expose the guts of their memory to direct access by
other programs by using shared memory via such methods as the mmap
system call, but they only do this when they want to and typically they
do not expose critical data.
These are all well-defined communications pathways. It is
possible for one program to crash another through one of these pathways
(for example, by writing a
corrupt file to disk that is used by another program) but it is much
harder in general and even then the problem is localized.
The kernel is a special case, though. In itself, it is a particularly
complex virtual
machine - both within its own operation, and in the system call and
special device file interface it presents to user programs - it presents
the hardware to the user programs as an external virtual machine. It
sits in the middle of everything, between each user program and the
hardware, between different pieces of hardware that communicate with
each other via hardware buses and DMA, and between user programs running
together
on the same machine and even on different machines that are
communicating via a network protocol.
The kernel effectively has root privilege on your machine. If a program
has lesser privilege, that is because the kernel is enforcing that
policy - but in reality, the kernel can do anything it wants if it
should get an inclination to.
It all runs in one big virtual machine. The kernel does not have
protected memory within itself. The situation is complicated because
parts of the kernel run within the virtual memory space of user
programs, and the kernel manages the memory spaces itself, and also
makes direct access to physical memory, so the memory architecture of
the running kernel is a complicated thing. But there's really no
protection against some part of the kernel screwing up another part.
And if the kernel's virtual machine breaks down, just a little bit, not
so much as to bring your machine crashing down, you can create
pathological communications pathways within the kernel.
An extreme case (I haven't seen this actually happen) would be a pointer
bug in a device driver that caused the driver to overwrite some critical
memory data structure that was used by a journaled filesystem like
ReiserFS. Lots of people think journaled filesystems are completely
reliable because they arrange to write filesystem metadata only
atomically. First the metadata is streamed into the journal, and only
after it is complete is it then copied to the filesystem itself, and it
is done in such a way that if the process is interrupted at any time (as
by a power failure) then the integrity of the filesystem will be
preserved.
But what if a buggy driver scrawls some bogus data into the memory used
by the journaled filesystem just before it's written to disk? Think
about that the next time you install the driver for some oddball
piece of hardware into the computer you're using to write your
memoirs.
Something I have seen happen many times, when I was a "Debug
Meister" at that Big Fruit Company in Cupertino, is for an error in the
operating system (the Mac OS System in this case) to screw up data
structures used by some other part of the system during some system
call.
When a user application later makes that system call, something else
happens
other than
was documented by Inside Macintosh - the system behaves incorrectly, or
returns bogus results.
The most straightforward and methodical way to test this is by writing
test tools that try out all the different system calls, and vary their
parameters over the acceptable ranges and ensure that the results
returned are also within the documented range. You also try making
system calls with illegal parameters to ensure that an appropriate error
code is returned.
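A real tool of this kind would call the system calls directly from C and
check the error codes, but the shape of the idea can be sketched with
standard utilities, which are thin wrappers over calls like mkdir, rmdir
and open. The helper names expect_ok and expect_fail are made up for
this example, and the handful of checks is nowhere near real coverage.

```shell
#!/bin/sh
# Tiny harness: each check runs a command that must either succeed
# (a legal request) or be refused (an illegal one).
failures=0

expect_ok() {
    "$@" >/dev/null 2>&1 && return 0
    echo "FAIL (should succeed): $*"
    failures=$((failures + 1))
}

expect_fail() {
    "$@" >/dev/null 2>&1 || return 0
    echo "FAIL (should be refused): $*"
    failures=$((failures + 1))
}

scratch=$(mktemp -d)
expect_ok   mkdir "$scratch/dir"          # legal request
expect_fail mkdir "$scratch/dir"          # already exists
expect_ok   touch "$scratch/dir/file"
expect_fail rmdir "$scratch/dir"          # directory not empty
expect_fail cat   "$scratch/missing"      # no such file
expect_ok   rm    "$scratch/dir/file"
expect_ok   rmdir "$scratch/dir"          # now empty, so this is legal

echo "checks failed: $failures"
```

Run the same script under the old kernel and the new one; any check that
passes on one and fails on the other is worth a report.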
This is valuable, but the tools are tedious to write and often don't
exercise the system all that well. I don't see many tools of this kind
available in the Free Software community but it would be valuable
to write some (that's part of what I did as a QA engineer at Apple).
What is also very valuable is to stimulate the kernel with many
applications that are otherwise expected to work reliably, because they
have worked reliably with previous kernels. There are far more programs
meant for some real purpose than there are test tools and so using these
you can get much broader coverage than a test tool would typically do.
They're usually more interesting to spend your days with too.
You want to try out these applications on lots of different hardware
configurations because of the problems of hardware-dependent code
creating pathological communications pathways with the programs. And in
fact at Apple it was very common that a tester would report that some
commercial application would work reliably on one model of Macintosh
with a new version of the System, but not another, and often this was
because of some bug in a hardware driver that surfaced in the
misbehavior of a video game or spreadsheet.
At this point I've probably scared you beyond wanting to test at all.
But the situation is not as grim as it might sound. The kernel wouldn't
work very well at all if it was not highly reliable to start with, and
there are some things about the kernel and the way it is developed that
make it much more robust than is likely to be the case with other
operating system kernels.
One factor that adds to Linux' reliability is that it is
cross-platform. It supports a number of different microprocessors as
well as the S/390 mainframe processor. It is used on a very
inhomogeneous population.
Another is that it is distributed as configurable source code. There
are widely varying options for some ways the kernel will work, and even
with one set of features for a given architecture you can choose to
optimize for a particular processor.
These are good news because they help to bring out latent bugs. Some
bugs only cause trouble rarely, or don't show up at all but rear their
head after a major modification to the system. But since the kernel is
distributed as source code, and built for many different systems, it is
likely that the different conditions of one system - often the fact that
memory is laid out differently, or that the code is built with different
options - will stimulate the bug repeatably on at least one configuration so
it can be found and fixed early.
Contrast this with, say, the Windows 2000 kernel, which only works on
Intel-architecture microprocessors, all of which run code copied from a
single build of the system by Microsoft's release engineers. This is a
very homogeneous population and they do not have the benefit that
varying so many parameters brings to Linux. (Note also that when Be,
Inc. ported the BeOS to the Intel architecture from PowerPC, although
they found that there was vastly more market interest in Pentium BeOS
than PowerPC BeOS, they still support the PowerPC version because it
helps to ensure the quality of their code - I'm sure that Microsoft, or
at least Microsoft's engineers, will ultimately regret abandoning
PowerPC and Alpha for this reason.)
(By the way, this is one benefit of doing cross-platform development of
user applications too. You definitely want to get people who use
different processors to work with your code and if possible make it work
with other compilers than gcc and on different operating systems
entirely - it makes your code very robust).
Also, many of the kernel developers have been using the development
kernels on their own personal machines for a long time and often have
subjected them to heavy stress testing loads. There's been a lot of
time in development for kernel bugs to be found and fixed.
So it's not all that likely that you're going to have really
brain-damaged behaviour.
I'm so concerned about it not because I think it will be common, but
because if it happens it will be hard for the people it happens to to
track it down. It would appear that there was a bug in a program that
wasn't at fault, and that program's developers probably wouldn't have
the same kernel bug, so they wouldn't be able to figure it out.
It would be best if such problems were found in testing rather than
in production machines, or on a machine owned by someone who wasn't an
expert user.
Lots of people think C++ is a screwed up language, and I must admit I've
seen my share of incomprehensible C++ code. But let's not mistake the
sins
committed in the name of the creator for his message. (I've found I can
write beautifully appealing code in C++.)
I highly recommend the chapters on software development and design in
The C++ Programming Language Special Edition to anyone, not just to
C++ programmers. (The first of these chapters is of most interest to
programmers in general, I think the next to anyone doing object-oriented
programming, and the third is specific to C++, as I recall).
Of most interest here is page 716, section 23.5.3, Individuals:
Use of design as described here places a premium on skillful designers
and programmers. Thus, it makes the choice of designers and programmers
critical to the success of an organization.
Managers often forget that organizations consist of individuals. A
popular notion is that programmers are equal and interchangeable. This
is a fallacy that can destroy an organization by driving out many of the
most effective individuals and condemning the remaining people to work
at levels well below their potential. Individuals are interchangeable
only if they are not allowed to take advantage of skills that raise them
above the absolute minimum required for the task in question. Thus, the
fiction of interchangeability is inhumane and inherently wasteful.
I was pretty astounded when I read that. Not that Bjarne had said it -
I'd exchanged a few emails with him and had always been impressed with
his thoughtfulness. But I've read quite a lot of stuff about the
management of programmers, formal methodologies and the like, and this
was the very first time I'd found the advice that management should be
humane.
Where do you enter the humanity into your chart in Microsoft
Project?
Regarding not sleeping and coding all by yourself, please read my piece
on
Large Scale Individual Software Development on WikiWikiWeb. Because
it's on Wiki, you can edit it and add comments. For a little taste:
"So he decided to watch what the government was doing, scale it
down to size, and live his life that way."
- Laurie Anderson (quoted approximately from memory)
I spent about 11 months last year writing a vector graphic editor in C++
using the
ZooLib cross-platform
application framework. When I made a build for the client, I usually
delivered for both Mac and Windows from the exact same codebase (it also
supports BeOS and POSIX platforms such as Linux).
My client got the bright idea to require me to deliver a new build once
a week. She wanted, among other things, to be able to show the
investors we were making regular progress towards our goal by actually
demonstrating my development builds to them.
So for several months my focus was on little more than getting them the
next build, with a few demonstrable, completely implemented and debugged
new visible features each week, rather than on making meaningful forward
progress on the program by doing more important things like laying
architectural groundwork that may not be immediately visible to the
user. I wasn't going to be caught dead delivering a build (way before
alpha) that crashed in front of an investor.
I finally called her up in desperation and told her I simply could not
work under such conditions anymore.
The result of that? Development progress accelerated.
But there was no satisfying the client, who wanted the product as soon
as possible and constantly called to check on progress. They made it
clear they were upset at me taking four days off from work for my own
wedding on July 22. I went months at a time without a day off and
many times worked 24 hours at a stretch.
There was simply no satisfying them though. They were not technical
people and although they claimed they trusted my judgement and
explanations, I often had the feeling they were just pretending to
understand me when I tried to explain to them why the program they had
spec'ed for me was very difficult to write.
In the end, a substantial amount of money was due and I had just worked
a 29 hour day slaving to get them their feature complete beta. Although
we had a regular invoicing schedule they asked me to do them the favor
of not charging them until the beta was delivered. I had a highly
debugged program, complete but for one feature, which although it was
important, wouldn't be hard to write once I had a few days to rest.
I'd gone a long time without getting paid because of this favor I'd
granted them, and because of it they owed me a lot more money than they
would for the regular invoice.
At the end of this 29 hour day the client called to say that they
wouldn't give me the money they owed me until I delivered the product
feature complete. They didn't owe me for feature completion - they owed
me for several weeks of work I'd done before.
I told them if they did that "I'd terminate our business relationship"
and hung up the phone. A few days later I received an email
acknowledging that I'd ended the contract (which I hadn't - I only told
them I would if they didn't pay me).
Increasingly stern letters from my attorney have gone unanswered. I'm
trying to figure out a good collection agency in San Francisco. I'd
really meant to put them into collections just before Christmas to
return the special Christmas gift they gave me but I was too busy trying
to recover from the mess.
That's what you get for not working like a normal human being. Beaten
to death and kicked in the teeth as thanks.
They seemed like such nice people when we started the project.
The best advice I can give anyone here is to learn to be a good judge of
character - I have a hard time with that, but my wife is very good, and
many times I've wished I'd listened to her sooner. Anyone can seem like
honest, good-natured people when things are going well, but will
they still be there to support you when, say, the high-tech stock market
collapses, taking with it the investment community's interest in
internet startups?