12 Aug 2002 Bram   » (Master)

Formal Methods

Whether formal methods have any utility for practical code writing is an interesting debate. As you might expect, I have some thoughts on the matter.

Code winds up much better if it is rewritten pragmatically with experience. I think there isn't a single line of BitTorrent which hasn't been rewritten at least twice. If I had been using formal methods, the whole program would essentially have had to be proven once for each rewrite. It's much easier to write the whole program once using more direct methods and then audit it.

The much-vaunted slow programming of the space shuttle is mostly a triumph of bureaucracy. Switching from a B&D to a write-then-audit methodology would drastically reduce costs and result in better code.

While I'm still not convinced of the practical utility of formal methods, I'm completely sold on unit tests. A unit test is a test which exercises some code without requiring that the rest of the program be linked in. The approach to writing them is to make them all pass/fail, and to have a script which runs all the unit tests and reports failures. Whenever a unit test is failing, fixing it becomes the top priority. Anyone who doesn't fix their unit test breakage, with the excuse that getting all the tests to pass is impractical anyway, should be fired (or at least have their commit access revoked).
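
To make the pass/fail discipline concrete, here's a minimal sketch using Python's standard unittest module (fraction_done is a made-up stand-in for real code under test, not anything from BitTorrent):

    import unittest

    def fraction_done(have, total):
        # Hypothetical unit under test: what fraction of pieces do we have?
        return float(have) / total

    class TestFractionDone(unittest.TestCase):
        def test_nothing_yet(self):
            self.assertEqual(fraction_done(0, 20), 0.0)

        def test_complete(self):
            self.assertEqual(fraction_done(20, 20), 1.0)

    if __name__ == '__main__':
        # unittest.main() exits nonzero on any failure, so a script that
        # runs every test file can report breakage mechanically.
        unittest.main()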

While formal methods clearly slow down development, unit testing actually speeds it up by reducing debugging time, making it not only practical but easily justifiable.

Writing good unit tests is more art than science. The general idea is to make the unit test code simpler than the code it's testing, and to have the test's assertions approach the problem from a fundamentally different perspective than the code itself. Formal methods, by contrast, wind up repeating the code a lot, and thus aren't good at catching specification bugs.
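
One way to get that different perspective is to assert invariants the output must satisfy rather than mirroring the code's own logic. A sketch, with a made-up function:

    import random

    def piece_order(n):
        # Hypothetical unit under test: a randomized download order.
        order = list(range(n))
        random.shuffle(order)
        return order

    def test_piece_order():
        result = piece_order(50)
        # The test never re-implements the shuffle; it checks the result
        # from the other direction: it must be a permutation of 0..49.
        assert sorted(result) == list(range(50))

    test_piece_order()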

After two years or so of writing unit tests for most of my code, I find the thought of debugging something without them absolutely horrifying, and can't bring myself even to toy with newly written code without writing some tests first. (I generally don't test bootstrap and UI code, though, since those aren't straightforwardly encapsulable.)

The above approach to unit tests is taken from extreme programming. However, extreme programming people seem to think that unit == class, which is wrong, and their encapsulated APIs don't seem to be nearly as hermetically sealed as mine. Units can span several classes, and if they do, those classes should internally have complete access to each other's data, while the outside world should only have access to data and methods if it is explicitly granted.
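
Here's a sketch of the distinction, with made-up class names: two classes form one unit and poke at each other's data freely, while the outside world gets a single narrow interface:

    class _Bucket(object):
        # Internal to the unit; the leading underscore warns outsiders off.
        def __init__(self):
            self.items = []

    class RateLimiter(object):
        # The unit's public face: outsiders get exactly these methods.
        def __init__(self):
            self._bucket = _Bucket()

        def enqueue(self, item):
            # Within the unit, direct access to _Bucket's data is fine.
            self._bucket.items.append(item)

        def pending(self):
            return len(self._bucket.items)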

The bugs I've run across in BitTorrent generally fall into the following categories, each of which occurs about an order of magnitude less often than the one before -

  1. Bugs found in proofreading. My skill at this depends a lot on how tired I am.

  2. Bugs found writing unit tests. This is what really causes my debug time to be so small.

  3. Bugs which manage to slip past unit tests but cause a stack trace when invoked and are fixed within a few minutes of being found.

  4. Bugs which involve subtle misunderstandings of third-party APIs. There are far too many of these, since basically all third-party APIs are written horrendously (wx, select(), open()), but I generally get them fixed eventually, once I understand what the actual semantics are.

  5. There was one (count 'em, one) bug which managed to slip past all the testing and remain in the program for months, resulting in disastrous performance. The wrong integer was getting passed in by bootstrap code, and the generally robust nature of the underlying APIs caused the problem to be manifested as extremely poor performance rather than an outright bug. Eventually I observed some behavior which unambiguously indicated a bug, and managed to fix the problem after a few hours (and a lot of whining on my part).

    Notably, this problem could just as easily have slipped past formal methods, but it would have been caught very easily had I been religious about passing parameters by name rather than position (as sketched in the code after this list).

    (The specific bug, in case you're curious, was that the number of pieces to pipeline from any one peer was set at 32768 instead of 5, often resulting in the entire file being requested from one peer. Since BitTorrent usually has a policy of not requesting the same piece from more than one peer at once, there were other connections which were going unutilized since they had nothing else to request. This problem would have been even further masked had endgame mode been written at the time. Robust APIs can eliminate the possibility of a huge number of bugs, but the bugs which remain can sometimes be very strongly masked.)
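
To illustrate the by-name point (the numbers match the bug, but the function is made up, not BitTorrent's actual API):

    def make_downloader(max_pipeline, piece_size):
        # A sanity check helps too, but by-name passing is the real
        # defense against transposed arguments.
        assert max_pipeline < 100
        return (max_pipeline, piece_size)

    # Positionally, transposing the arguments silently sets the pipeline
    # depth to 32768, the kind of bug which masquerades as bad performance:
    #     make_downloader(2 ** 15, 5)

    # By name, the same transposition is glaring:
    make_downloader(max_pipeline=5, piece_size=2 ** 15)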

Gollum Gollum

Someone posted an Advogato diary entry (I'm not sure who; it's scrolled off recentlog) linking to a paper on exhaustive search for Golomb rulers. Unfortunately, the site was down and archive.org didn't have the paper. I'd like to read it.

Conjecture Proven

Robert M. Solovay sent me a proof of my conjecture about prescient busy beaver numbers. Here it is (slightly rearranged) -

I can prove your conjecture [and much more]. For example, there is a d such that

pbb(x) > bb(bb(x)) for all x > d.

This is stronger, since bb(x) > h(x) for all sufficiently large x for *any* recursive function h.

All Turing machines considered in this letter have one two-way infinite tape and two symbols, 1 and "blank".

Let me state a series of results and then prove them:

Definition: K is the set of Gödel numbers of TMs that, started on a blank tape, eventually halt. (A Gödel number is a natural-number encoding of a Turing machine.)

bbk is the Busy Beaver function for oracle TM's that use the oracle K.

It follows that for any function h which is recursive in K, we have bbk(n) >= h(n) for all but finitely many n.

In particular, bbk(n) >= bb(bb(2^n)) for all but finitely many n, since the function taking n to bb(bb(2^n)) is recursive in K (an oracle for the halting problem lets one compute bb).

Assertion: There is a positive constant C such that:

  1. bbk(Cn * log n) >= pbb(n) for all but finitely many n.

  2. pbb(Cn * log n) >= bbk(n) for all but finitely many n.

This is the assertion whose proof I will sketch. But a little more work could improve this by replacing Cn * log n by just Cn.

Let's start with the proof of 2). Let M be a TM with at most n states [which uses an oracle for K] that halts in precisely bbk(n) steps. We will design an ordinary TM, N, with O(n log n) states that visits state 1 a finite number of times but more than bbk(n) times.

It clearly takes O(n log n) bits to specify the table of the TM M.

N will first use O(n log n) states to write this description of M on its tape.

Comments: Using a trick of Chaitin, one could get by with just O(n) states here. It is at this point that the O(n log n) comes in. The remainder of the proof uses only O(1) states.

Officially, our TM has only two symbols and one tape. But it can simulate a k-tape, n-symbol TM with r states using Cr states of its own. [The C depends on k and n.] I find this remark useful in thinking through the details of programming N. I will, however, suppress these details in this sketch.

N will construct a series of approximations to the set K. The first approximation is that K is the empty set. It then simulates M using this approximation to K till one of two things happens:

(a) the simulation halts;

(b) for some n that the simulation queried the oracle about, N discovers that the approximation to K gave the wrong answer.

If (a) happens, the program continues to look for an event of type (b). If the program ever finds an event of type (b), it redoes the simulation of M's action using the newly updated version of K.

Eventually, the program will find the correct computation of M from the true K.

We arrange for N to enter state 1 at every step of every simulation it runs.

Because we eventually converge on the true computation of M, state 1 is entered only finitely many times. Because we eventually simulate the true computation, which itself takes bbk(n) steps, state 1 is entered at least bbk(n) times.
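
The restart trick is easier to see in code. Here's a toy, runnable rendition in Python; the true oracle K (the halting problem) is replaced by a decidable stand-in so the loop terminates, and simulate is a made-up machine which does nothing but query the oracle:

    def true_oracle(n):
        # Decidable stand-in for membership in K; in the proof this is
        # the (undecidable) halting problem.
        return n % 3 == 0

    def simulate(oracle, queries):
        # Made-up machine M: it just asks the oracle about 0..9,
        # recording every query so wrong answers can be detected.
        for n in range(10):
            queries.add(n)
            oracle(n)

    def run_with_approximations():
        approx = set()       # first approximation: K is empty
        entries = 0          # counts entries into "state 1"
        while True:
            queries = set()
            simulate(lambda n: n in approx, queries)
            entries += 1     # stand-in for one step of one simulation
            # Event of type (b): a query the approximation got wrong.
            wrong = [n for n in queries if (n in approx) != true_oracle(n)]
            if not wrong:
                return entries   # only correct answers were used
            # Update the approximation to K and redo the simulation.
            approx.update(n for n in wrong if true_oracle(n))

    print(run_with_approximations())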

The proof of claim 1 is similar. We now let M be a TM of at most n states that enters state 1 precisely pbb(n) times.

Here is an oracle TM, N, [using an oracle for K] with O(n log n) states that eventually halts but runs at least pbb(n) steps before halting.

N first uses O(n log n) states to write down a description of M.

It now simulates M. But before doing any step of M it asks "Will M ever again enter state 1?" [It can settle this question using its oracle for the halting problem.] If N ever gets the answer NO to this question, it immediately halts.
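
Again as a toy rendition: the oracle query "will M ever again enter state 1?" is answered here by lookahead into a precomputed run, purely so the sketch is runnable:

    def steps_before_halting(trace):
        # trace is a made-up precomputed run: the states M visits in order.
        for i in range(len(trace)):
            # Oracle query: will M ever again enter state 1?
            if 1 not in trace[i:]:
                return i     # the answer is NO: halt immediately
        return len(trace)

    # M enters state 1 twice; N halts right after the last entry.
    print(steps_before_halting([0, 1, 2, 1, 3, 4]))   # prints 4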

What I find interesting about this proof isn't so much that it shows that bb grows more slowly than pbb, which seemed fairly obvious, but that it shows that pbb and bbk grow at essentially the same rate, which contradicted my intuition that pbb grows much more slowly than bbk. I always find unexpected results more interesting than expected ones.
