I've been thinking about static analysis for finding bugs off and on for the past 18 months or so; recently, I've been looking for a good open source static analysis tool. Unless I've managed to miss it, ISTM there isn't one.
Uno is the closest I've found, but it is pretty unpolished, and I don't believe it is DFSG free.
sparse, Linus' checker, may or may not be cool; I've tried to see what it's capable of, but wasn't able to make it catch anything more significant than minor stylistic errors in C code (e.g. extern function definitions, 0 used in a pointer context (rather than NULL), that sort of thing). (Side note: sparse doesn't even have a website, and it's primarily available via bk. Does Linus not want people to use his software?) I'll definitely take a closer look at this, anyway.
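To make the two complaints above concrete, here is a small made-up C fragment that contains both of them, next to a cleaned-up version. The struct and function names are invented for illustration, and this is only a sketch of the kind of code that triggers such warnings; it does not reproduce sparse's actual output.

```c
#include <stddef.h>

/* Hypothetical list type, used only for this illustration. */
struct node {
    struct node *next;
    int value;
};

/* Style issue 1: 'extern' on a function *definition* is legal but redundant.
 * Style issue 2: plain 0 is used where a null pointer is meant. */
extern int list_length(const struct node *head)
{
    int n = 0;
    const struct node *p;

    for (p = head; p != 0; p = p->next)
        n++;
    return n;
}

/* Same behaviour with both stylistic complaints addressed. */
int list_length_clean(const struct node *head)
{
    int n = 0;
    const struct node *p;

    for (p = head; p != NULL; p = p->next)
        n++;
    return n;
}
```

Both functions compute the same result, which is the point: warnings of this kind are about style, not about real bugs.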
There are some more academic tools (like Uno, only even less practical). There's also Splint, but last I tried it, it emitted way too many bogus error reports, and required tons of annotations to be of any use.
Some random thoughts about the design of an open source static analysis tool:
Speaking of tools for finding bugs, I've got to find some time to make valgrind understand region-based memory allocation.
I took about a month off work. I was in Perth for about two weeks to celebrate Christmas with my aunt's family, and then in Cairns for about 10 days, doing some scuba diving with a friend who was over from Canada.
PostgreSQL
Started back at work last Monday. 8.0.0 got released, which is great -- this release has a ton of new functionality that I'm really happy about.
The tree is now open for 8.1 work, so I got a chance to check in some stuff that's been sitting on my hard drive for a while. Sped up rtree scan performance by about 10%; I have similar patches for GiST which I'll commit soon. The GiST stuff also overhauls memory management: GiST user-provided functions will now always be invoked in a short-lived memory context, so people implementing GiST-based indexes won't need to worry about freeing palloc'ed memory. One of the lessons of working on the PG source: region-based memory allocation is a Good Thing.
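For readers who haven't met region-based allocation before, here is a minimal sketch of the idea in plain C. It is not PostgreSQL's memory context code: the names Region, region_alloc, region_reset and call_with_scratch_region are all invented for this example, and the 4096-byte block size is an arbitrary assumption. The point is that individual allocations are never freed one by one; the whole region is released at once after the callback returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define REGION_BLOCK_SIZE 4096               /* arbitrary default block size */
#define REGION_ALIGN _Alignof(max_align_t)   /* always a power of two */

/* One malloc'ed block owned by a region; blocks form a linked list. */
typedef struct RegionBlock {
    struct RegionBlock *next;
    size_t used;          /* bytes of data[] handed out so far */
    size_t capacity;      /* usable bytes in data[] */
    max_align_t data[];   /* suitably aligned payload */
} RegionBlock;

typedef struct Region {
    RegionBlock *blocks;  /* most recently created block first */
} Region;

void region_init(Region *r)
{
    assert(r != NULL);
    r->blocks = NULL;
}

/* Allocate 'size' bytes from the region; returns NULL on failure. */
void *region_alloc(Region *r, size_t size)
{
    RegionBlock *b;
    void *p;

    assert(r != NULL);
    if (size > SIZE_MAX / 2)
        return NULL;      /* guard against overflow in the arithmetic below */
    /* Round up so every returned pointer stays maximally aligned. */
    size = (size + REGION_ALIGN - 1) & ~(REGION_ALIGN - 1);

    b = r->blocks;
    if (b == NULL || b->capacity - b->used < size) {
        size_t cap = size > REGION_BLOCK_SIZE ? size : REGION_BLOCK_SIZE;

        b = malloc(sizeof(RegionBlock) + cap);
        if (b == NULL)
            return NULL;
        b->next = r->blocks;
        b->used = 0;
        b->capacity = cap;
        r->blocks = b;
    }
    p = (unsigned char *)b->data + b->used;
    b->used += size;
    return p;
}

/* Release everything allocated from the region in one go. */
void region_reset(Region *r)
{
    RegionBlock *b;

    assert(r != NULL);
    b = r->blocks;
    while (b != NULL) {
        RegionBlock *next = b->next;
        free(b);
        b = next;
    }
    r->blocks = NULL;
}

/* Hypothetical callback signature: user code allocates from 'scratch'. */
typedef int (*region_callback)(Region *scratch, void *arg);

/* Run a user-provided function in a short-lived region, then discard it. */
int call_with_scratch_region(region_callback fn, void *arg)
{
    Region scratch;
    int result;

    region_init(&scratch);
    result = fn(&scratch, arg);
    region_reset(&scratch);   /* callback never has to free anything */
    return result;
}
```

A callback passed to call_with_scratch_region can call region_alloc as often as it likes and simply return; anything it allocated from the scratch region is gone once the wrapper returns, so nothing allocated there may be kept past the call.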
While cleaning up various things in PL/PgSQL (mostly memory management related), I noticed a buffer overrun in the parsing of refcursors. Patched that for 7.4 and 8.0.
I took a look at adding support for GCC's profile-guided optimization to the build system. I'm a little confused -- why don't more projects take advantage of this? Particularly when, say, building RPM packages, it would make sense to trade some extra compile-time for a few % improvement in runtime performance. On the other hand, I ran into some problems actually using the PGO support (e.g. this), so perhaps that's one reason PGO support hasn't (AFAICS) taken off.
robocoder: Thanks for mentioning the pending patent on ARC. Unfortunately, that came as quite a surprise. I passed on the bad news to the pgsql-hackers list, which started a spirited discussion of the topic. I'm not sure what the resolution to the problem is going to be; personally I think we ought to replace ARC with a simple LRU scheme in 8.0.1, and worry about a better, unencumbered replacement for 8.1. But in any case I'm glad we found out about the problem sooner rather than later.
Books
Just started reading Paul Graham's Hackers and Painters, and I'm really enjoying it so far. I also have Conrad Black's biography of FDR to start (1300 pages, yum).
An interesting statistic from the Economist:
One of the best statistics of the campaign is that people worth $1m-10m supported Mr Bush by a 63-37% margin, whereas those worth more than $10m favoured Mr Kerry 59-41%.
Robert Kagan's article Power and Weakness in Policy Review was written in 2002, but it's still a fascinating read. Choice quote:
Today's transatlantic problem, in short, is not a George Bush problem. It is a power problem. American military strength has produced a propensity to use that strength. Europe's military weakness has produced a perfectly understandable aversion to the exercise of military power. Indeed, it has produced a powerful European interest in inhabiting a world where strength doesn't matter, where international law and international institutions predominate, where unilateral action by powerful nations is forbidden, where all nations regardless of their strength have equal rights and are equally protected by commonly agreed-upon international rules of behavior. Europeans have a deep interest in devaluing and eventually eradicating the brutal laws of an anarchic, Hobbesian world where power is the ultimate determinant of national security and success.
Nothing can be more fallacious than to found our political calculations on arithmetical principles. Sixty or seventy men may be more properly trusted with a given degree of power than six or seven. But it does not follow that six or seven hundred would be proportionably a better depositary. And if we carry on the supposition to six or seven thousand, the whole reasoning ought to be reversed. The truth is, that in all cases a certain number at least seems to be necessary to secure the benefits of free consultation and discussion, and to guard against too easy a combination for improper purposes; as, on the other hand, the number ought at most to be kept within a certain limit, in order to avoid the confusion and intemperance of a multitude. In all very numerous assemblies, of whatever character composed, passion never fails to wrest the sceptre from reason. Had every Athenian citizen been a Socrates, every Athenian assembly would still have been a mob.
-- James Madison, The Federalist #55
There was an interesting thread on the GCC development list about what kind of optimizations can legally be performed on "explicit storage" (e.g. malloc in C, operator new in C++). Various folks raised concerns about how this changes programmer expectations and whether it is allowed by the C or C++ standards. Interestingly, Chris Lattner pointed out that LLVM actually implements this optimization, at least for malloc (as usual, C++ makes things more complicated, but even then LLVM could theoretically perform the optimization at link-time).
Since my last blog entry, I:
So, a lot is new :)
Okay, I'll admit it: I'm completely, helplessly addicted to editing Wikipedia. I've always thought the project was cool and I've contributed a few edits in the past, but the habit has really gotten out of hand recently:
Winding up the summer
I took a break from the OSS world this summer to do another internship at a commercial software firm in Seattle (I did the same thing last summer). The group I was working in was doing some really cool work, although unfortunately the details are NDA. As fun as that was, I must confess it's a pleasure to get back to working on OSS.
Poker
I'm getting increasingly annoyed playing low-limit hold'em at casinos. Like any good geek, before playing poker for sizeable amounts of money I read a few books on the subject and learnt how to play "properly" -- tight and aggressive. "Get your money in when you've got the best of it, protect it when you don't," as they say. While I think I'm playing well, the results haven't been favourable: I've ended down the last four times I've gone to a casino. The most annoying thing is that I can't find any fault in my play -- given a second chance to play all those hands again, I'd play them mostly the same way. I'm tempted to blame my losses on bad luck / cold cards, but of course that's always easy to do. On the other hand, I've been cleaning up playing no limit online, so at least that's something.
Garden State
I saw Garden State recently and absolutely loved it. Natalie Portman stole the movie, I think. I bought the soundtrack the next day, which is great too. See this movie!