Advogato: Blog for bonzini

Time to write something new...

I have done some work on sed lately: there were some bugs caused by bad interactions between s///NUM and an LHS that may be empty. sed has a lot of corner cases like this where it is supposed to Just Work; then working around a strange behavior means also working around the correct behavior in the cases where the strange behavior was right, and so on. You easily end up with 200 lines of ifs!
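To give an idea of why this gets hairy, here is a minimal Python sketch of "replace only the NUM-th match". It is not sed's code. It assumes that "empty" means the pattern can match the empty string, and the rule that an empty match right after the previous match does not count is an assumption of the sketch. The name sub_nth is made up for the illustration.

```python
import re


def sub_nth(pattern, repl, text, n):
    """Replace only the n-th (1-based) match of pattern in text.

    Hypothetical helper for illustration; repl is inserted literally.
    """
    count = 0
    prev_end = -1  # end offset of the last match that was counted
    for m in re.finditer(pattern, text):
        # Assumed rule: an empty match that starts exactly where the
        # previous counted match ended is not a match of its own.
        if m.start() == m.end() and m.start() == prev_end:
            continue
        count += 1
        prev_end = m.end()
        if count == n:
            return text[:m.start()] + repl + text[m.end():]
    # Fewer than n matches: leave the text alone.
    return text


print(sub_nth(r"x*", "-", "abxd", 3))  # the third match is the "x"
```

Even in this toy version, the special case for empty matches is already one of the "ifs" mentioned above.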

Besides, sed is now multibyte-clean, and I'm ready to release 4.0.7 (the final 4.0.x release) and 4.0a (a first alpha for 4.1). I don't have much time for them, though, and I have higher priorities, such as fixing up that assembly memcmp...

Today I'll spin a prerelease for GNU Smalltalk 2.1! I hope to release around Easter. I really like this release. I took the time to insert many (optional) sanity checks, so despite some big architectural changes it turns out to be really stable. The changes even fix bugs in the old code: the overhauled Processes, in particular, were meant to allow debugging, but they also fixed some nasty SIGSEGVs in the browser. The browser has grown into something fast, stable and usable enough for production work... even though I am mostly a vi guy (elvis, not vim!), I do use it sometimes.

I'm also very pleased with the garbage collection: it works like a charm. Yesterday I suspected a GC bug and was ready to spend an hour staring at the thousands of lines that the GC outputs in debugging mode, but the bug turned out to be in years-old code and not in the GC, which is only three months old! The culprit was the code that computes stack heights to find out how big to make heap-allocated activation records. That code turned out to be very bad (I must say that I wrote it about 3 weeks after I picked up C...), so I rewrote it completely; the new version includes more sanity checks, yet it is even faster.
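For the curious, the idea of computing a stack height can be sketched in a few lines of Python. This is not the GNU Smalltalk code: it assumes straight-line bytecode with no jumps, and it describes each bytecode only by a hypothetical (pops, pushes) pair. The underflow check is the kind of sanity check I mean.

```python
def max_stack_depth(effects):
    """Return the deepest stack reached by straight-line code.

    effects is a list of (pops, pushes) pairs, one per bytecode.
    Branches are deliberately ignored in this sketch.
    """
    depth = 0
    deepest = 0
    for index, (pops, pushes) in enumerate(effects):
        # Sanity check: a bytecode cannot pop more than is on the stack.
        if pops > depth:
            raise ValueError(
                f"bytecode {index} pops {pops} values, only {depth} on the stack"
            )
        depth += pushes - pops
        deepest = max(deepest, depth)
    return deepest


# Something like "a + b + c" followed by a return:
# push a, push b, send +, push c, send +, return top
code = [(0, 1), (0, 1), (2, 1), (0, 1), (2, 1), (1, 0)]
print(max_stack_depth(code))
```

The result is the number of stack slots an activation record for that code would need; real code also has to follow every branch and take the maximum over all paths.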

Ah, BTW, the XML idea that I wrote about in my first diary entry has developed into a nice non-validating parser written using big big big regular expressions (like the RFC 822 parser in Mastering Regular Expressions), which is quite fast and very object oriented (given that it is PHP). I wrote some very nice SAX handlers, including ones that write XML and HTML and ones that save and replay SAX events, which is the basis for templates. Maybe some day I'll finish it; in the meantime, a colleague at work and I have tried out the parser and its companion classes, and he liked it a lot.
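The general shape is easy to show in miniature. The sketch below is in Python rather than PHP and is not the actual parser: the regular expression is tiny and only understands elements, double-quoted attributes and text (no comments, entities or CDATA), and the names parse, Recorder and Writer are made up for the illustration.

```python
import re

# One alternative for tags, one for character data.
TOKEN = re.compile(
    r'<(/?)([A-Za-z_][\w.-]*)((?:\s+[\w.-]+\s*=\s*"[^"]*")*)\s*(/?)>|([^<]+)'
)
ATTR = re.compile(r'([\w.-]+)\s*=\s*"([^"]*)"')


def parse(text, handler):
    """Non-validating: turn each token into a SAX-style event."""
    for m in TOKEN.finditer(text):
        close, name, attrs, empty, chars = m.groups()
        if chars is not None:
            handler.characters(chars)
        elif close:
            handler.end_element(name)
        else:
            handler.start_element(name, dict(ATTR.findall(attrs)))
            if empty:  # <br/> is reported as a start plus an end
                handler.end_element(name)


class Recorder:
    """Handler that saves events so they can be replayed later."""

    def __init__(self):
        self.events = []

    def start_element(self, name, attrs):
        self.events.append(("start_element", name, attrs))

    def end_element(self, name):
        self.events.append(("end_element", name))

    def characters(self, text):
        self.events.append(("characters", text))

    def replay(self, handler):
        for kind, *args in self.events:
            getattr(handler, kind)(*args)


class Writer:
    """Handler that writes the events back out as markup."""

    def __init__(self):
        self.parts = []

    def start_element(self, name, attrs):
        attr_text = "".join(f' {k}="{v}"' for k, v in attrs.items())
        self.parts.append(f"<{name}{attr_text}>")

    def end_element(self, name):
        self.parts.append(f"</{name}>")

    def characters(self, text):
        self.parts.append(text)

    def result(self):
        return "".join(self.parts)


recorder = Recorder()
parse('<a href="x">hi<br/></a>', recorder)
writer = Writer()
recorder.replay(writer)
print(writer.result())
```

A recorded event list can be replayed as many times as needed into any handler, which is roughly what makes it usable as a template.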

I'm going to quit this job in a week (after we finish a big big big project on Monday), and then I'll start my thesis. I have not heard anything from the professor, but I hate to pick up the phone as I don't want to be rude... The thesis should be about adaptively optimizing Smalltalk bytecode. It sounds like a big job: I have to redo some parts of the GNU Smalltalk JIT, add polymorphic inline caches, and then write a lot of Smalltalk code for the optimizers. But it can be done, and it looks exciting (especially when compared to web programming...).
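In case polymorphic inline caches are unfamiliar: each send site remembers the methods it found for the last few receiver classes, so that the full lookup runs only on a miss. Here is a rough Python sketch of the idea. It has nothing to do with how the GNU Smalltalk JIT does or will do it, and the CallSite class, its four-entry limit and the use of getattr as the "full lookup" are all assumptions made for the illustration.

```python
class CallSite:
    """One send site with a small per-site cache of (class, method) pairs."""

    MAX_ENTRIES = 4  # arbitrary limit chosen for the sketch

    def __init__(self, selector):
        self.selector = selector
        self.cache = []
        self.misses = 0

    def send(self, receiver, *args):
        cls = type(receiver)
        # Fast path: compare the receiver's class with the cached ones.
        for cached_cls, method in self.cache:
            if cached_cls is cls:
                return method(receiver, *args)
        # Slow path: full lookup, then remember the result if there is room.
        self.misses += 1
        method = getattr(cls, self.selector)
        if len(self.cache) < self.MAX_ENTRIES:
            self.cache.append((cls, method))
        return method(receiver, *args)


class Circle:
    def sides(self):
        return 0


class Square:
    def sides(self):
        return 4


site = CallSite("sides")
for shape in [Circle(), Square(), Circle(), Square()]:
    site.send(shape)
print(site.misses)  # one miss per distinct receiver class
```

The cache contents also tell an optimizer which classes actually show up at a send site, which is presumably where the adaptive part would come in.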

That's it for today.

28 Mar 2003 bonzini » (Master)