20 Jan 2004 bonzini   » (Master)

Lately I have been working on my thesis; I can now run Java benchmarks in GNU Smalltalk even with the just-in-time compiler enabled. I have to print it on Friday and hand it in by Wednesday.

Still a long way from releasing 2.2, though. The release cycle is indeed longer than for previous releases; maybe it will be out by the end of 2004. I still have to do some work on security (but maybe I'll just keep it on my hard drive as a patch and drop it, it's quite hard stuff), and since I switched the bytecode set I'd like to analyze which variables escape, so that activation records can be deallocated LIFO more often than now. It's one of the worst things in GNU Smalltalk, really.

Did a lot of work on the GNU regex implementation in glibc 2.3; it is now less braindead than it used to be -- many thanks to Jakub Jelinek and Ulrich Drepper for supporting this, really. For example, it prunes anchored searches instead of trying to match ^a at every position in the string. It does not care about multicharacter collating symbols (like the Danish AA) in the C locale, which has none. It has some micro-optimizations, such as removing two or three parameters from some frequently called functions, and completely rewritten routines to handle sets of DFA nodes. And it does have some bugfixes, a couple of which removed quadratic bottlenecks. But it is still much slower than PCRE and grep. Sigh. :-(
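To make the anchored-search point concrete, here is a toy sketch in C. It is not the glibc code: `toy_search` and `match_at` are made-up names, the "pattern" is just a literal string with an optional leading ^, and single-line semantics (no REG_NEWLINE) are assumed. It only counts how many start positions a searcher tries with and without pruning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy matcher (hypothetical helper): does the literal `lit` occur in
   `text` starting exactly at offset `pos`? */
static int match_at(const char *text, size_t len, size_t pos, const char *lit)
{
    size_t n = strlen(lit);
    return pos + n <= len && memcmp(text + pos, lit, n) == 0;
}

/* Search `text` for `lit`. If `anchored` is nonzero the pattern behaves
   like ^lit, so it can only match at offset 0. If `prune` is nonzero,
   the search gives up right after offset 0 for anchored patterns instead
   of visiting every later start position.
   Returns the match offset or -1; *attempts counts start positions tried. */
long toy_search(const char *text, const char *lit,
                int anchored, int prune, size_t *attempts)
{
    assert(text != NULL && lit != NULL && attempts != NULL);

    size_t len = strlen(text);
    *attempts = 0;

    for (size_t pos = 0; pos <= len; pos++) {
        if (anchored && prune && pos > 0)
            break;              /* ^ cannot match past offset 0: stop */
        (*attempts)++;
        if (anchored && pos > 0)
            continue;           /* unpruned: a wasted attempt */
        if (match_at(text, len, pos, lit))
            return (long)pos;
    }
    return -1;
}
```

Searching an eight-character string of b's for ^a tries nine start positions without pruning and exactly one with it. On a long input that is the difference between a full scan and a constant-time failure.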

sed 4.1 is shooting for POSIX conformance, and I'm quite close to it: I still have a fix to apply that makes line-number addresses be checked without needing to be enabled/disabled the way regex addresses are. It fixes cases like the following:

You'd expect it to print lines from 6 on, but the branch prevents the 1,5 address from being disabled and -- surprise -- line 9 is not printed!

I bought two books on Amazon. Code Reading is quite cool, though most of the tricks it shows are quite basic, because it shows really a lot of code of varying quality and prepares you so that you get fewer surprises; I liked the chapters on large projects. Debugging: nine indispensable rules is much better; again, it covers something everybody does sometimes, but it presents it so convincingly that you really get to follow the rules. I rarely if ever buy computer books, but I am really satisfied with these two.

I prepared my first gcc patch! It fixes a problem with unsigned int * long long multiplication. I still do not have a copyright assignment, but I hope the patch is small enough to be accepted. However, long long still sucks: gcc cannot optimize 64-by-32 division, and its register allocation is simply awful. Bah. I just had an itch to scratch to see if I could fix the one pessimization that seemed easy to attack, and I did. :-)
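For readers wondering why this kind of multiplication can be done cheaply at all, here is a minimal sketch. It is my illustration of the arithmetic, not the patch itself, and `mul_u32_u64` is a made-up name. When one operand is a zero-extended 32-bit value, its upper half is known to be zero, so the 64-bit product (modulo 2^64) needs only two 32-by-32 multiplies rather than the three a general 64-by-64 multiply takes on a 32-bit machine. The sketch uses unsigned types throughout so that wraparound is well defined.

```c
#include <stdint.h>

/* Multiply a 32-bit unsigned value by a 64-bit value, modulo 2^64,
   using two 32x32 multiplies (hypothetical helper for illustration). */
uint64_t mul_u32_u64(uint32_t u, uint64_t v)
{
    uint32_t v_lo = (uint32_t)v;          /* low 32 bits of v  */
    uint32_t v_hi = (uint32_t)(v >> 32);  /* high 32 bits of v */

    /* Full 32x32 -> 64 product of u with the low half. */
    uint64_t low = (uint64_t)u * v_lo;

    /* Only the low 32 bits of u * v_hi survive the shift by 32,
       so a 32x32 -> 32 multiply is enough for the high half. */
    uint32_t high = (uint32_t)((uint64_t)u * v_hi);

    return low + ((uint64_t)high << 32);
}
```

The result agrees with `(uint64_t)u * v` for every input, which is what the compiler is free to exploit once it knows the upper half of one operand is zero.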
