Older blog entries for bonzini (starting at number 11)

To complete my previous findings on Libtool: the libtool people did provide me with a solution that keeps the C++/Java tests out of configure (it will be in libtool 1.6, but I backported it), and I found a way to avoid running the C tests twice (I contributed it to libtool, and it will also be in 1.6).

Also, I just updated to the latest Automake 1.8 beta, and the new implementation of aclocal saved over 300k on my tarball, so not everything's wrong in the Autotools after all... :-) (aclocal now pulls in m4 files that live in the distribution directory with m4_include, instead of copying them literally into aclocal.m4).

Yesterday I decided that it would be cool to compile GNU Smalltalk as a shared library. With the support for ELF visibilities that is in gcc 3.3, that should be possible without much performance degradation (note that the installed VM is still linked statically, though). Then I also decided that I should compile the shared library with -fomit-frame-pointer, because I do need the register that is lost to the GOT pointer when using position-independent code.

The only sane way that came to mind to supply that flag only for the PIC case, and then only for a particular library, was to use libtool 1.5's tags: they were created to support multiple languages, but the presence of the disable-shared tag suggested that they could be put to such use. To summarize, there is absolutely no documentation on how to use tags, let alone on defining new ones (I wanted my tag to be based on the standard tag for C, of course, without duplicating all the code in libtool.m4). I lost a whole afternoon trying to do this, and now that I have finally succeeded, what I came up with is a bunch of awful-looking m4 hacks (which luckily can be encapsulated in a separate macro) and a 770kb configure that basically runs exactly the same tests twice!

Ah, and I forgot to say that it used to top a megabyte until I found out that the configure file included the stuff for C++, Java, and Fortran 77 without ever executing it. :-)

Did a nice thing recently... I rewrote GNU Smalltalk's bytecode interpreter; the new one has a new bytecode set which supports 192 superbytecodes, so it is 10-50% faster (50% on the P4, finally found a use for that trace cache!!!). I wrote some cute proggies to achieve this: a virtual machine generator, and a superoperator search program that embeds gperf. Sooner or later you'll find it on alpha.gnu.org, or even later as part of GNU Smalltalk 2.2. Did I mention that GNU Smalltalk now has GTK+ bindings too? :-)
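To give a rough idea of what a superoperator search looks for, here is a minimal Python sketch. It is not the actual search program (which embeds gperf and is not shown in this entry): it only counts adjacent bytecode pairs across a few methods and proposes the most frequent ones as candidates for fusing into a single superbytecode. The bytecode names and the `candidate_superoperators` helper are made up for the illustration.

```python
from collections import Counter

def candidate_superoperators(methods, limit):
    """Return the `limit` most frequent adjacent bytecode pairs.

    `methods` is a list of bytecode sequences. The bytecode names are
    hypothetical and only serve the illustration.
    """
    pairs = Counter()
    for code in methods:
        # zip(code, code[1:]) walks every adjacent pair in one method.
        pairs.update(zip(code, code[1:]))
    return [pair for pair, _count in pairs.most_common(limit)]

methods = [
    ["push_self", "push_temp", "send", "pop"],
    ["push_self", "push_temp", "send", "return"],
    ["push_temp", "send", "return"],
]
best = candidate_superoperators(methods, 2)
```

A real search would presumably also weigh how often each pair executes, not just how often it appears in the code, and would consider sequences longer than two.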

I just finished a patch for GCC's libffi and I am waiting for comments.

I am also going to add some expected failures to sed and release 4.0.8 with them, so that the glibc people will hopefully come and fix them. They made the bad choice of including the new regex matcher in glibc 2.3; now they should mend it. The gawk author and I are getting a lot of spurious bug reports because of that.

Upload time. I've been quite busy and delayed some uploading (of GNU Smalltalk 2.1.3 and my netcat's second alpha).

This release of GNU Smalltalk has quite a lot of bugfixes; it might even be the last 2.1.x release. The future (2.2) is quite closely related to my Master's thesis, which I just started. Basically I am going to do three things.

The first is to clean it up so that one can write down a VM specification like Java's. I already did some of this: for example, the VM specification no longer depends on having ContextParts, which many Smalltalks do not have. I also found a couple of places where things could be made much cleaner with half an hour's effort. This can only help.

The second is to add security. That's a bigger deal. So far, I have added a mechanism to give `trustedness' attributes to contexts (activation records) so as to make the thing fast, and I wrote a nice little bytecode verifier. I also wrote a small Flex/Bison-based program that automates the decoding of bytecodes (the verifier had two places where I was decoding bytecodes, and Smalltalk had four more, and I was starting to hate this code duplication), and I like it very much.

The third is to implement Java on top of Smalltalk. I think the results of this will be quite interesting: I expect a big big big slowdown in FP math (Floats are boxed in Smalltalk and primitive in Java), but improvements in code that uses OOP and interfaces a lot. Besides, my uni is doing a lot of research in code mobility and stuff like this, where the increased reflection capabilities of Smalltalk can only help.

Last but not least, I noticed that fortune does not produce random fortunes at all. It tends to output the first fortune in the file quite often. So what better way to learn some Perl than to rewrite it? Here is my effort; I'll be very happy to get some comments on it.
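The Perl rewrite itself is not reproduced here. As a minimal sketch of the behaviour one would want, the Python below assumes the usual fortune file layout (entries separated by lines containing only `%`) and picks every entry with equal probability. The function names are hypothetical.

```python
import random

def parse_fortunes(text):
    """Split the text of a fortune file on lines containing only '%'."""
    fortunes, current = [], []
    for line in text.splitlines():
        if line.strip() == "%":
            if current:
                fortunes.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        # The last entry may not be followed by a '%' line.
        fortunes.append("\n".join(current))
    return fortunes

def pick_fortune(text, rng=random):
    """Pick one fortune uniformly; `rng` can be a seeded random.Random."""
    return rng.choice(parse_fortunes(text))

sample = "first\n%\nsecond line 1\nsecond line 2\n%\nthird\n"
fortunes = parse_fortunes(sample)
chosen = pick_fortune(sample, random.Random(2003))
```

Parsing the whole file first keeps the choice uniform over entries, at the cost of reading everything into memory.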

1 Jun 2003 (updated 28 Jun 2003 at 14:51 UTC) »

Yesterday I took up the cleanup of netcat again and finished it (at least for now). The result is available here. Among other things, I added IPv6 support and switched from select(2) to poll(2). Note that there is another renewed version of Netcat, which is included in OpenBSD. The two are not related; my version keeps most of the original source code (albeit with renamed variables/functions and updated/edited comments -- actually more in spirit than otherwise), while OpenBSD's is said to be a complete rewrite. OpenBSD's version lacks some features (hexdump, line-by-line, multiplexing) but has some more (AF_UNIX sockets and SOCKS support). I hope to get feedback on this.
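For readers who have not used poll(2): instead of filling fd_set bitmaps as with select(2), the caller registers each descriptor together with the events it cares about and gets back only the descriptors that are ready. The sketch below shows that shape with Python's `select.poll` wrapper (available on Unix-like systems only). It is not taken from the netcat sources, and `wait_readable` is a made-up helper name.

```python
import select
import socket

def wait_readable(sockets, timeout_ms):
    """Return the sockets that become readable within timeout_ms."""
    poller = select.poll()
    by_fd = {}
    for sock in sockets:
        # Register interest in "data available to read" for each socket.
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    ready = []
    for fd, events in poller.poll(timeout_ms):
        if events & select.POLLIN:
            ready.append(by_fd[fd])
    return ready

# A connected pair of local sockets stands in for a network connection.
left, right = socket.socketpair()
left.sendall(b"hello")
ready = wait_readable([right], 1000)
data = ready[0].recv(16) if ready else b""
left.close()
right.close()
```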

I made some more patches to Autoconf; all the preliminary plumbing is in place, and I'll push more after 2.58 is released.

Done a coupla cool university projects for the Operating Systems course: a kind of UI for the Unix shell written as a Bourne shell script (cool, with dialog -- and made me discover the fabulous shell syntax "${arr[@]}" :->), and a little thingy doing message passing between parent and child processes (I did it with message queues: jeez, what a braindead API, I have 120 lines out of 350 only for the wrappers around them! another mate did it with AF_UNIX sockets which is indeed a lot easier to do and more readable).
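To show why the AF_UNIX route needs so little wrapping, here is a minimal Python sketch of parent/child message passing over a socket pair. It is not the course project's code. The `ask_child` name and the "upper-case the request" protocol are invented for the example, and it relies on `os.fork`, so it runs on Unix-like systems only.

```python
import os
import socket

def ask_child(message):
    """Fork a child, send it `message`, and return its upper-cased reply."""
    parent_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        # Child process: serve exactly one request, then exit immediately.
        status = 1
        try:
            parent_end.close()
            request = child_end.recv(1024)
            child_end.sendall(request.upper())
            status = 0
        finally:
            os._exit(status)
    # Parent process: keep only its own end of the pair.
    child_end.close()
    parent_end.sendall(message)
    # One recv suffices for this tiny message; a real protocol would
    # frame its messages explicitly.
    reply = parent_end.recv(1024)
    parent_end.close()
    os.waitpid(pid, 0)
    return reply

reply = ask_child(b"ping")
```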

Long time no post... :-)

I've done some more ports of libsigsegv lately (including Cygwin, which was quite cool as you have to deal with the innards of Win32 exception handling...), and GNU Smalltalk is about to reach 2.1.2 (I do patch releases often when big changes are in their infancy, because then I get much more exposure), which I'll release as soon as I find time to test the MacOS X port of libsigsegv once more. Bruno Haible in the meantime made a lot of Linux ports (m68k, HPPA, completing my IA64 work, and so on).

The netcat project is not going to leave my hard disk soon, but I did remove the gotos and the source code is in a better state. The next step would be to do IPv6; I'll probably do that around the time I add IPv6 to GNU Smalltalk (that's when I'll fork for 2.2).

I did some real kewl hacking on Autoconf: I started adding usage of shell functions to it!!! I have 1.5 megs of configure scripts in GNU Smalltalk, so I am quite interested in this, and I already got up to a 25% reduction in the size of configure scripts just by using functions for seven AC_CHECK_ macros (I still have to do some timing). The Autoconf maintainers (Akim Demaille and Paul Eggert) seem to endorse my work and to consider my approach a sane one, so I am quite optimistic, even though I'll probably have to learn more m4 to have it accepted.

If accepted, this work might make other changes to Autoconf possible, with ramifications up to an eventual Autoconf 3 release; it would be a major contribution to free software and I am quite proud of it. Initial results can be found in the Autoconf mailing list archives for May 2003. I already sent a good deal of patches that put everything in good shape for adding shell functions and testing the new code.

Jeez, I am going to release GNU Smalltalk 2.1 Real Soon Now!!! I enjoyed HP's TestDrive program: it gives away lots of accounts on Alphas and IA64s, and I used that to make it 64-bit clean and fix some lossages due to bad OS libraries (did you know FreeBSD's inttypes.h lacks intmax_t?)... very good.

I also started a pet project. I am going to take netcat 1.10, autoconfiscate it (which I have already done), and make the code more readable (which I just started to do). How awful that code looks! It will surely look too serious and less H4X0R-ish when I finish, but I hope it will be better as a programming example since it shows a good deal of socket tricks -- besides, I think the author overdid it a bit in sprinkling the f*** and s*** words through the remarks...

In the meantime, my Darwin code has been reviewed & installed into libsigsegv.

I spent Monday afternoon porting GNU Smalltalk to Darwin -- the biggest job being the porting of the libsigsegv library which is used for the generational GC. Gee, how hideous an environment!

It is as far from the standards as they could make it. Everybody has stack_t, they have struct sigaltstack. Everybody includes <signal.h>, they want you to include <sys/signal.h> as well. Some constants are named differently, and so many things are only available at the Mach level -- luckily I found the info on how to decode PPC instructions in the Boehm garbage collector, and some more source code in XEmacs. The port is a 20kb patch, while other OSes are 2kb at most. Bah. I am going to submit it to Bruno Haible for inclusion in the master libsigsegv sources.

28 Mar 2003 (updated 28 Mar 2003 at 10:43 UTC) »

Time to write something new...

I have done some work on sed lately; there were some bugs caused by bad interactions between s///NUM and a possibly empty LHS. sed has a lot of corner cases like this where it is supposed to Just Work: working around a strange behavior then requires one to work around the correct behavior where the strange behavior was right, and so on. You easily end up with 200 lines of ifs!

Besides, sed is now multibyte-clean and I'm ready to release 4.0.7 (the final 4.0.x release) and 4.0a (a first alpha for 4.1). I don't have much time to do them, and I have higher priorities such as fixing up that Assembly memcmp...

Today I'll spin a prerelease for GNU Smalltalk 2.1! I hope to release around Easter. I really like this release. I took time to insert many (optional) sanity checks, so notwithstanding some big architectural changes, it turns out to be really stable, and the changes actually fix bugs in the old code: especially the overhauled Processes, which were meant to allow debugging, but also fixed some nasty SIGSEGVs in the browser. The browser grew into something fast, stable and usable enough for production work... even though I am mostly a vi guy (elvis, not vim!) I do use it sometimes.

I'm also very pleased with the garbage collection; it works like a charm -- yesterday I suspected a GC bug and was ready to spend an hour staring at the thousands of lines that the GC outputs in debugging mode, but it turned out that the bug was actually in years-old code and not in the GC, which is only three months old! That's the code that computed stack heights to find out how big to make heap-allocated activation records, and this code turned out to be very bad (I must say that I wrote it about 3 weeks after I picked up C...), so I made a general rewrite which includes more sanity checks yet is even faster.
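The stack-height computation can be pictured with a toy example. The Python sketch below is not GNU Smalltalk's code: it assumes a hypothetical table of stack effects and straight-line code only, simulates the stack depth, and reports the deepest point, which is what bounds the size of an activation record. The underflow check plays the role of one of the sanity checks mentioned above.

```python
# Hypothetical stack effects: (values popped, values pushed).
STACK_EFFECT = {
    "push": (0, 1),
    "dup": (1, 2),
    "add": (2, 1),
    "pop": (1, 0),
    "return": (1, 0),
}

def max_stack_depth(code):
    """Return the deepest stack reached by straight-line `code`."""
    depth = deepest = 0
    for op in code:
        popped, pushed = STACK_EFFECT[op]
        if depth < popped:
            # Sanity check: the code pops more values than it has pushed.
            raise ValueError(f"stack underflow at {op!r}")
        depth += pushed - popped
        deepest = max(deepest, depth)
    return deepest

depth = max_stack_depth(["push", "dup", "push", "add", "add", "return"])
```

Code with branches would need the same bookkeeping along every control-flow path, which this sketch does not attempt.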

Ah, BTW, the XML idea that I wrote about in my first diary entry grew into a nice non-validating parser written using big big big regular expressions (like the RFC821 parser in Mastering Regular Expressions), which is quite fast and very object oriented (given that it is PHP). I wrote some very nice SAX handlers, including the ability to write XML and HTML, and to save and replicate SAX events, which is the basis for templates. Maybe some day I'll finish it; in the meantime I've tried out the parser and its companion classes with a colleague at work, and he liked it a lot.
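As an illustration of the approach (in Python rather than the original PHP, and far smaller than the real thing), the sketch below tokenizes markup with one regular expression and turns each token into a SAX-style event. Collecting the events in a list is the "save" half of save-and-replicate: the list can later be fed to any handler. The patterns and event tuples are assumptions made for this example, and comments, CDATA sections and entities are not handled.

```python
import re

# One alternation per token kind: a tag (opening, closing or empty) or text.
TOKEN = re.compile(
    r"<(?P<close>/)?(?P<name>[A-Za-z_][\w.-]*)(?P<attrs>[^<>]*?)(?P<empty>/)?>"
    r"|(?P<text>[^<]+)"
)
ATTR = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')

def sax_events(xml):
    """Yield ("start", name, attrs), ("end", name) and ("characters", text)."""
    for m in TOKEN.finditer(xml):
        if m.group("text") is not None:
            if m.group("text").strip():
                # Whitespace-only text between tags is skipped.
                yield ("characters", m.group("text"))
        elif m.group("close"):
            yield ("end", m.group("name"))
        else:
            yield ("start", m.group("name"), dict(ATTR.findall(m.group("attrs"))))
            if m.group("empty"):
                # An empty element such as <b/> opens and closes at once.
                yield ("end", m.group("name"))

# Saving the events makes it possible to replay them later.
events = list(sax_events('<a href="x">hi <b/></a>'))
```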

I'm going to quit this job in a week (after we finish a big big big project on Monday), then I'll start my thesis. I have not heard anything from the professor, but I hate to pick up the phone as I don't want to be unkind... It should be about adaptively optimizing Smalltalk bytecode; it sounds like a big job, as I have to redo some parts of the GNU Smalltalk JIT and add polymorphic inline caches, and then write a lot of Smalltalk code for the optimizers, but it can be done and looks exciting (especially when compared to web programming...).

That's it for today.

Done some work on GNU Smalltalk. The maintainer for the Debian package has ported it to MinGW and I gave him a hand; this was also a good occasion for restructuring the OS dependent stuff and providing neat encapsulations for anonymous mmaps.

I also got several patches for the graphical development environment (aka class browser), and put the FTP clients back in ready-for-release shape.
