3 Apr 2006 rmathew   » (Master)

ECJ for GCJ: Still in limbo
It has been almost a month since Tom formally proposed integrating ECJ in GCJ to the GCC Steering Committee (SC). There has been no word from the SC yet on this request. However, the SC did ask the GCC developers to avoid gratuitously including source code from external projects in GCC. One consequence of this for GCJ was the removal of fastjar from the GCC source tree. I'm not sure if the SC's decision was coincidental or in fact a result of deliberations triggered by Tom's request.

More Front Ends in GCC
One of the great advantages of structuring a compiler such that the front-end, the middle-end and the back-end are relatively independent is that if you write M front-ends and have N back-ends, you get M*N compilers "for free" assuming you have a good enough intermediate representation in the middle-end. This idea was discussed as far back as the 1950s and UNCOL was an ambitious effort towards this goal. GCC is a stellar example of such a compiler - it supports C, C++, Java, Ada, etc. "out-of-the-box" and can target a whole bunch of platforms. You implement a language front-end for GCC and you immediately have a compiler for that language for a whole lot of platforms; you implement a target back-end for GCC and you immediately have compilers for several languages for that platform. Of course, this is grossly oversimplified, since you have to usually port the language runtime to a platform too or since your language might strain the GCC intermediate representation or expose latent bugs in the middle-end making the effort rather difficult. But the overall idea still remains valid.

The GNU Pascal Compiler (GPC) guys recently proposed an integration of GPC with GCC (in the same source repository, but on a different branch - weird). Some day, the GCC Scheme Compiler (GSC) guys, the PL/I for GCC guys, etc. might also want to integrate their front-ends with GCC. Having more front-ends in the GCC source tree itself means that middle-end changes do not inadvertently break these front-ends, latent middle-end bugs and unwarranted assumptions are exposed, general GCC enhancements are automatically applied, etc. So it's a good thing for GCC, in a way.

However, I personally think it is not a good idea. The GCC mainline is already quite bloated with a number of languages and runtimes and building all of the languages and their runtime libraries (thank you Sun for regularly increasing the bloat in the "standard" Java runtime with every release of the JDK) takes quite a while even on a decent system. Having more languages and their runtimes within GCC will only exacerbate this issue. I personally also feel (though I have no real practical experience in this area) that it does not let the optimisers make assumptions that they can use to perform stronger optimisations. A recurring problem in this area is the folding of constants, where languages like Java specify a bit too much with respect to what can be folded and how it should be folded.

On a slightly different note, the GSC guys have also created a "Hello World" front-end for GCC that shows you how to build a front-end for GCC for your favourite language.

On an entirely different note, I have ended up writing 3,000 lines of text in the user manual of a 4,000 line programme (both rough "wc -l" figures)! Either the manual is unnecessarily verbose or the programme is too complex.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!