Older blog entries for pphaneuf (starting at number 81)


Whew, what a weekend. Parties, poking some people's bellies with chopsticks (you read that right), our dog got sick, etc, etc... My head is spinning.

chromatic: You got that right.

Threads, state machines and I/O

mbp, could you confirm the source for that quote from Ulrich? I tried googling it, to no avail... Thanks!

By the way, I think you are right on about explicit vs implicit sharing. As far as "performance hack" goes, threads fare very poorly, breaking cache lines on a single processor and increasing inter-processor chatter on SMP. I doubt any OS silly enough to do that would also support NUMA, but having a thread migrated to a "far" CPU would probably be ultra-painful.

One of the problems is that we don't have asynchronous I/O on (normal) Unix. We only have synchronous I/O, with non-blocking mode in some cases (support for file-backed file descriptors is conspicuously missing). You can go some distance with that, a bit more if you play clever tricks, but real asynchronous I/O is wanted. Thanks to bcrl, this is coming in the next major Linux release. Yay!

lukeg: Like Ingo Molnar, I think it comes from Win32, which has really poor performance characteristics for starting processes. See his answer to question #6 of this interview on Slashdot. Ideas like using threads for GUI applications that are not CPU intensive are, in my opinion, just stupid, as you and Ousterhout (through async) point out. In such applications, you always want to do something in reaction to an event, even if it is only the passing of time!

MichaelCrawford: a single CPU is a state machine, two CPUs are two state machines; therefore, to use two CPUs, you have to run two state machines concurrently. Basically, the optimal setup is one "execution context" (be they threads or good old processes) per CPU.

Some people might point out Intel's Hyperthreading as a case where multiple threads per CPU might make sense, but this is false. A lot of resources are shared in this hyperthreading feature (execution units and floating-point units are not doubled up), and in particular, there is one cache shared by all the threads on a particular CPU (so you have multiple threads, possibly with very different code and data, battling for cache space, aka "thrashing"). Hyperthreading, IMHO, makes running multiple threads per CPU better than it would be without that feature, but not better than having a single efficient thread per CPU (it's a kind of workaround).

The differences between simply writing a multithreaded program and running multiple independent state machines are numerous. Don't share the entire virtual memory space; use explicit messages (through IPC) to cause state changes. Every time you write to a shared page on an SMP system, the writing processor sends an invalidation for that page to the other CPUs' caches. With implicit sharing (threads), you never know: even if you take great care not to share data structures between threads, you might have a counter variable that you update really often sitting in the same page as a data structure used by another thread (on another CPU), causing no end of inter-processor traffic and killing the cache. With explicit sharing (processes with shared memory, for example), this doesn't happen by accident, only if you really update a page that then has to be invalidated on the other CPU.

So using multiple processes (rather than threads, and still only one per CPU) with message passing (which can be optimized using shared memory, if pipes or Unix domain sockets are too slow for your taste) is best.


Ah, so many things I'd like to do to and with XPLC. Talking about the I/O, state machine and multithreaded stuff kept reminding me. I'm still working on it, but more occasionally (because I'm on a project with tight deadlines at work), but I'll be back. The good thing with projects that have tight deadlines is that you (roughly) know when you'll be done. ;-)

And I'll be back with a vengeance! Seriously, I've got some pretty cool stuff going with XPLC, I just have to do the grunt work of coding, since I have been spending a lot of time designing and thinking about how to do things in little pieces of spare time, like when I ride the subway.

XPLC is so neat, there just won't be any excuse not to use it!


I have been slacking off a bit on XPLC, being too busy, but I'm also full of ideas for components and applications that I really can't wait to hack on. I still have to finish at least the support for component categories first, this is key to the expandability I want. I also have to change the way that the module loader works, to enable unloading modules.

Plucker and other Palm things

I wish it was much simpler to sync my Pluckered daily web stuff. I didn't try their desktop app yet, but anyway, the problem is with syncing itself. The installer conduit in ColdSync won't overwrite a database, so just running the crawler/distiller and dumping the output in ~/.palm/install won't do.

I think I'll try to resurrect my Plucker ColdSync conduit, but I'm feeling more and more like the latter is suckier than I thought. For example, it doesn't set the groups properly when it switches identity to that of the user (according to id, it changes uid and keeps the groups root has, but strangely, I can't create a file in my home directory).

Could other people tell me of their Palm syncing patterns, especially Plucker users? I just want to hit my cradle's sync button and be done, even if no one is logged in, is that too much?


This whole holiday season is half blessing, half curse. That's all I have to say on this.

I have good thoughts for David McCuscker. Take care!


I have never handled an FM3A, but having learned with a Canon FTb, I know a good, solid manual body is hard to beat, even though I love my Canon Elan 7e.

An old TLR is a lot of fun too, like a Yashica 124G (or something like that), for example. Drop in a roll of XP-2 (or Tri-X, if you develop yourself) and have some handheld fun!


Whew, got everything working on Solaris, got shared object linking working on Mac OS X, I'm in the process of writing a dyld version of the XPLC dynamic loader, but it currently works without dynamic loading just fine...


I found the reason why I had a weird crashing bug in the XPLC test suite on some versions of Linux. Seems that an older glibc crashes upon dlopen()ing an ELF executable. Updated the FAQ with the most recent information. I suppose I'll just tell people to update their glibc...

Weak symbols seem to be not quite what I thought they'd be. It seems that identical redundant symbols from different compilation units just get concatenated? I might just as well use "static" variables (I was trying to save space)!

Also managed to get my makefiles auto-dependencies working both with gcc2 and gcc3. I think I'll get around to doing a new release tomorrow.

jbucata: Funny that you pointed me to the "note on distributed computing", did you notice that I had it linked in my Advogato article? :-)

All of this is related. XPLC lacks object remoting, so making remote objects available locally in a transparent way is not possible, because XPLC lacks the indirection needed. So the problem of remote objects is left unsolved, but in exchange, I didn't add too many levels of indirection, so performance is still good.

Exceptions would require an indirection, as would "supporting" threads. XPLC supports threads, in the sense that it is going to be threadsafe (although it isn't at the moment, it is accounted for in the design and will be added later), but it will not cover for thread-unsafe components like COM does (with its apartments and proxying), because that would require an indirection and lots of complexity.

You liked the fact that I let people choose, but unfortunately, to do these things transparently, you need hooks, which would mean more indirection. I don't think you could add exceptions, because really, exceptions are just a more transparent version of return values (in fact, exceptions in COM/DCOM are carried in HRESULT return values). So you can't really add exceptions as an optional package.

My plan for remoting is to use messages rather than direct method invocations. When you use a direct method call in XPLC, you can rest assured that the method call overhead is O(1). Remote objects will have to be accessed through a separate message queue mechanism, which is how the abstraction avoids leaking: when you see a message being queued, even if it takes a full second to reach its destination, the direct method call promise was not broken.

Of course, since the promise a message queue makes is looser than the one for direct method calls, nothing prevents me from having very fast message dispatches in the same address space. It's just that I drew a line between methods and messages.

I suppose you could support thread-unsafe components by putting them in their own thread and using them through messages, but it won't be transparent (you have to use messages rather than direct calls, but this is good, because the single-threaded component might be busy, which would delay the call, so it doesn't fit the O(1) promise of direct method calls).

So basically, I'm "taking the difference seriously" (section 7 of the note). About that note, I'd mention that I disagree with its last section, "middle ground", where it says that objects in another process on the same host could be acceptable. I think this is just as bad (the other process could segfault, the equivalent of a remote machine going down) without many of the advantages. I say they should be accessed through messages too, and then you can pick between in-process, in-process but in another thread, in another process, or on another host completely, without restriction.

jbucata: Interesting parallel that you pointed out.

I'm actually on the side of the people who think that X Window isn't doing so badly, except maybe that there should have been (or should be!) more effort put into optimizing the (now common) case of local connections. You have the right parallel, but you are positioning XPLC incorrectly.

Say we complete the picture a bit more by adding the Windows GDI and Fresco (formerly known as Berlin). In the former, everything is local and no remoting is possible; in the latter, everything is accessed through CORBA. You could say that things like Apache modules are like the Windows GDI, all specific, hardcoded and hardly remotable, and Fresco is like, well, CORBA. The thing is, both are right about what features are needed, but the compromise (or lack thereof) that they made is what is actually questionable.

Enter the X Window System. Instead of using a generic and transparent remoting system like Sun RPC, they designed precise and rather simple interfaces, embodied as a (relatively) efficient protocol. People may find this debatable, but between re-implementing Win32 or an X server from scratch, I'd rather take the X server (case in point, WINE is still struggling after all these years, while we have seen X servers come and go many times in the same period).

Note how C doesn't allow for remote procedure calls, but Xlib manages to have them anyway in an efficient way, without too awful an API (compare with COM/DCOM's HRESULTs, for example). CORBA makes for better APIs, but at the cost of layers upon layers of management code, which adds up to an awful lot of overhead. XPLC is a bit like C and C++: it has components, doesn't have transparent remoting, but remoting can be done too, perhaps even in a nicer, more robust and more efficient way. It just won't be transparent, maybe only translucent, but that should be good enough. XPLC itself won't have anything in it related to remoting, but I intend to have a framework to support remoting available separately (a bit like DCOM is built over COM).

Also, CORBA may be nice, but it deals mainly with objects, and rather heavyweight objects at that. While component systems in the past have managed to get good results without making most objects components, the ideal component world would have everything be a user-replaceable component (I understand that this is rather extreme, but bear with me, this is just something we should aim for). The fact is, you could take many normal C++ applications and turn 99% of the classes into XPLC components without too significant an overhead. You probably don't want to do that in such an indiscriminate manner, but you wouldn't die from it. On the other hand, doing the same thing with CORBA, you'd probably die of hunger waiting for your application to do anything at all.

The whole point is that being easy to use, having low speed and memory overhead, and being language- and platform-neutral should leave no excuse for applications not to be and/or use components.

24 Nov 2002 (updated 24 Nov 2002 at 08:59 UTC) »

Whew, I'm on a roll. I have been shooting down design problems faster than I can hack them into code. I've also decided to get serious about a talk at OLS 2003 on pervasive component-driven software (where the heck is it?) and XPLC, of course.

I have been thinking about the "why" a lot lately, and I just feel that there are only a few key features that add most of the overhead seen in current component systems, either in ease of use or in actual performance, and that makes people give up on them or not think about them at all. Distributed objects have hardly anything to do with components (heck, objects themselves could be done without, as PILS and others show), but it's a sexy feature, so there it goes, dragging everything else to hell with it, even if nobody really needs it.

Come on, think about what widely deployed distributed applications using "transparent" RPC and its ilk exist. There's NFS, NIS and SMB, all great favorites, right?

So off developers go, doing without components if they can, and rolling their own specific solutions if they can't (think of Apache modules (particularly in 2.0), Netscape plugins and all the others). Guys, if you're willing to compromise a bit on running your objects halfway across the planet, you could use components for almost everything...


Went and hit the SourceForge compile farm. XPLC compiles and runs fine on sparc64-unknown-linux, i686-pc-linux, i386-unknown-freebsd4.7, alphaev67-unknown-linux, armv4l-unknown-linux, with at most some minor glitches.

Red Hat is annoying me a bit, because a workaround for a bug in gcc is now unneeded, and I'll have to write an autoconf test for that. No biggie, but it won't compile because of that.

Mac OS X and Solaris are giving me more trouble. Both seem to have different ways of handling sonames (maybe they don't support them at all, but I think I saw stuff related to that in man pages), and Mac OS X annoyingly doesn't support __attribute__((weak)), and I don't see any way I could test for that or work around it. I might have to lose that weak reference trick, but that would be a shame, it is really nice when it works.


I wrote this as I was waiting for my previous two years' taxes to get done. Can you believe this? I forgot to do my taxes! In my defense, this kinda happened through my previous employer's HR department's massive lossage.

What happened was that my base salary was X, but with things like standby pay and overtime, my actual taxable income would reliably be something like 80% higher. That sounded nice, until I found out that the amount they kept aside for taxes was based on X rather than X * 1.8! Also, when I asked them to withhold money for RRSPs (Canada's equivalent of the 401(k), I think), which would save me taxes, they told me it was taken care of without actually doing it. ARGH!

But it doesn't make me less of an irresponsible person.

Tribes 2 (gaming)

Boy, has it been a long time since I played that game! A few days ago, I found out that there was a new patch for the game, so I updated. Wow! Not only did they fix the crashing bug in the community browser, but they added the one feature I always thought was missing: audio hit confirmation (now, like Quake 3, it goes "plink" when you hit someone)! So it is now the perfect team game. :-)

If that wasn't enough, I found out that NecroBones, my favorite server, is back up! Well, that made my day. I gotta find a bit of time to go there and play a bit with the old buddies...


XPLC is getting growth hormones it seems. It's getting better and better, I'll have to make a new release soon!

