Older blog entries for pphaneuf (starting at number 63)

Jeff Darcy has replied on his weblog to an email I had sent to David McCusker, which the latter had put up on his weblog. Whew! Here's a continuation of this discussion.

The performance of smaller/simpler instructions is usually worse than the equivalent CISCy code (and I'm thinking of a much bigger granularity than CISC) because the instruction decoding overhead becomes a bigger percentage of the equation, everything else being equal.

Jeff asks why we should worry about the performance of interpreting code, when incremental compilation of those virtual instructions is the right thing to do when performance is what you want. He makes a good point about the level of abstraction and expressiveness being the important things. I would point out, though, that the in-memory representation of a Perl program plus the native code implementing its virtual instructions is often smaller (and more cache friendly) than the fully spelled out native version, and that the implementations of those virtual instructions are probably better optimized than what an incremental compiler can produce. In fact, the Perl interpreter puts all of the virtual instruction implementations in a single compilation unit (and their order inside that unit has been tweaked too), so that they sit in a contiguous area of memory. All of this adds up to code that can be almost as fast as native, optimized C++.
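To make the granularity argument concrete, here is a toy sketch in Python. It is not how Perl's interpreter is written; the opcode names and the dispatch counter are made up for illustration. It sums a list twice, once with small stack-machine instructions and once with a single coarse SUM instruction, and counts how many times the decode/dispatch loop runs in each case.

```python
# Toy stack machine; the opcode names are hypothetical, chosen for illustration.
def run(program, data):
    """Execute program against data; return (result, number_of_dispatches)."""
    stack = []
    pc = 0
    dispatches = 0
    while pc < len(program):
        op, arg = program[pc]
        dispatches += 1          # one decode/dispatch per virtual instruction
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":       # push data[arg]
            stack.append(data[arg])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "SUM":        # coarse instruction: the loop runs inside the implementation
            stack.append(sum(data))
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop(), dispatches


def fine_grained_sum(n):
    """Unrolled program: PUSH 0, then LOAD i / ADD for each element."""
    program = [("PUSH", 0)]
    for i in range(n):
        program.append(("LOAD", i))
        program.append(("ADD", None))
    return program


COARSE_SUM = [("SUM", None)]

data = list(range(1000))
fine_result, fine_dispatches = run(fine_grained_sum(len(data)), data)
coarse_result, coarse_dispatches = run(COARSE_SUM, data)
print(fine_result, fine_dispatches)      # 499500 2001
print(coarse_result, coarse_dispatches)  # 499500 1
```

Both programs compute the same thing, but the fine-grained one pays the dispatch cost 2001 times against once for the coarse one, which is the overhead I'm talking about.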

Programs in this compact form can also be loaded from disk more quickly (though Perl itself doesn't take much advantage of this, as it compiles the code every single time).

For reference, the test program in "The Practice of Programming" (Kernighan/Pike) is a Markov chain generator. The timings and line counts were as follows (the first timing is Pentium II 400/Windows NT/VC++ and the second is MIPS R10000 250/Irix/can't remember the compiler):

  • Plain C: 0.30/0.36 seconds, 150 lines of code
  • Perl: 1.0/1.8 seconds, 18 lines of code
  • C++/STL: 1.5/1.7 seconds, 70 lines of code
  • AWK: 2.1/2.2 seconds, 20 lines of code
  • Java: 9.2/4.2 seconds, 105 lines of code

Perl is even faster than C++/STL on the Pentium II, and nearly tied with it on the MIPS! I wonder how good a job a JIT compiler would do with this. Then again, it is possible that the Java version did use a JIT compiler, and if so, it didn't do it much good!
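For those who haven't seen the test program, here is a minimal Python sketch of a Markov chain text generator of that kind. It is not the book's code: the two-word prefix and the sentinel word are how I remember the book's approach, and the sample text and function names are my own.

```python
import random
from collections import defaultdict

NPREF = 2          # prefix length (assumed here; two words, as I recall the book)
NONWORD = "\n"     # sentinel that cannot appear as a real word


def build(words, npref=NPREF):
    """Map each prefix (tuple of npref words) to the list of words that follow it."""
    table = defaultdict(list)
    prefix = (NONWORD,) * npref
    for word in words:
        table[prefix].append(word)
        prefix = prefix[1:] + (word,)
    table[prefix].append(NONWORD)    # mark the end of the input
    return table


def generate(table, max_words, rng, npref=NPREF):
    """Walk the table from the start prefix, choosing each suffix at random."""
    prefix = (NONWORD,) * npref
    out = []
    for _ in range(max_words):
        word = rng.choice(table[prefix])
        if word == NONWORD:
            break
        out.append(word)
        prefix = prefix[1:] + (word,)
    return out


text = "the quick brown fox jumps over the lazy dog and the quick red fox naps"
table = build(text.split())
sample = generate(table, 20, random.Random(1))
print(" ".join(sample))
```

The whole thing is a hash table of lists plus a random pick, which goes some way toward explaining why the scripting-language versions are so short.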

Thanks again to raph for pointing me at very interesting stuff. This time, I'm talking about this post by Jeff Darcy. I am experimenting with things he talks about, namely supporting both single- and multi-threaded operation, but starting from the single-threaded side, and queueing at the dispatcher's discretion.

I do wonder why he's saying that making timeouts (or any kind of time-based event) queueable events is a bad idea. One reason I like it is that it makes debugging and testing easier (one can record a stream of events and play them back quickly).
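A minimal sketch of what I mean, in Python. The EventLoop class, its method names and the event tuples are all hypothetical, not taken from Jeff's post or from any existing library. Expired timers are turned into ordinary events, every event is recorded, and the recorded stream can be replayed without waiting for any real time to pass.

```python
import heapq
import itertools


class EventLoop:
    """Single-threaded dispatcher where timer expiry is just another queued event."""

    def __init__(self):
        self._timers = []                 # heap of (deadline, seq, name)
        self._seq = itertools.count()     # tie-breaker for equal deadlines
        self.handled = []                 # what the application saw, in order
        self.log = []                     # recorded event stream

    def add_timer(self, deadline, name):
        heapq.heappush(self._timers, (deadline, next(self._seq), name))

    def post(self, event):
        """Record an event and hand it to the application."""
        self.log.append(event)
        self.handled.append(event)

    def run_until(self, now):
        """Turn every timer due by `now` into an ordinary event."""
        while self._timers and self._timers[0][0] <= now:
            deadline, _, name = heapq.heappop(self._timers)
            self.post(("timeout", name, deadline))


def replay(recorded):
    """Feed a recorded stream back through a fresh loop, with no waiting."""
    loop = EventLoop()
    for event in recorded:
        loop.post(event)
    return loop.handled


live = EventLoop()
live.add_timer(5.0, "retry")
live.add_timer(1.0, "keepalive")
live.post(("readable", "socket-7", 0.5))
live.run_until(10.0)
replayed = replay(live.log)
```

Since the application only ever sees events, the replayed run goes through exactly the same sequence as the live one, timeouts included.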

I have to see how Xlib event buffering works out if I avoid calling its functions and just watch ConnectionNumber(dpy) for readability (doing an XFlush first, of course!). I'm afraid that Xlib reads events from the stream as often as possible, so that it can suck up newly arrived events while I'm processing others. Say XPending tells me there are two events, but a third one arrives while I'm processing the first: getting the second might also pull the third off the stream, leaving the stream non-readable, so I would never check for events. I would then have to keep processing X events for as long as XPending (?) is non-zero. This is annoying, because a burst of X events could make my loop starve out the other kinds of events.
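Here is the shape of the loop I have in mind, as a pure simulation in Python. Nothing below calls Xlib: BufferedSource is a hypothetical stand-in whose pending() behaves the way I fear XPending does (reading everything off the wire into a private buffer), and max_batch is an invented knob for capping the batch.

```python
from collections import deque


class BufferedSource:
    """Stand-in for a library that reads ahead into its own buffer.

    Hypothetical class: `wire` plays the role of the socket, `buffer` the
    library's internal queue, and pending() mimics the feared read-ahead.
    """

    def __init__(self):
        self.wire = deque()
        self.buffer = deque()

    def fd_readable(self):
        return bool(self.wire)

    def pending(self):
        # Like the library, slurp everything available off the wire.
        self.buffer.extend(self.wire)
        self.wire.clear()
        return len(self.buffer)

    def next_event(self):
        if not self.buffer:
            self.pending()
        return self.buffer.popleft()


def loop_iteration(source, other_work, handled, max_batch=2):
    """One pass of the main loop; returns True if it is safe to block."""
    for _ in range(max_batch):            # cap the batch so others get a turn
        if source.pending() == 0:
            break
        handled.append(source.next_event())
    if other_work:
        handled.append(other_work.popleft())
    # Only block on the fd when the library's buffer is empty too;
    # checking fd readability alone would miss already-buffered events.
    return source.pending() == 0 and not other_work


src = BufferedSource()
src.wire.extend(["x1", "x2", "x3"])
others = deque(["net1"])
handled = []
first_can_block = loop_iteration(src, others, handled)
readable_after_first = src.fd_readable()
second_can_block = loop_iteration(src, others, handled)
```

After the first pass the descriptor is no longer readable even though "x3" is still sitting in the buffer, which is exactly the trap. Capping the batch lets "net1" get its turn before "x3" is handled.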

Maybe I need some of that XCB stuff, but on the other hand, my problem also involves Qt, so it might be too much to ask.

If anyone has any idea of what I'm talking about and would have a hint, go ahead and mail me!

Ottawa Linux Symposium

Things have been going well at the OLS. I went and saw that PILS talk, but I still can't figure out how to get more information about it on the net. I'll ask Alan Robertson tomorrow. It doesn't seem to do much and is rather limited, and not because it is trying to reach some anti-bloat goal or anything, just because it doesn't have much. You could probably take XPCOM and do everything PILS does now, almost everything in that project's future directions, and even more (PILS doesn't support loading modules from multiple directories, for example). XPLC will soon be on par with it too.

Hugh Daniels always gives interesting talks, or at least, is an interesting speaker! :-)

The asynchronous I/O talk was excellent, but I was a bit disappointed by a few things.

Adding a flag to get real non-blocking I/O on disk-based file descriptors was deemed pointless, the argument being that we should use AIO instead. I see things differently. Squid (I think), for example, behaves as if disk I/O did not block, just like network I/O. Having it open files with a flag that makes them really non-blocking would improve performance on platforms supporting such a feature, and the flag would be very easy to skip on other platforms while keeping the current architecture. Some platforms would just go faster if they have the feature, which you could check for with a single autoconf test.

Ben LaHaise's AIO is actually the implementation of the really good interface for event notification that Linus talked about a few years ago. Apparently it even has the callback interface and all, but people don't want to use that callback interface and just want to call the syscalls directly, all by themselves (or close to it). Fools! I talked this over with apenwarr. He thought that it would be okay as long as users of this stuff were well-mannered and followed some conventions (handing over any fd you might want to wait on, like ConnectionNumber does in Xlib), but my point of view is that with callbacks, you can enforce good manners. You register your interest in a file descriptor, and when an event occurs, you get notified, no matter what, even if someone else called the get_event method (or whatever it is named). Whoever is out there using Ben's AIO, use the callback interface, please!
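To show what I mean by enforcing good manners, here is a small sketch in Python using the standard selectors module. It is not Ben's AIO interface; the Dispatcher class and its method names are invented. Each owner registers a callback for its own descriptor, and whoever ends up calling poll(), every ready descriptor's event goes to the code that registered for it.

```python
import selectors
import socket


class Dispatcher:
    """Hypothetical wrapper: owners register a callback per file object."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def register(self, fileobj, callback):
        self._selector.register(fileobj, selectors.EVENT_READ, callback)

    def unregister(self, fileobj):
        self._selector.unregister(fileobj)

    def poll(self, timeout=0):
        """Wait for events and deliver each one to whoever registered for it."""
        ready = self._selector.select(timeout)
        for key, _mask in ready:
            key.data(key.fileobj)      # the registered callback
        return len(ready)

    def close(self):
        self._selector.close()


received = {}


def make_handler(name):
    def on_readable(sock):
        received[name] = sock.recv(16)
    return on_readable


dispatcher = Dispatcher()
a_recv, a_send = socket.socketpair()
b_recv, b_send = socket.socketpair()
dispatcher.register(a_recv, make_handler("a"))
dispatcher.register(b_recv, make_handler("b"))
a_send.sendall(b"for a")
b_send.sendall(b"for b")

for _ in range(10):                    # bounded, in case delivery is delayed
    if len(received) == 2:
        break
    dispatcher.poll(timeout=1)

dispatcher.close()
for sock in (a_recv, a_send, b_recv, b_send):
    sock.close()
```

Compare this with handing raw events back to the caller: there, the code that happened to call get_event would have to know what to do with everyone else's descriptors.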

Also, there is a hole I didn't worry much about at the time, and it is support for timers. Either you need a way to add a timer to the AIO stuff (and get notified/called back when it's ready), or a way to make a timer out of a file descriptor. Just one idea: a /dev/timer that you open, write a timeout into, and it becomes readable after the timeout has expired.
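The /dev/timer idea can be faked in user space to see how it would feel. This is only an emulation sketch in Python: the TimerFd class is hypothetical, and it uses a helper thread and a socket pair where the real thing would live in the kernel.

```python
import select
import socket
import threading
import time


class TimerFd:
    """Hypothetical emulation of a /dev/timer-style descriptor.

    A helper thread writes one byte when the timeout expires, which makes
    the read end show up as readable in select/poll like any other fd.
    """

    def __init__(self, timeout_seconds):
        self._read_end, self._write_end = socket.socketpair()
        self._timer = threading.Timer(timeout_seconds, self._expire)
        self._timer.daemon = True
        self._timer.start()

    def _expire(self):
        try:
            self._write_end.send(b"\0")
        except OSError:
            pass        # closed before the timer fired

    def fileno(self):
        return self._read_end.fileno()

    def close(self):
        self._timer.cancel()
        self._read_end.close()
        self._write_end.close()


timer = TimerFd(0.05)
start = time.monotonic()
readable, _, _ = select.select([timer], [], [], 5.0)
elapsed = time.monotonic() - start
fired = readable == [timer]
timer.close()
```

The event loop doesn't need to know anything about time: the timer is waited on with the same select call as every other descriptor.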

I'm writing this while sitting beside the FreeS/WAN people, and I find it interesting how sometimes, some key people are just missing some part of the picture, and talking with some other people fixes their view of the world. They are obviously slightly confused (ask dcoombs!), but thankfully, some people are clued in. Unfortunately, one (Hugh Daniels, again) is shut up by his stupid government (guess which one).

XPLC

I'm hacking on XPLC here at the OLS, turning the multi-binary test suite into a single binary, and I hit a bug. Go unit testing! I have found a problem where the single instance of the service manager dies off, the pointer that XPLC::getServiceManager returns goes bad, and everything goes to hell.

I could use weak references here. The service manager currently doesn't have anything in it to enforce it being a singleton, and that is good, I like it, but now, I'll have to turn it into an enforced singleton just so that this doesn't die.
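The weak reference approach would look roughly like this. The sketch is in Python with its weakref module rather than XPLC's C++, and the ServiceManager class and get_service_manager function are stand-ins, not the real XPLC code. The accessor only holds a weak reference, so the manager can still die when nobody uses it, but a dead reference is detected instead of being handed out.

```python
import gc
import weakref


class ServiceManager:
    """Stand-in for the real service manager; holds registered services."""

    instances_created = 0

    def __init__(self):
        ServiceManager.instances_created += 1
        self.services = {}


_current = None     # weak reference to the live instance, if any


def get_service_manager():
    """Return the live service manager, creating one if the last one died."""
    global _current
    manager = _current() if _current is not None else None
    if manager is None:
        manager = ServiceManager()
        _current = weakref.ref(manager)
    return manager


first = get_service_manager()
second = get_service_manager()
same_while_alive = first is second

del first, second       # last strong references go away
gc.collect()            # make sure the old instance is really gone
third = get_service_manager()
```

While anyone holds the manager, everybody gets the same instance. Once it has died, the next caller gets a fresh one rather than a dangling pointer, and nothing had to be turned into an enforced singleton.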

RFC: So, should weak references go in or not?

An alternative implementation of the X stream protocol

Well, thanks again to raph for the link to XCB. This is pretty similar to XPLC, in the sense that in order to achieve goals of lightness and performance, they sacrificed some other features that might be some other people's sacred cows, but that (if only those people would just open their eyes) aren't really needed.

raph: Thanks for your last diary entry, it was really interesting and full of good links to other interesting things full of good links.

This is in fact one of the things I want to use XPLC for: building small, high performance servers, mostly event-driven, using message passing (similar to what David McCusker describes here) between processes for scaling on SMP systems. XPLC comes in when I want to make things even smaller and faster by putting multiple servers in the same single-threaded event-driven process, sharing as much stuff as possible. For example, you could build a server with HTTP, SMTP, IMAP and POP3, all sharing common information, including HTTP giving you a webmail interface.

I still can't believe that we can't publish a medium-low traffic interactive website on an inexpensive single-processor box. All you see is dual this and cluster that, with RAM going around in gigabyte slices, and still getting a slow site in the end. Try this little experiment with your web server: open up a thousand connections, but don't do anything with them, just keep a thousand idle connections open to your web server, then access one page (one thousand completely inactive users and a single active user). Watch your server crawl to deliver a single 8K file to the lone active user. Something is wrong.
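If you want to try the experiment, here is a rough Python sketch of it. The function names are mine, and host, port and path are whatever your own server uses. It times one fetch alone, then opens the idle connections and times the same fetch again.

```python
import http.client
import socket
import time


def open_idle_connections(host, port, count, timeout=5.0):
    """Open `count` TCP connections and send nothing on them."""
    connections = []
    for _ in range(count):
        connections.append(socket.create_connection((host, port), timeout=timeout))
    return connections


def timed_fetch(host, port, path="/", timeout=30.0):
    """Fetch one page; return (status, body_length, seconds_elapsed)."""
    start = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read()
    finally:
        conn.close()
    return response.status, len(body), time.monotonic() - start


def run_experiment(host, port, idle_count=1000, path="/"):
    """Time one fetch alone, then again with idle connections held open."""
    baseline = timed_fetch(host, port, path)
    idle = open_idle_connections(host, port, idle_count)
    try:
        loaded = timed_fetch(host, port, path)
    finally:
        for sock in idle:
            sock.close()
    return baseline, loaded
```

Calling run_experiment("localhost", 80) against a test server returns the two timings to compare. Only do this to a server you own, and note that the per-process file descriptor limit on the client side may need raising before a thousand sockets can be opened.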

I'm taking goingware's advice and testing this high-performance single-process component-based server stuff on my 486DX4/120 with 80 megs of RAM.

11 Jun 2002 (updated 11 Jun 2002 at 15:42 UTC) »
raph: I have found that it is hard to interest people in simpler technology. For example, my XPLC project, a component system, frequently gets a negative reception because it doesn't support distributed (out of process) objects, method interception or a slew of other features that systems like COM/DCOM/.NET or CORBA have.

The funny thing is that the point of XPLC is that it doesn't have those features ("XPLC" stands for "Cross-Platform Lightweight Components"), on the grounds that simpler is better, and that we'd see more component-based software if the overhead of turning a library into a component weren't orders of magnitude worse than that of a regular library. XPLC doesn't do out-of-process objects or method interception so that it has bounded method invocation overhead (C++ virtual call level of performance). It leaves out many other features so that it is really easy to make an application that uses components or to make new components.

There is also a similar phenomenon with multithreading, where making an application (often a network I/O-bound one) multithreaded instantly makes it "better". As alan said: "A computer is a state machine. Threads are for people who can't program state machines."
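For those who haven't written one, this is the kind of state machine I mean, sketched in Python. The little line-based protocol (a DATA command followed by message lines and a lone "." to finish) and the class name are invented for the example. Each connection keeps its own state and buffer, and feed() never blocks, so a single thread can interleave as many connections as it likes.

```python
class Connection:
    """Per-connection state machine for a hypothetical line-based protocol.

    States: "command" (expecting a command line) and "data" (collecting
    message lines until a lone "."). feed() never blocks, so one thread can
    interleave any number of connections.
    """

    def __init__(self):
        self.state = "command"
        self.buffer = b""
        self.message = []
        self.replies = []

    def feed(self, chunk):
        """Consume whatever bytes arrived; act on each complete line."""
        self.buffer += chunk
        while b"\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\n", 1)
            self._on_line(line.rstrip(b"\r").decode("ascii"))

    def _on_line(self, line):
        if self.state == "command":
            if line == "DATA":
                self.state = "data"
                self.replies.append("go ahead")
            else:
                self.replies.append("unknown command")
        elif self.state == "data":
            if line == ".":
                self.replies.append(f"stored {len(self.message)} lines")
                self.message = []
                self.state = "command"
            else:
                self.message.append(line)


# Two connections, fed in arbitrary fragments by one thread.
a, b = Connection(), Connection()
a.feed(b"DA")
b.feed(b"HELLO\n")
a.feed(b"TA\nfirst li")
a.feed(b"ne\nsecond line\n.\n")
```

Everything a thread would have kept on its stack while blocked in a read lives in the state and buffer fields instead, which is all there is to it.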

It seems that people want technology that has sex appeal. A simple technology that does what they want easily and quickly just isn't enough.

But nobody can tell me of a really popular distributed application besides the Sun NFS/NIS family.

XPLC

A new release, on a converging path toward a generic dynamic loader...

Life

The OLS starts on the 26th, which is the day right after my significant other's birthday, for which I absolutely want to be there. And there is this PILS talk about pluggable components right at the very beginning. I'll have to race to Ottawa on that morning it seems!

28 May 2002 (updated 28 May 2002 at 22:26 UTC) »
XPLC

I really wonder what this PILS thingy that will be talked about at the OLS is. From what I can tell, it's pretty close to XPLC (save for the fact that automake is crazy). We'll see.

Star Wars

I went to see Attack of the Clones with some friends the other day.

When I saw the scene where they have the giant sheep and Padmé is walking down the hill, I started singing "the hills are alive with the sound of music", not so quietly. Pretty much everyone within earshot of me in the theater laughed.

Ottawa Linux Symposium

I have been biking to work more and more, and I just noticed that the OLS is going to have a "Hacker Bike Ride" at the end. This is nice!

Modern C++ Design

Pretty good book. This guy is going to go crazy soon, doing template magic the way he does.

Please reply by e-mail, thanks!

I played the Atari 2600 Space War this weekend on an Intellivision with the Atari 2600 attachment. This is such a good game!

I tried finding a good one for Linux, but the only one I found, KSpaceDuel, is way too complicated and it has a few gameplay issues compared to the original one (the energy and damage are displayed as numbers rather than some visual widget, please!).

If you know of a good Space War for Linux, write me at pp@ludusdesign.com.

XPLC

Got a new release prepared, with monikers and other things coming together. I'm actually getting paid to work on this now, which I find really cool (thanks apenwarr!).

XPLC

This is really weird. In my last diary entry, I was wondering where all of the people that wanted a lightweight component system went when I gave them XPLC. Maybe they went to Korelib, which is so close to XPLC (and also seemed to appear 8 months after XPLC came out) that I wonder if they read my article and then went their own way. I just fired off an e-mail to the project lead.

It's pretty nice, from what I can see, doing things slightly differently: very Qt/KDE oriented (but still independent), more complicated to use (but is more featureful than the current XPLC release) and uses strings instead of UUIDs for identifying components (potential for collisions, but easier to use), etc...
