Older blog entries for apm (starting at number 24)

I've been doing a little reading on Perl 6 (the new version of Perl currently being designed) and Parrot, the new virtual machine (VM) for Perl and possibly other languages.

You can read some of Larry Wall's (Perl's inventor) thoughts on Perl 6 in his "Apocalypses" at http://dev.perl.org/perl6 (along with some material on Parrot). Larry and the Perl people are a smart bunch. But I wonder if they're too smart. Maybe I'm wrong, but they seem to be making Perl so complex that the vast majority of users won't have a clue what they're talking about. I think C++ suffers from this as well. Brilliant hacks, but often beyond us mere mortals. It's easy for the language designer to say "you can just use what you need". But what happens when you have to read (and understand!) someone else's code that uses a different "subset" than the one you know?

Luckily, I'm not brilliant enough to have this problem with Suneido. Although it's all relative and I guess some people would say I have already fallen into this trap. :-)

One interesting thing about Parrot is that it's register based rather than stack based like most other VMs (e.g. Suneido, Java, Smalltalk). This point was brought up in the Advogato discussion of the Suneido language (http://www.advogato.org/article/209.html) which referenced a paper saying registers are better than a stack (http://cm.bell-labs.com/who/rob/hotchips.html). Interestingly, this paper claims reference counting garbage collection is better, whereas Parrot claims the opposite.
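To make the stack versus register distinction concrete, here is a rough sketch of two toy interpreters evaluating 2 + 3 * 4. The instruction sets and all the names (StackInstr, RegInstr, run_stack, run_register) are made up for illustration; this is not Parrot's or Suneido's actual bytecode.

```cpp
#include <cassert>
#include <vector>

// Toy stack machine: operands are implicit (whatever is on top of the stack).
enum class StackOp { Push, Add, Mul };
struct StackInstr { StackOp op; int value; };  // value is used only by Push

int run_stack(const std::vector<StackInstr>& code) {
    std::vector<int> stack;
    for (const StackInstr& in : code) {
        if (in.op == StackOp::Push) {
            stack.push_back(in.value);
            continue;
        }
        assert(stack.size() >= 2);  // binary operators need two operands
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(in.op == StackOp::Add ? lhs + rhs : lhs * rhs);
    }
    assert(!stack.empty());
    return stack.back();
}

// Toy register machine: every instruction names its destination and operands.
enum class RegOp { Load, Add, Mul };
struct RegInstr { RegOp op; int dst; int a; int b; };  // Load: regs[dst] = a

int run_register(const std::vector<RegInstr>& code, int result_reg) {
    int regs[8] = {0};
    for (const RegInstr& in : code) {
        switch (in.op) {
        case RegOp::Load: regs[in.dst] = in.a; break;
        case RegOp::Add:  regs[in.dst] = regs[in.a] + regs[in.b]; break;
        case RegOp::Mul:  regs[in.dst] = regs[in.a] * regs[in.b]; break;
        }
    }
    return regs[result_reg];
}

// 2 + 3 * 4 "compiled" by hand for each machine.
std::vector<StackInstr> stack_program() {
    return { {StackOp::Push, 2}, {StackOp::Push, 3}, {StackOp::Push, 4},
             {StackOp::Mul, 0}, {StackOp::Add, 0} };
}

std::vector<RegInstr> register_program() {
    return { {RegOp::Load, 0, 2, 0}, {RegOp::Load, 1, 3, 0}, {RegOp::Load, 2, 4, 0},
             {RegOp::Mul, 1, 1, 2}, {RegOp::Add, 0, 0, 1} };
}
```

Both programs produce the same result. The stack version spends its time pushing and popping, while the register version has wider instructions that say exactly where each value lives. Which is faster in a real VM depends on much more than this toy shows.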

Parrot claims it will be general purpose, not limited to running Perl. Maybe a future version of Suneido should use Parrot?

Other interesting points from Perl 6:

  • return, break, continue are "exceptions", similar to how they are implemented in Suneido blocks
  • More "statements" are expressions, e.g. a switch is an expression that has a result value. I've been asked why Suneido didn't make all/more statements expressions.
  • Perl 6 is considering .name as shorthand for self.name just as it's shorthand for this.name in Suneido
  • You can put "catch" in any "block" including a function body, "try" isn't required. I like this.
  • do { ... } is an expression that "returns" the result of executing the block
  • Although it was proposed that you shouldn't have to "declare" lexically scoped variables, Larry is against this, saying it's too confusing otherwise. But if, as most people recommend, you keep all functions/methods small, then I don't think this argument holds.
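
As an illustration of the first point, here is a minimal C++ sketch of break and continue implemented as exceptions when the loop body is a block (a callable) rather than inline code. BreakSignal, ContinueSignal, for_each_item and sum_odds_until are hypothetical names; this shows the general idea only, not how Suneido or Perl 6 actually implement it.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical marker types thrown by a block to signal loop control.
struct BreakSignal {};
struct ContinueSignal {};

// Calls block(item) for each item. The block is an ordinary callable, so it
// cannot use the language's own break/continue on this loop; it throws instead.
void for_each_item(const std::vector<int>& items,
                   const std::function<void(int)>& block) {
    assert(static_cast<bool>(block));  // an empty std::function cannot be called
    for (int item : items) {
        try {
            block(item);
        } catch (const ContinueSignal&) {
            // nothing to do: carry on with the next item
        } catch (const BreakSignal&) {
            return;  // leave the loop early
        }
    }
}

// Example block: sum the odd values, stopping at the first value above limit.
int sum_odds_until(const std::vector<int>& items, int limit) {
    int sum = 0;
    for_each_item(items, [&](int x) {
        if (x > limit) throw BreakSignal{};      // acts like "break"
        if (x % 2 == 0) throw ContinueSignal{};  // acts like "continue"
        sum += x;
    });
    return sum;
}
```

The loop construct catches the signals, so from the caller's point of view the block behaves as if it had real break and continue statements.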

But my biggest lasting impression is that Perl is getting very big and very complex. You can change the compiler's parsing grammar on the fly for gosh sakes. How are you going to understand code where you don't even know the syntax without looking at how the writer modified the language?!

Wow, it's been a long time since I posted an entry here. Too busy to ramble on about myself I guess. Or maybe I don't find myself interesting enough to ramble about.

Still plugging away on Suneido. I'm happy enough with how it's going, although I wish there were more hours in the day. My to-do list continues to grow much faster than my done list.

Still, with the help of a few contributors, we're close to porting the server side of Suneido to Linux. It now compiles with GCC and successfully runs all the built-in tests. But this is with MinGW, still under Windows.

I'm ashamed to say I don't even have a Linux system set up (work is all Windows). But obviously, to complete the port, I'm going to need one. So, the age-old question of which distribution. I ended up thinking I'd use Redhat, because it's the most common (or at least that's my impression). So I started downloading the latest version. Two days later it's still downloading. I don't know if this is a busy time or if they're always this overloaded. Or, cynically, maybe they prefer slow downloads so people will buy the packaged product. Although, as far as I can tell, no one around here carries it. So while I'm waiting for the download, I thought I'd try Debian, partly because they had a minimal download which let you download the rest later, as required.

First, of course, I had to partition my hard drive. Strangely, there don't seem to be any free non-destructive partitioning programs that handle NTFS. So I forked over the money to upgrade an old version of Partition Magic.

Unfortunately, Debian's install is not quite as "automatic" as some of the others. My first problem was the network card. I picked what I thought was the right driver and it seemed to recognize it, but it didn't work. I ended up installing a copy of Mandrake I had around to see if it would figure out the network card. It did, so I went back to Debian and picked the same driver and it worked.

I probably would have just stuck with Mandrake, at least temporarily, but it kept hanging up - judging by the state of the screen when it hung, I'd guess some problem with video drivers. What happened to uncrashable Linux? So, back to Debian. Next step was to install some more packages. Luckily, downloads from the Debian site were much faster than Redhat. The next hurdle was getting X configured. With basically zero knowledge, this was a challenge. I have a Dell computer with what I assumed was a fairly standard ATI Rage 128 video card. However, there appear to be dozens of versions of this card. Eventually, after a 12-hour day, I gave up and went home. I couldn't even get it to work in standard VGA mode, which I'm sure is my own stupidity, but no less frustrating.

Oh well, I'm getting caught up on Linux, the hard way (which is often the best way). Hopefully my Redhat download will complete eventually and I'll give that a try. Or maybe if I get motivated I'll fight with X configuration on Debian.

Meanwhile, it's back to threading issues in Suneido.

I did some more fooling around with ACE, but it looks like it's too large and interdependent to try to use just the pieces I need. Too bad, overall I'm pretty impressed by it. After reading the ACE book I picked up Pattern Oriented Software Architecture - Volume 2 - Patterns for Concurrent and Networked Objects (POSA2). It covers a lot of the same things as ACE but from a more abstract perspective.

I've been prototyping a new design for the client-server network code in Suneido based on the ideas from ACE and POSA2. It looks like it'll be an improvement over the current code - cleaner, faster, and more portable.

Been "strongly encouraging" pair programming with my programmers. I don't think they're all totally sold on the idea - I'm not even sure I'm 100% convinced. But I have to say that the more we do it, the more I'm sold on it.

jhill - thanks for certifying me as master - I appreciate the gesture. :-)

Picked up the new C++ Network Programming Volume 1 - Mastering Complexity with ACE and Patterns by Schmidt and Huston yesterday. I knew about ACE but I'd never really looked into it too closely. Now I think I might. I'm always much happier to use a third party library if I know about the "philosophy" behind it. I'm always pretty cautious about adding dependencies on third party software to Suneido. It really is a "dependency" - if they have bugs or don't support what you need you can be in trouble. Even if it's open source, that doesn't mean you've got the time or understanding to fix it yourself. But at the same time, leveraging off other people's work can have major benefits. Suneido uses Scintilla for source code editing. We could have rolled our own, but we got a lot nicer component for a lot less effort. Of course, something like that is not too serious a dependency - it could be replaced fairly easily. If I rewrote Suneido's networking to use ACE, that wouldn't be so easy to change. My other concern is size. ACE is large! I don't want to double the size of Suneido just to get a little better networking! I'll have to play around and see what kind of size increase it would entail.

The other nice part about getting a "good" computer book is that it tends to really motivate me, not just in the area of the book, but all over. Maybe because it's "inspiring" to read about other people doing a really excellent job. It makes you want to raise your own standards, do better work, improve stuff. I started thinking about multi-threading Suneido's database server - something I've always been nervous about. And I thrashed out how to clean up the database history mechanism to be cleaner and more complete.

I spent a little time playing with VMware the other day. I really like the idea of being able to run multiple operating systems without re-booting. It would be really useful for testing. And it would let me run Linux as well as Windows. (Too much of my work is Windows based to run Linux most of the time.) In fact when I recently got a new computer I deliberately put more memory (512mb) and more disk (80gb) in order to have space for VMware. But ... as soon as I installed it, Windows XP started to act flaky. Strange delays, networking problems, video problems. All intermittent and unrepeatable. So, much as I like the idea, VMware had to go - Windows is flaky enough as it is. Besides, how much use would it be for testing if it's flaky? Who knows what the source of the problems was, but I don't have the time to try to track it down. Maybe it isn't fully compatible with Windows XP? Or my hardware? Another aspect that scares me away is that there doesn't seem to be any way to "turn it off". Even when you're not using it, it's got all kinds of stuff running - yuck! I wonder if using Linux as the host OS would be better? Maybe I'll give that a try at some point. It's too bad, the idea is cool.

I wonder if I'll ever make Master on Advogato? Without sounding too egotistical, I think my experience, expertise, and investment in open source is as great as many of the other people rated as Master. But I don't know anyone "important", and I'm not working on a highly visible project. (I know that "major project" is part of the criteria, but many of the masters don't seem to be working on "major" projects.) I guess in a way it's a weakness of the rating system - people have to know you. Of course, it doesn't really matter - it is just an ego thing - everyone likes recognition :-)

Enough thinking out loud - time to do a little work!

Got new computers at work and at home, nothing too special - Dell 1.6ghz P4's. They came with XP - so far it seems okay. I'm afraid I find it a little unnerving when it does things like set up new hardware without telling me. What was that Clarke quote? Something like "any sufficiently advanced technology is indistinguishable from magic". I put 512mb of memory and an 80gb hard drive in my work machine - I'm planning to put VMware on it so I can test Suneido on different versions of Windows (and also run Linux). I still have some issues that only seem to surface on Win98 (which also happens to be the most common version right now, unfortunately). I only put 256mb and a 20gb drive in the home machine. I'd tell you these numbers boggle my mind but that would date me.

I also finally took the plunge and installed a Linksys wireless access point switch to connect my notebook and desktop at home. Only just installed it last night, but it seems to work pretty slick. Finally I can access the internet and print from the couch :-) Of course, one of the reasons I preferred to use my notebook at home was because it was faster than the desktop - that situation is now reversed.

I've been pretty frustrated lately because I don't seem to be spending any time on "real" work. I've been trying to hire a programmer and a customer support person and we got a big pile of resumes this time. Even after weeding out the obvious rejects, we still had about 40 interviews to do. Pretty hard to get anything done in between interviews. And of course, switching computers at work and home is also time consuming. Hopefully things will settle down a bit in the next week or two!

Finally got around to taking a look at Eclipse - the general purpose IDE that IBM released open source. Read a bunch of the articles on the site. There are some pretty interesting ideas. Some aspects I liked, some I didn't.

  • Although the Eclipse IDE is general purpose, it's written in Java, and to tailor it or extend it you use Java. Since I'm primarily a C++ programmer this is a bit of a negative.

  • They chose to write their own (yet another) portable gui framework (SWT). I thought Java already had a portable gui? I find their approach very reasonable though. It makes me want to work on a portable gui for Suneido along the same lines.

  • There's a good discussion of gui "resource" "disposal", i.e. how and when to free fonts, etc. and why finalization isn't a good approach.

  • Eclipse has an interesting plugin architecture. It seems relatively simple yet flexible, with attention to performance. Plugins can define new "views", add to existing menus and toolbars, and even insert material into the documentation.

It makes me feel that Suneido's IDE is somewhat limited. I think it would definitely be worthwhile to incorporate some of Eclipse's ideas.

I've had fun the last few days getting Suneido to automatically change the mouse cursor to the hourglass wait cursor whenever it's busy. (This is on Windows.) You might think that would be easy, but it's actually fairly tricky. You can read about it on Codeproject.

I've wanted to do this for quite a while but been putting it off because I fully expected it to be frustrating. (And it was!) My thoughts after spending almost two days on this:

  • Why isn't this built into Windows?

  • Why hasn't anyone else done this? (Or if they have, why haven't they published it.)

  • Documentation sucks. None of the critical information I needed was in the documentation.

  • Thank goodness for newsgroups. I'm not sure I would have solved this without comp.os.ms-windows.programmer.win32.

  • Was it worth spending this much time on such a minor detail?

Despite the frustration (or maybe because of it?) I'm still pretty happy with getting this done. It's a little thing, but in many ways I think a good product is the gestalt of a whole bunch of such "little" things.

It sure feels good when you can make a specific change to some software and get a definite improvement!

Suneido uses a cost based optimizer for its database queries. For each operation (e.g. join, union), it estimates the cost of different strategies. The problem is that if you have a complex query with a lot of operations, then the number of possible combinations of strategies becomes very large - a "combinatorial explosion". So query optimization slows down a lot for large queries.

However, I had a feeling there was a lot of redundant work going on recalculating the same costs over and over in different combinations. If this was the case, then caching the calculated costs for each operation could save a lot of work. So I added a simple cache to each operation class, using a linked list. After calculating a cost for a certain strategy I added it to the cache. When asked for the cost of a strategy I checked in the cache first to see if it had already been calculated. This involved 20 or 30 lines of code in the base class for query operations. All the automated tests still ran - a good sign that I hadn't broken anything.
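
A minimal sketch of that kind of per-operation cost cache, under stated assumptions: the Strategy key, the QueryOp base class, and the toy cost formula below are all hypothetical stand-ins, not the actual code in query.h and query.cpp.

```cpp
#include <cassert>
#include <forward_list>

// Hypothetical stand-in for whatever identifies a strategy; the real key
// used by the optimizer is not described in the post.
struct Strategy {
    int index_id;    // which index would be used (assumed)
    bool is_cursor;  // some strategy flag (assumed)
    bool operator==(const Strategy& other) const {
        return index_id == other.index_id && is_cursor == other.is_cursor;
    }
};

// Base class for query operations, with the cache living in the base.
class QueryOp {
public:
    virtual ~QueryOp() = default;

    // Returns the cached cost if this strategy was already evaluated,
    // otherwise computes it once and remembers it.
    double cost(const Strategy& s) {
        for (const Entry& e : cache_)
            if (e.strategy == s)
                return e.cost;
        double c = compute_cost(s);
        cache_.push_front(Entry{s, c});
        ++computed_;
        return c;
    }

    // Simple counter: how many costs were really calculated.
    int computed() const { return computed_; }

protected:
    // Each operation (join, union, ...) supplies its own estimate.
    virtual double compute_cost(const Strategy& s) = 0;

private:
    struct Entry { Strategy strategy; double cost; };
    std::forward_list<Entry> cache_;  // singly linked list of results
    int computed_ = 0;
};

// Toy operation with a made-up cost formula, for demonstration only.
class ToyJoin : public QueryOp {
protected:
    double compute_cost(const Strategy& s) override {
        assert(s.index_id >= 0);
        return 100.0 * (s.index_id + 1) + (s.is_cursor ? 10.0 : 0.0);
    }
};
```

A linear search through a linked list sounds slow, but it only has to beat recomputing the cost of a whole subtree of the query, and each operation sees relatively few distinct strategies. Keeping the cache in the base class is what keeps the change local.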

With some simple counters in the code I found that on the automated tests (mostly fairly simple queries) the cache was eliminating about 2/3 of the cost calculations. Not bad, but the real test would be complex queries. One of the more complex queries in our accounting package was taking about 2 seconds to optimize. With the cache it now took about .1 seconds - 20 times faster!

Not bad for about an hour's work! But the best part was that the structure of the code allowed me to make this change easily and locally without disrupting anything else. Yes, I could've included the caching when I originally wrote the code. (If I'd known at that point it was worthwhile.) But I'm a firm believer in doing simple versions first. If you try to include "everything" in the initial version, you'll never get it done, it'll never work, and you'll still have to change it later.

If you're interested, you can see the actual changes to query.h and query.cpp in CVS on SourceForge.

Argh!!! I was in the middle of updating the Suneido website when the server started giving errors and I couldn't finish the update. Now stuff is "broken" and I can't fix it! DataPipe hosts our site, and for the most part they've been pretty good, but it's frustrating in this kind of situation to have to wait for their tech support to get around to looking at the problem. I'm tempted to set up our own server but I'm not sure our internet connection is up to it. We're in a research park so the bandwidth is shared. Can't really afford to get our own dedicated connection. Maybe someday.

I was feeling pretty good about things before this happened. The site has been up for over a year, and I've managed to average a release every month. And so far the releases have all been pretty fair quality, IMO.

Progress is always slower than I'd like, but I don't really feel like working more than the 60 or 70 hours a week I already put into it. Suneido has attracted a few regular small-scale contributors, but so far no one else has really got involved in a major way. Not that I had any naive notion that masses of open source developers would immediately flock to the project. "If you build it, they will come." doesn't really apply! There are a zillion open source projects and so far, obviously, Suneido hasn't convinced anyone it's worth major investments of time. Maybe that'll come, and maybe not. In the meantime, I'll keep plugging away. I just hope our website comes back!

Something that's been on my todo list for Suneido for a long time is cleaning up a few places in the client-server network code where there might be bad interaction with the TCP/IP Nagle algorithm. However, I'd never actually done any testing to see if it was a real problem.

I won't try to explain the Nagle algorithm in detail. I'd recommend "Effective TCP/IP Programming" by Snader if you're interested. But in short, it's a standard technique used to improve the efficiency of TCP/IP. However, it assumes you alternate between sends and receives, e.g. send a request, receive a response. If you don't follow this pattern, for example, you do two sends, e.g. a header and then data, then the Nagle algorithm can slow you down to 200ms per send/receive, or only 5 "messages" per second - pretty slow. One "fix" is to simply disable Nagle, but then you lose the improvements it brings.

So, last night I fixed the code to combine multiple sends and this morning I did some quick benchmarks. On the client side, the places affected were output, update, and delete. Sure enough, with the old code I was only getting 5 outputs per second (ouch!). With the new code I'm getting 2500 outputs per second, or 500 times faster! Not bad for a few hours' work! (NOTE: This only affects client-server operation across the network, not standalone use. Standalone I get about 8000 outputs per second.)
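
The shape of the fix can be sketched like this. Everything here is an assumption for illustration: the 4-byte length header is a made-up wire format, and SendFn stands in for a real socket send() so the example runs without a network. It is not Suneido's actual protocol code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Stand-in for a socket write. In real code this would wrap the platform's
// send() call; each invocation corresponds to one small write on the wire.
using SendFn = std::function<void(const std::string& bytes)>;

// Hypothetical wire format: 4-byte big-endian length, then the payload.
std::string make_header(std::size_t payload_size) {
    assert(payload_size <= 0xffffffffu);  // must fit in the 4-byte length
    std::uint32_t n = static_cast<std::uint32_t>(payload_size);
    std::string header(4, '\0');
    header[0] = static_cast<char>((n >> 24) & 0xff);
    header[1] = static_cast<char>((n >> 16) & 0xff);
    header[2] = static_cast<char>((n >> 8) & 0xff);
    header[3] = static_cast<char>(n & 0xff);
    return header;
}

// Old pattern: two small writes back to back. With Nagle enabled, the second
// write can be held back until the first one has been acknowledged.
void send_message_two_writes(const SendFn& send, const std::string& payload) {
    send(make_header(payload.size()));
    send(payload);
}

// Fixed pattern: assemble the whole message in one buffer and write it once.
void send_message_combined(const SendFn& send, const std::string& payload) {
    std::string message = make_header(payload.size());
    message += payload;
    send(message);
}
```

The bytes on the wire are identical either way; only the number of writes changes. Copying the payload into one buffer costs a little, but that is tiny next to a 200ms stall, and Nagle stays enabled for everything else.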

It just goes to show that knowledge is a powerful tool. If I hadn't read about this problem it could have been a long time before I figured it out. Of course, if I had any brains, I'd have made this change a long time ago!
