Older blog entries for ahosey (starting at number 32)

jlf and I just had an interesting conversation. We were talking about the parallels people often try to draw between electrical engineering and software engineering — i.e., the notion that an EE can take a packaged IC off a shelf and just use it, knowing the inputs and knowing the outputs. The parallel idea in software is often discussed as "reusable software." Related to that, I remember Rob Pike saying something about how 80-90% of the work he did on Plan9 was just "compatibility" stuff - POSIX, TCP, etc. So... what if we were all able to download a POSIX library and just use it in our work, a la pulling an IC off a shelf?

The language <=> language problem is not a big issue; it's been solved many times in many ways. (One) problem with the idea that's not often discussed is that software, and software engineers, have to deal with all the different hardware architectures and operating system platforms out there. This is where the software engineering <=> electrical engineering analogy breaks down. Imagine you're an EE and the laws of physics are slightly different every time you start a new job. Now those pre-packaged ICs are less useful, yeah? That's the difficulty facing reusable software. Ultimately any piece of software has to run on some piece of hardware. There are plenty of POSIX-compatibility libraries out there, but in order to be used they all have to run on an appropriate hardware/OS platform. That's why there are so many of those libraries!

So what's to be done? Right now, and I really stress the "right now," the great leveller we have is the network. So imagine there was a big machine out there on the Internet which exported the entire POSIX API via some form of RPC. Now anyone on any platform in any language really would have access to a reusable POSIX implementation... Erm, provided their computer is attached to the network. For that reason (and others) POSIX via RPC is not really a serious solution, but I wanted to throw out that example in the search for truly reusable software.
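As a thought experiment only, such a service could be sketched in a few lines. Everything here is my own invention for illustration — the use of XML-RPC, the `make_server` name, and the particular POSIX calls exported are all assumptions, not a description of any real service:

```python
# Hypothetical sketch: exporting a small slice of the POSIX API over
# RPC, so any networked client on any platform/language can call it.
import os
from xmlrpc.server import SimpleXMLRPCServer

def make_server(host="localhost", port=8000):
    server = SimpleXMLRPCServer((host, port), allow_none=True)
    # Register a few POSIX calls under simple names; a client would
    # invoke them remotely instead of linking a local libc.
    server.register_function(os.getpid, "getpid")
    server.register_function(os.listdir, "listdir")
    server.register_function(lambda path: list(os.stat(path)), "stat")
    return server

if __name__ == "__main__":
    make_server().serve_forever()
```

Of course this sketch makes the drawbacks obvious too: every `stat()` becomes a network round trip, which is exactly why it's a thought experiment and not a serious solution.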

Just to be clear, my ranting about Red Carpet was a happy rant. I really like it. I found a couple nits to pick, of course, which I will probably put in the bug tracker as suggestions.

dan, schoen, and cdent have been making interesting and I think insightful comments about the stock market and some business practices the world seems to take as given. I think the relentless push for growth and expansion by corporations is ultimately driven by the stock market. The corporations have to point to continued growth in order to keep their shareholders happy and keep their market cap up. Why isn't it enough to just meet payroll and put just a little extra in the bank? A few months ago I read in Business Week that the big telcos are starting to get nervous because their shareholders are selling. Why are the shareholders selling? Because these big telcos are posting profits of $5 billion instead of $10 billion. Not revenue, profit. And that's not enough. No kidding!

I haven't yet worked for a public company and I hope I am never in a position where I have to. I feel kind of guilty about holding a 401k but that's a more complicated issue. I want to have a retirement fund, I don't want to be a burden on my children. But in order to keep pace with inflation money must be "invested" and any time money is "invested" in any fashion it is eventually reaching someone who is expected to provide "return on investment." Which is fine, the privately-held company where I work is the same principle - they give us paychecks and we as a group are expected to create revenue which meets or hopefully exceeds expenses. The problem I have with the stock market is an issue of scale. It seems that among publicly-held companies "return on investment" has finally evolved to mean "astronomical profits achieved by draconian means." I'd rather my money not be a part of that.

24 Nov 2000 (updated 24 Nov 2000 at 03:29 UTC) »

From Closing with the Enemy by Michael Doubler:

"On 16 September Montbarey's besiegers literally hammered, burned, and blasted the fortress into submission. The Churchill [flamethrower tanks] delivered the opening blows by circling the fortress while scorching it with plumes of flame. Montbarey's walls had proved impervious to direct and indirect fire, so the battalion commander decided to smash his way inside through the main gate. A TD [tank destroyer] rolled forward and fired fifty rounds at point-blank range in an attempt to open the entrance. However, the Germans had reinforced the gate with scrap metal, rocks, and heavy debris, and it refused to budge. The TD's unsuccessful efforts prompted the battalion commander to send for a 105-mm howitzer, and in a scene similar to the siege operations of eighteenth-century warfare, soldiers brought the gun forward and placed its muzzle twenty yards from the main gate. The cannon hammered the gate partially open with more than fifteen rounds and then sent numerous high explosive and white phosphorous shells slamming into the inner courtyard."

Wow. That would have been something to see, huh? Must have been a hell of a door.

After last night's diary entry I put my money where my mouth is and also made a mailing list for mod_extract_forwarded. I'd been putting that off for a while. I hate ezmlm. I despise qmail.

I posted an announcement to Freshmeat with nothing but the mailing list announcement. Curious to see if that goes thru.

After some experience with publicizing mod_extract_forwarded as "my own" little piece of free software, I decided that every free software project no matter how big or how small should have a mailing list. I decided this after someone sent me some mail and we collaborated on a few pretty important bug fixes. That's when I realized, I knew other people had downloaded the module but I didn't know who or if they were using it. If they were, they really needed these fixes but I had no vector of communication other than Freshmeat.

The natural impulse might be to think that smaller programs don't merit the overhead of a mailing list, but each project generates mail roughly in proportion to its complexity and popularity, so it all works out: no project list is too small to subscribe to, and it's the big ones that will flood you with mail. Conversely, you should be on the mailing list for every piece of free software you use even occasionally. It doesn't matter how many lists you're on, just how much mail each one generates.

So to make the point here, I made a mailing list for xload-snmp when I first put it up. The list stayed empty even though I got some downloads on the software. But after a few months, someone out of the blue sent mail to the list saying "hey it no workie for me." Because of that conversation I spent a couple days on the code adding a few more things really needed to make it useful for more than just Unix load monitoring. So now I feel that my mailing list philosophy has been validated.

I was also surprised by the substantial gains in functionality I got from just a few lines of code. I also understand now why everyone hates Xt. I thought X resources were hairy as an administrator, the API on the programming side is, if anything, worse.

17 Nov 2000 (updated 17 Nov 2000 at 06:40 UTC) »
WARNING for you academic Advos: sysadmin geek talk below. Skip it.

Waldo: put Windows on first, then Redhat, and be sure to tell Redhat's LILO setup screen about your Windows partition. (I think the newer Redhats actively look for Windows installs on the disk.) If the installer sees the Windows partition it should set you up with a LILO stanza that will do the job. If for some reason you must have Redhat on first, the Windows install will kill LILO when it installs its own MBR. Boot from a rescue disk, mount the Linux partition, and rerun lilo with the "-r" flag.
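A stanza along these lines is what the installer's LILO setup should produce. The device names here (`/dev/hda1` for Windows, `/dev/hda2` for Linux) are assumptions for illustration — adjust to your actual partition layout:

```
# /etc/lilo.conf (illustrative; device names are assumed)
boot=/dev/hda           # install LILO to the MBR
prompt
timeout=50

image=/boot/vmlinuz     # the Linux stanza
        label=linux
        root=/dev/hda2
        read-only

other=/dev/hda1         # chain-load the Windows partition
        label=windows
```

For the rescue case: boot the rescue disk, mount the Linux root somewhere like /mnt, and run `lilo -r /mnt` so lilo does its work relative to the mounted root.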

Or wait, are you dealing with multiple hard drives? Ah, that is significantly more complex. Contact me.

cdent: when the restlessness overcomes the inertia, I shall go. The vaguely dissatisfied rumblings I occasionally air out are the seeds of that restlessness. And the hostile takeover may be closer than you think...

Kernels are fairly low level pieces of code, in which the common code paths are traversed a lot. There is probably plenty of justification there for micro-optimization. As you move up the "code ladder" i.e. kernel, libraries, servers, apps[1] - optimization gets less emphasis. It seems to be an article of faith among app programmers that "the hardware is fast enough." Visual Basic perhaps being the modern culmination of this attitude. I recently read Software Runaways by Robert Glass and some of the articles collected there suggest that faith is misplaced. Software written with 4GLs took minutes to do what the old "antiquated" software did in seconds. Not that 4GLs are inherently bad, but in these cases they were misapplied.

Maybe I will invest in some Knuth books for my collection, that should get me back into algorithms and math. (Not that Knuth, the other Knuth.) I still have my Lewis and Denenberg book on data structures (excellent book) and sometimes I pull it out and read some of the proofs or the algorithms. One problem with college is that the students, willing as they may be, usually don't have the perspective to really grok the lesson. Now that I've been doing this for a few years the books make a lot more sense.

I did a Google search for "discrete structures" and found this page with some interesting maxims. My favorite is this one because, oddly enough, it is under the heading "General Operating Principles"

1. There is no general methodology for solving a problem.


[1] Strictly speaking, servers are applications but are a special case in my little taxonomy because they mostly interact with other software, not people, so their performance demands are somewhere between libraries and so-called end-user apps.

barryp: Logarithms... it's starting to come back to me now. Maybe I'm not just stupid, maybe I'm just out of practice. I'm reading back thru schoen's diary now to try and limber my brain.

Using the integer coefficient vs. the float coefficient would be a decision of "fast fit" vs. "best fit". Integer multiplication would be a little faster but you'd waste a few more bytes in some circumstances. Space vs. time, the classic tradeoff in computerized problem solving. Of course that decision is a "micro-optimization" anyway - unless n is really freaking big - and micro-optimizations are mostly frowned upon these days. On the other hand Musashi said that the true tests of skill are "the small, fine works."
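The tradeoff can be made concrete with a quick sketch (Python used just for illustration; the `fast_fit`/`best_fit` names are mine, not anyone's API):

```python
# "Fast fit": the integer coefficient 3 -- one cheap multiply, always
# enough room, occasionally wasteful.
def fast_fit(n):
    return 3 * n

# "Best fit": the exact worst case -- the decimal length of the largest
# unsigned integer that fits in n bytes, 256**n - 1.
def best_fit(n):
    return len(str(256 ** n - 1))

# Compare the two for a few common integer widths:
for n in (1, 2, 4, 8, 16):
    print(f"n={n:2d}  fast={fast_fit(n):3d}  best={best_fit(n):3d}  "
          f"wasted={fast_fit(n) - best_fit(n)}")
```

For n = 4 (a 32-bit integer) fast fit reserves 12 bytes where 10 suffice — a couple of bytes traded for skipping the float math, which is exactly the space-vs-time call.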

I've found that most of the programming one does as a sysadmin doesn't strain the "advanced" fields of programming theory and it doesn't require any math at all. There's a lot of "glue" programming, which often requires one to be clever but not very deep. And in my case I do a lot of patches to free software, which makes me feel good but again these are very small hacks. There is one large piece of software done for the company in which I am applying increasingly sophisticated techniques. And it's working out well, which makes me feel good about being able to use that knowledge, but I'd like to do that more often than this job really requires. I mean I could structure my solutions in ways that would require a lot of custom code but that's hardly good business. It's better in the long run to take established (free software) tools, shim them together with a little code where needed, and set it loose. That's why I love well-structured modular software (like Apache), 'cause it readily lends itself to that sort of approach.

Wow. Three top notch responses to my little theorem. Props to barryp, Pseudonym, and especially schoen. In case anyone was wondering I wasn't trying to get you to do my homework. I haven't done homework in years... perhaps if I were doing homework my math wouldn't be so rusty.

I took a lot of math classes in college: some calculus, statistics, and discrete math thru the computer science department. I did well in most of them, but when it comes to real world application my math skills seem so feeble. It's like I know enough math to want to prove my ideas but not enough math to actually get the job done.

For instance, how did you guys know log(256^n) was the way to get the maximum number of symbols in an encoding? In my first attempts at a proof I had a function maxlength(n) but I didn't know how to break that function into mathematical elements to reach a QED.

14 Nov 2000 (updated 14 Nov 2000 at 02:21 UTC) »

Assume 8-bit bytes and strings composed of 1 byte per character. For an unsigned integer stored in n bytes prove:

(a) 3n bytes will always provide enough space to hold the decimal string representation of the integer.

(b) 3 is the smallest coefficient to provide enough storage for all values of n.

I know this to be true but proving (a) is harder than I thought.
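Here's a quick empirical check of both claims, plus the closed form that makes (a) provable: the largest n-byte unsigned integer is 256^n - 1, so its decimal length is floor(n·log10(256)) + 1, and since log10(256) ≈ 2.408 < 3, that floor is always at most 3n - 1. (Python used just to spot-check the claims; this is a sketch, not the proof itself.)

```python
# Empirical check of (a) and (b). The largest value storable in
# n bytes is 256**n - 1; count its decimal characters directly.
def max_digits(n):
    return len(str(256 ** n - 1))

# (a) 3n bytes always provide enough space for the decimal string.
assert all(max_digits(n) <= 3 * n for n in range(1, 65))

# (b) 3 is the smallest integer coefficient: with 2n, even n = 1
# fails, since 255 needs 3 characters but only 2*1 = 2 are available.
assert max_digits(1) == 3
assert max_digits(1) > 2 * 1
```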

9 Nov 2000 (updated 9 Nov 2000 at 19:33 UTC) »
ajv: I totally agree with your ideas about the "ramp up" time for a free software contributor, and that good architectures reduce ramp up time and make contributing more enjoyable, which vitalizes the project. For example I was able to make my first modifications to sawfish in less than an hour from the first time I looked at the source code.

I have, tucked away, an unfinished essay on pluggable scriptable software, with about half of it devoted to the social (or survival) advantages of such software in the free software world. I don't claim credit for the seed of the idea, I was just trying to flesh out some things that I heard Jim Blandy mention one time.

Chris keeps bugging me to post up the essay even though it's not complete. Maybe I'll do that.

Then sej said:

1) all the external documentation and design documents were no substitute for the direct reality of inspecting the architecture in a debugger. They served as a map for the territory, but were not the territory by any stretch of imagination. A large dose of experimentation, reasoning, and inter-programmer dialogue was required to build up a more detailed understanding of how things actually worked.

This is true. To return to my sawfish example, it was extremely useful to me to have sawfish-client available to examine the data structures of the running window manager, and interact with the window manager in real time and see my results either on the screen or by examining the internals of sawfish using sawfish-client. It was the excellent design of sawfish, combined with the excellent debugger/interpreter, combined with the excellent API documentation, that made working on sawfish easy and fun.

Interpreted languages usually have an advantage over compiled languages here, or at least they do if the language provides some form of eval(). eval() makes available the entire scope of the language from the debugger instead of just limited calls to API functions. This allows richer interaction with the running program.
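A toy sketch of the point — everything here, including the fake `windows` state and the `debug_eval` name, is invented for illustration and is not how sawfish-client actually works:

```python
# A debugger built on eval() can run arbitrary expressions of the full
# language against live program state, not just a fixed menu of
# inspection commands. Pretend this is a window manager's internals:
windows = [{"name": "xterm", "mapped": True},
           {"name": "emacs", "mapped": False}]

def debug_eval(expr, state):
    # expose the running program's state as the expression's scope
    return eval(expr, {}, state)

# any expression the language allows, composed on the fly at the prompt:
print(debug_eval("[w['name'] for w in windows if w['mapped']]",
                 {"windows": windows}))   # → ['xterm']
```

A gdb-style debugger limits you to the commands its authors anticipated; an eval()-based one hands you the whole language, which is the richer interaction described above.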
