Older blog entries for titus (starting at number 39)

(It's just work avoidance day, I guess...)

More on Ruby

I became curious about Ruby's approach to threads after reading about Cilk, a neat-looking extension of the C language that provides a system for doing multithreaded parallel programming. (Cilk's magic is in the scheduler, it seems.)

One of the persistent nags about CPython is that the global interpreter lock prevents execution of threads on multiple processors. If Ruby handled it differently (& "better"), maybe we could swipe the idea for Python. Long story short: after a bit of investigation, I infer that the Ruby interpreter isn't thread safe [1], [2]. (FWIW, after having written several C extension modules for Python, I think the way Python handles it is very clean and simple.)

Hmm, now that I think of it, I'm a bit surprised that Dr. Dobbs' troll article didn't trigger any (blog) discussion of the GIL in Python...

Extending Python, GvR, and the benefits of dictatorship

Damien Katz wrote a great story about his experience with the Lotus Notes Formula Engine, and I just wanted to share this quote:

Now you might think that I produced a bunch of design documents and specifications and presented them to the various senior engineers and architects, but I didn't. I remember being surprised by this myself. Even Wai Ki [his boss] didn't have much to say about my design or how it should be implemented. The philosophy was that if I did those things, everyone would meddle with the design and nothing would get done. It's truly easier to ask forgiveness than to ask permission, not to mention things get done a lot faster if you just do them.

I think the application in the context of the decorator and static types kerfuffles is pretty obvious. Even if I disagree with some of GvR's decisions, it's clear that sometimes (often? always?) the vision of a strong dictator is preferable to design by the masses ;).

Still, I must have missed something: on the Amazon Web Services Blog, GvR is caught saying "It may take another generation of programmers to get over the prejudice for static typing." I can't find the reference now (google only does so much when you can't remember any !#%!#%! keywords) but there's someone syndicated on PlanetPython who said, in effect, "GvR just wants to put this stuff in to Python to demonstrate that static typing doesn't matter". Hmmmmmmmmmmmmmmmmmm...

Backing up Advogato diaries

sye asks how my pull-advogato script differs from using wget to pull down the diary.

Maybe I'm missing something, but here goes:

  • One is programmatic, one is not. Since I was using this to try switching to PyBloxsom, with a slightly different post format, I wanted to do some content modification. (I removed the content modification from the version posted.)

  • One pulls down the entire diary, the other pulls down only new entries. (It can easily be changed to pull down only modified entries, but that was less useful than I thought because of the way pybloxsom works.)

  • One requires the use of an XML parser to grok the output, the other does not.

He is, of course, 100% accurate about there being no difference for the purpose of making a backup - although I don't think advogato takes its own XML back in, per se, so you couldn't restore directly from the XML.
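For the curious, the "only modified entries" variant mentioned above could be sketched roughly as follows. This is written for a current Python 3; `needs_refresh` is a name made up for illustration, and it assumes each backup file's mtime was set to the entry's last-modified time when it was saved.

```python
import os

def needs_refresh(filename, modified):
    """Return True if the local copy is missing or older than `modified`.

    `modified` is the entry's last-modified time, in seconds since the epoch.
    """
    try:
        info = os.stat(filename)
    except OSError:
        return True              # no local copy yet, so fetch it
    return info.st_mtime < modified
```

A pull script would call this once per entry and fetch only the entries for which it returns True.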

ORMs are Object-Relational Mappings

I was unnecessarily gnomic the other day when I was thinking aloud about PostgreSQL and cucumber. Still, someone understood me; Jacob Smullyan also uses PostgreSQL's table inheritance to underlay Python class inheritance, but he does it using PyDO ("use the latest code from CVS", he said). Someday soon I hope to meander through PyDO and SQLObject and steal any good ideas for my own code.

"Testing Darwin"

Discover magazine just published a great article on Avida. Good stuff!


p.s. mirwin -- thanks!

26 Jan 2005 (updated 26 Jan 2005 at 18:11 UTC) »
Poor man's advogato pull script

People seem to be losing diary entries occasionally, so I instituted a backup policy for my advogato diary. Here's the script.

(Kudos to advogato for having an XMLRPC API...)

#! /usr/bin/env python
import xmlrpclib, os, time, stat

server = xmlrpclib.Server("http://www.advogato.org/XMLRPC")
n = server.diary.len('titus')

entries = []
for i in range(n):
    print '... entry', i,
    created, modified = server.diary.getDates('titus', i)
    created = time.mktime(time.strptime(created.value, "%Y%m%dT%H:%M:%S"))
    modified = time.mktime(time.strptime(modified.value, "%Y%m%dT%H:%M:%S"))

    filename = '%d.txt' % (i,)

    found = 0
    try:
        info = os.stat(filename)
        found = 1
    except OSError:
        pass

    if not found:
        print '--> changed/DNE'

        entry = server.diary.get('titus', i)
        open(filename, 'w').write(entry)
        os.utime(filename, (created, modified,))
    else:
        print 'unchanged.'

The only tricky part was figuring out how to extract the created/modified values from the xmlrpc module, which required some digging...
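In case it saves someone the same digging: the dates come back wrapped in an XML-RPC DateTime object, and the raw timestamp string sits in its `.value` attribute. Here is a minimal sketch using Python 3's `xmlrpc.client` (the successor to `xmlrpclib`). The `to_epoch` helper is a made-up name, and the DateTime below is built by hand rather than fetched from the server.

```python
import time
import xmlrpc.client

FORMAT = "%Y%m%dT%H:%M:%S"

def to_epoch(stamp):
    """Convert an XML-RPC DateTime to seconds since the epoch.

    time.mktime interprets the parsed fields as local time.
    """
    return time.mktime(time.strptime(stamp.value, FORMAT))

# Hand-built stand-in for what a getDates-style call would return.
stamp = xmlrpc.client.DateTime("20050126T18:11:00")
seconds = to_epoch(stamp)
```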


p.s. apologies for the indentation; the actual HTML is properly indented, but it looks like advogato does funny things to it.

25 Jan 2005 (updated 25 Jan 2005 at 19:25 UTC) »

Hey, salmoni, major congrats! I hope to defend RSN, myself...

Rails:Ruby :: Zope:Python

(just five years later ;)

I love hype; back when, I remember seeing people post "I want to write a Web page in Python, what do I use?" and getting "Zope is the answer!" as the response. Given Zope's size, complexity, and overwhelming amount of functionality, I didn't think this was the right direction to point Python newbies. It all seems to have worked out in the end, though: there's now a diverse and healthy population of Web frameworks available for Python, and you can choose what you want. It may be more confusing for newbies than the old "use Zope" advice, but the flip side is that there are now several good options and some healthy competition in the Python Web space: you can have it your way.

Now Ruby on Rails shows up & people freak out, with the Ruby people claiming that "this can't be done in Python" and the Python people saying a variety of polite and impolite things in response. ...and then there are the users complaining about lack of documentation, which was my beef with Zope back in the Dark Ages.

I have yet to see any evidence that Python and Ruby are substantively different, but I haven't looked much beyond c2.com. I tend to like Ian's take on things, and he thinks it's less a language thing than an app/integration thing. It sounds like rewinding the clock by 5 or 10 years and building a leaner, meaner Zope with Python 2.2 would result in similar advantages for Python: One True App Framework. (It is an interesting (if odd) take on things to say that we would be better off with less choice! Smacks of B&D to me.)

Instead, we're stuck with a multitude of Web/app frameworks in Python, which people seem to think is a problem. Hey, folks, maybe (as with GUIs) there's more than One Right Way to do this? Just a thought...

It is clear that some people just can't take a joke ;).

PostgreSQL 8.0 & ORMs

PostgreSQL 8.0 now supports savepoints. Back when, I argued that any standard persistence framework for Python shouldn't require functionality not available in at least one good open-source SQL database; my main complaint here was that savepoints somehow became an issue in the Persistence-SIG discussions.
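For anyone who hasn't used them: a savepoint marks a spot inside a transaction that you can roll back to without abandoning the whole transaction. A rough sketch follows. It uses Python's bundled sqlite3 module as a stand-in so it runs without a database server (a convenience assumption, not PostgreSQL itself), and the table and savepoint names are invented for the example.

```python
import sqlite3

# isolation_level=None leaves transaction control entirely to our SQL.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE notes (body TEXT)")

cur.execute("BEGIN")
cur.execute("INSERT INTO notes VALUES ('kept')")
cur.execute("SAVEPOINT risky")                  # mark a point to return to
cur.execute("INSERT INTO notes VALUES ('discarded')")
cur.execute("ROLLBACK TO SAVEPOINT risky")      # undo only the work after the mark
cur.execute("COMMIT")                           # the first insert survives

rows = [row[0] for row in cur.execute("SELECT body FROM notes")]
```

After the commit, only the row inserted before the savepoint remains.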

It may be time to dust off cucumber, which was developed before Python 2.2, and update it. cucumber is my ORM linking Python to PostgreSQL with class inheritance based on PostgreSQL's table inheritance. I've been using it for 3 years, and one or two other people discovered it & used it for a bit. The only two complaints I've received are that it's slow (well, OK, yeah...) and that it defeats introspection. I think the former is insoluble and the latter could probably be remedied with metaclasses, although I'm not sure.

I'm still quite happy with cucumber, and it's saved me hundreds of hours of programming. I'm still wedded to the idea of having an SQL interface for my data, and when you add in the great feature of table inheritance (which seriously reduces ORM impedance mismatch at the expense of tying you to PostgreSQL) I don't think the idea can be beat. (My code is another matter...)
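To make the table-inheritance idea concrete, here is a small sketch of how a Python class hierarchy could be mirrored by PostgreSQL's INHERITS clause. This is not cucumber's actual API; the `Table` base class, its `columns` attribute, and the example classes are all hypothetical, and the code only builds the SQL strings rather than talking to a database.

```python
class Table:
    """Hypothetical base class; subclasses list only their own columns."""
    columns = {}

    @classmethod
    def table_name(cls):
        return cls.__name__.lower()

    @classmethod
    def create_sql(cls):
        cols = ", ".join("%s %s" % (name, sqltype)
                         for name, sqltype in cls.columns.items())
        # Direct Table subclasses among our bases become INHERITS parents.
        parents = [base.table_name() for base in cls.__bases__
                   if issubclass(base, Table) and base is not Table]
        sql = "CREATE TABLE %s (%s)" % (cls.table_name(), cols)
        if parents:
            sql += " INHERITS (%s)" % ", ".join(parents)
        return sql + ";"


class City(Table):
    columns = {"name": "text", "population": "integer"}


class Capital(City):
    # Only the new column; name and population come from the parent table.
    columns = {"state": "char(2)"}


city_sql = City.create_sql()
capital_sql = Capital.create_sql()
```

The point of the design is that a subclass declares only what it adds, and the database supplies the inherited columns, which is where the reduced impedance mismatch comes from.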

...or I could just switch to using SQLObject, which is becoming a frequently observed keyword in Python fora.


CleverCS posts a cute article about combatting Web spam with TrustRank. Reminds me of Advogato's trust metric... I've been thinking that something similar would work for genome annotation for some time.

Code coverage of C/C++ extension modules for Python: Defeat

I've been (temporarily ;) defeated in my attempts to use gcov to do a coverage analysis of my paircomp tests. All of my tests are written in Python and use my extension API for the C++ library to exercise the C++ code; thus the C++ code exists in shared libraries. Unfortunately, gcov does not natively support shared libraries. Bummer.

First, I tried extracting the __bb_init_func code from libgcc.a and linking that into the shared libraries explicitly; that got rid of the error but didn't seem to actually enable coverage analysis.

My second attempt was to write a short C program that embedded Python in a C++ program into which my extension modules had also been compiled. That worked up to a point -- I got everything to compile, and coverage analysis was started -- but I couldn't import any of the Python extension modules without an error.

I'll sleep on it and see what I come up with... does anyone know of any other C code coverage analysis programs?


Spent most of the day doing biology things, but managed to help clean up a small CGIHTTPServer problem and also contributed a doc patch to distutils to fix my earlier complaints.

A couple of short responses to advogato users:

gnutizen asks about learning good C programming style. My suggestions are:

  • Read other people's code, a lot. Back in the early '90s when I first dove into C, I spent some time getting GNU utilities to compile on both an SGI and a weird BSD/SYSV crossover machine we had. I learned a helluva lot about C programming from that, especially with respect to portability.

  • Fix other people's code, a lot. Ditto above.

  • Work on small parts of some open source project or another. I worked on a conquer-like game called dominion, with a group of pretty good hackers. In the end I think the overall design was lacking, but the nitty gritty of each individual code file was crafted by very experienced C hackers.

  • Read a lot. For large-scale program design, Lakos's C++ book is fantastic; Stevens' book on UNIX Network Programming was a prime source of material for me before that. Books like Pragmatic Programmer and so on offer a lot of advice that seems too obvious to be useful, but is in fact quite useful.

Anyway, that's my 2 cents (FWIW, IMO ;). These days I find myself writing relatively little C++ code, and even less straight C code, but it's incredibly useful for hacking on other people's code.

etrepum says that "Platypus is not what you want for packaging Python applications". Without more of a reason, and never having used Platypus myself, I don't know why. However the page he points me towards contains not only py2app, which looks pretty cool, but also a variety of other very nifty looking Python tools for interaction with OS X.


Sysadminning is annoying & time consuming

I do contract sysadminning for a small lab that only really needs someone to keep an eye on a Linux box with a Web server and an e-mail server. I charge them relatively little, and in turn can tell them that I'm too busy to fix something if necessary. A good trade for a grad student...

Since I switched the system from RH to Debian my life has been much easier, but hardware has a way of stepping in and reminding you who is boss.

Today's doings:

1. reboot to test on-boot install of new USB disk. reboot fails.

2. discover that the problem is in MBR. further discover disk MBR is unfixable, although the data is 99% entirely accessible. (weird...)

3. spend 2-3 hours doing things like wiping the MBR on *all* of the disks and then having to fix partition tables, etc.

4. finally get to the point where the MBR on a separate SCSI disk is booting the right kernel, then running init etc. off of the original disk. system finally fully functional in a rather hacked kind of way.

5. dinner.

6. returning from dinner, back up entire functioning system to two other disks, plus a remote system. (take that, hardware!)

Now I just have to figure out how to best transfer the functioning system off of the occasionally malfunctioning drive and onto a separate Debian install on another drive. I hope it will be as simple as find+diff to locate changed files; I didn't have to change *that* much to begin with...
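The find+diff idea could also be done in Python; a sketch under some assumptions follows. `changed_files` is a made-up name, the two arguments are the roots of a pristine install and the live system, and it only looks at regular files (a real system tree would also need care with symlinks, device nodes, and permissions).

```python
import filecmp
import os

def changed_files(reference_root, live_root):
    """List paths (relative to live_root) that are new or differ in content."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(live_root):
        rel_dir = os.path.relpath(dirpath, live_root)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            ref_path = os.path.join(reference_root, rel)
            live_path = os.path.join(live_root, rel)
            # shallow=False forces a byte-by-byte comparison.
            if (not os.path.isfile(ref_path)
                    or not filecmp.cmp(ref_path, live_path, shallow=False)):
                changed.append(rel)
    return sorted(changed)
```

The resulting list is what would need copying onto the fresh install.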

On the (only) bright side, I get to charge for all of this.

Did anyone else notice how !#%!# cheap those really convenient LaCie USB and Firewire drives are? Wow -- $200 for 250 portable GB.


Fun distutils factoid of the day:

python setup.py clean
doesn't remove your build/lib.* directories, so C++ extensions don't get recompiled. You have to do a
python setup.py clean --all
to force recompilation.

I think this is a documentation bug, since --help-commands says "clean - clean up output of 'build' command" rather than "clean - clean up temp files from 'build' command".

As long as I'm complaining, why does

python setup.py --help
return so little useful information for package installers? You have to run
python setup.py --help-commands
to get a list of actual commands! This is a fine example of behavior built around programmers rather than users, I think ;).

I ran across the 'clean' issue because I have some C++ extension files that depend on a C++ library. I don't know how to make my setup.py care about the modification date of that library file, so my extension files are perennially out of date with respect to my actual library code.
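One possible workaround, sketched under assumptions: compare modification times by hand before building. `is_stale` is a hypothetical helper, and the file names in the usage note are placeholders. (distutils' Extension class also takes a `depends` list of extra files, which may cover this case more directly.)

```python
import os

def is_stale(target, dependencies):
    """True if `target` is missing or older than any existing dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime
               for dep in dependencies
               if os.path.exists(dep))
```

A setup.py could call something like `is_stale("build/.../_ext.so", ["libfoo.a"])` and, when it returns True, run `build_ext` with `--force` so the timestamps of the extension sources are ignored.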

The help-commands issue is something I run across every time I try to understand the distutils command line options.

...but enough whining. Here's something useful, instead ;). I ran across this cool OS X software today: my friend Nathan blogged about appscript, which together with Platypus makes it easy to build & release simple Python apps for OS X. Very neat!

Two gems from The New Yorker:

Regarding Crichton's new book on the climate change "conspiracy":

What "State of Fear" demonstrates is how hard it is to construct a narrative that would actually justify current American policy. In this way, albeit unintentionally, Crichton has written a book that deserves to be taken seriously.

Regarding Bernie Kerik (Homeland Security ex-nominee):

"Officials have gotten into trouble for sexual misconduct, abusing their authority, personal bankruptcy, failure to file documents, waste of public funds, receiving substantial unrecorded gifts, and association with organized crime figures. It is rare for anyone to be under fire on all seven of the above issues." (Henry Stern)

In other news, haruspex (accurately) characterizes me as "someone-who-doesn't-get-Perl-and-probably-never-will". To be fair, I *did* "get it" back in the mid-'90s... Musta been all those drugs I took in '99 that turned me off of it. I do like this quote from Larry Wall:

Perl isn't really about safety. It's about getting where you're going, and enjoying the trip. It's more important to be a good driver than to have seven feet of sponge rubber all around your car.

I do need to do some Perl work here and there, and the question I have for someone knowledgeable (haruspex?) is this: are there any good guidelines for designing an OO interface in Perl? I've browsed around on the 'net and while it seems possible to do pretty much anything, I don't use Perl enough to know which package(s) are held up as good examples of OO Perl. Any pointers would be much appreciated (& acknowledged)...


On Web testing

Grinder isn't on many of the lists of Web testing tools I've seen, but it seems to be quite mature & gets some good press. Let me know if you try it and like it.

Charlie Stross & Perl

One of my favorite new sci-fi authors is a guy named Charles Stross. He rivals Iain Banks for plots that are turned 2 degrees to normality, and is a hacker/sysadmin by trade. It was therefore distressing to read his take on Perl:

... then along comes Randal or Tom or one of the other Perl Gods, and they deliver a half-line-long command that resembles line noise, is three times as efficient as the other solutions, and leaves you scratching your head.

Apparently this is a desirable feature of the language!?

Anyway, I have to admit he wrote the best description of Zope I've ever seen...


p.s. The Atrocity Archives is an amusing blend of a Cthulhu-like mythos and UNIX sysadminning. Let's just say that LARTing takes on a whole new meaning...

11 Jan 2005 (updated 11 Jan 2005 at 08:49 UTC) »
paircomp 0.9 (rc)

hooray. docs, tar.gz. It's only a small & simple comparative sequence analysis library for DNA, but it's been broken for about a month. (Asinine data structure, refactored yo' ass.)

Summary: complete reimplementation, now with regression testing. C++ library completely rewritten using the STL. 95% of the code is now tested via the Python API in one big mongo test script.

One more brick in the wall...

Ryan Tomayko comments on GPL vs the Python community. My work-related libraries are LGPLed, and my GUIs and Web interfaces are GPLed. Why? I'm an academic programmer, and my code is owned by Caltech. Neither I nor Caltech depend on income from these programs. However, I do intend to take them from job to job, and the GPL protects that. The L/GPL also protects Caltech. win/win.

