Recent blog entries for fdrake

Well, Guido's post to python-dev made me take a look at Advogato again, and I found that there's now a password-recovery feature. So I'm back!

It's been a long time since I've posted here, and I need to update my "who am I" text as well.

The book I co-authored has finally arrived! Python & XML, from O'Reilly, is now available; it should start to appear in bookstores over the next week. Hopefully you'll find a copy in time for the holidays! It's not like there's anything else to do, now is there? ;-)

Woo hoo! It looks like I'll be talking about HotShot, the new Python profiler, at the next Python conference! I guess I need to get documentation and user-level tools checked into CVS over the next week. I think we're starting to get a grip on how we want the information to be presented; I just need to get the first stab at the code and documentation written. There are many opportunities for tools to be able to extract the information from the profile logs once I get a nicer API built up (currently in progress) and a sample tool or two bundled with Python.

Well! I certainly haven't written anything here in a long time.

I suppose I've been busy, but it's not easy to know just what I've done. I've learned a bit more about Zope and spent time working on a variety of documentation issues for Python. I've done a little bit more on the XML conversion project there, but mostly just maintenance so the preliminary conversion doesn't fall too far behind the actual definition of the markup in the LaTeX version. There have been a few new things added, but not too many.

Lately I've been working on a new profiler for Python, and this one shouldn't be so darn slow. Written entirely in C, HotShot never touches Python code and avoids the slow path calling PyCFunction objects; to the best of my knowledge, it's the first profiler that uses the new profiler/tracer API introduced in Python 2.2. I expect to check the code into the Python CVS repository tomorrow. It shouldn't be too hard to create a coverage tool using the same basic model, and that should be really fast -- the slowest thing about the profiler is getting the time using a system call. I guess the next thing to work on once we have a basic analysis tool will be to get timing information faster.
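To give a feel for the event model a profiler like this hooks into, here is a minimal sketch in pure Python. It is not HotShot and does not use the C-level API; it assumes the `sys.setprofile` hook and `time.perf_counter` as found in current Python, and the names `profile_calls` and `fib` are made up for illustration. Because the hook itself is Python code, this version pays exactly the overhead a C implementation avoids.

```python
import sys
import time
from collections import defaultdict


def profile_calls(func, *args, **kwargs):
    """Run func under a profile hook; return (result, seconds per function name)."""
    totals = defaultdict(float)  # cumulative seconds, keyed by function name
    stack = []                   # start times of the Python frames still running

    def hook(frame, event, arg):
        # "call" and "return" fire for Python functions; the C function
        # events (c_call, c_return, c_exception) are ignored in this sketch.
        if event == "call":
            stack.append(time.perf_counter())
        elif event == "return" and stack:
            started = stack.pop()
            totals[frame.f_code.co_name] += time.perf_counter() - started

    sys.setprofile(hook)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook, even on error
    return result, dict(totals)


def fib(n):
    """Small recursive workload used only to generate call events."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    value, timings = profile_calls(fib, 15)
    print(value, timings)
```

The times are cumulative, so a recursive function is counted once per active frame, and every measurement includes the cost of the hook itself.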

Oh, and I've written a book. Well, part of one anyway; I've never even met my co-author. The book, titled Python & XML, should be out by the end of the year. Watch for it!

Well, well, well... I've been busy lately! I've spent a fair bit of time looking into some of the evolving W3C specifications, especially the XML DOM stuff. It's a little interesting seeing the directions the specs are being pulled in -- not sure of the motivations, sometimes!

My own DOM implementation is growing, but only slowly at this point -- I've worked on my Level 3 Loading module, and have at least a few test cases for non-default behaviors.

I spent a little time this evening working on my scripts for conversion of the LaTeX documentation for Python to XML -- perhaps that's not such a hopeless goal! I think I need to think further about the following topics:

  • The document schema: Specifically, function and method signatures need some attention.

  • Composition: If I go with smaller documents for module references, say one file per Python module, I'll want a composition mechanism that lets me put together a nicely interlinked web of references from a collection of module references and other text. It would be nice to keep a single collection of module references which could accommodate multiple top-level views.

  • Formatting for display: I've started playing with XSLT, and might be able to use that for a lot of the formatting, but I'm not sure yet. I need to learn a lot more about it to manage the hyperlinking between module references in a reasonable way.

  • Typesetting: Yes, I still think this is useful! Perhaps not for the library reference, but certainly for most of the other documents. Displays just aren't good enough for extended reading. I can probably use XSLT to generate LaTeX similar to the current markup, and then use a variant of my current document classes to make it look good.

Not much time to write at the moment, so I'll keep this short.

The new Expat bindings definitely offer some nice functionality, and there will be more to come once I've found time to fix some bugs in Expat itself. There's no way I'll be able to get to that until after the Python conference in Long Beach. Anyway, the latest bindings are present in the Python 2.1 CVS tree and in PyXML 0.6.4 and newer, but you need to have the new Expat library pre-installed for the newer features to be used.
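For anyone who hasn't used the bindings, a minimal sketch of the handler style follows. It assumes the `xml.parsers.expat` module as it ships with current Python, so it shows the general shape of the interface rather than the specific features that were new at the time; `collect_events` is a made-up helper name.

```python
import xml.parsers.expat


def collect_events(xml_text):
    """Parse xml_text and return the parse events as a list of tuples."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver adjacent character data in one piece

    def start(name, attrs):
        events.append(("start", name, dict(attrs)))

    def end(name):
        events.append(("end", name))

    def chars(data):
        # Skip whitespace-only runs to keep the event list readable.
        if data.strip():
            events.append(("text", data))

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events


if __name__ == "__main__":
    for event in collect_events('<doc id="1"><p>hi</p></doc>'):
        print(event)
```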

The weak references implementation for Python has just about been refined to the point where we're happy with it -- I need to implement support for rich comparisons for the proxy objects, but that's all that needs to be done that I'm aware of. The code needs to be exercised more to make sure bugs are shaken out. Martin von Loewis and Neil Schemenauer have both made very valuable contributions to the implementation.
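A short sketch of how weak references and proxies behave from the Python side, assuming the `weakref` module as it exists in current Python. The `Node` class and `demo` function are made up for illustration, and the immediate invalidation shown relies on CPython's reference counting; the `gc.collect()` call is there for implementations that collect later.

```python
import gc
import weakref


class Node:
    """Trivial class; instances of user-defined classes can be weakly referenced."""

    def __init__(self, name):
        self.name = name


def demo():
    """Return (referent_alive, proxy_usable, callback_log) after the referent dies."""
    log = []
    node = Node("root")
    ref = weakref.ref(node, lambda r: log.append("collected"))
    proxy = weakref.proxy(node)

    # While the object is alive, both the reference and the proxy reach it.
    assert ref() is node
    assert proxy.name == "root"

    del node      # drop the only strong reference
    gc.collect()  # not needed on CPython, harmless elsewhere

    alive = ref() is not None
    try:
        proxy.name
        proxy_usable = True
    except ReferenceError:
        proxy_usable = False
    return alive, proxy_usable, log


if __name__ == "__main__":
    print(demo())
```

Once the last strong reference goes away, the reference object returns `None`, the proxy raises `ReferenceError`, and the callback has run.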

Acquisition is still bugging me, but I can wait for another time to talk about that. I really need to work on my talk for IPC9.

3 Feb 2001 (updated 3 Feb 2001 at 22:22 UTC)

Well, I guess I've been busy, 'cause I've not written anything here for a while. I should develop a better habit of it.

For now, we're done with the second alpha release for Python 2.1, so that'll give us a little time to catch up in other work, like the stuff our employer really wants us to do. I've been working on a project for Zope called Parsed XML. There are a handful of us working on it, but I'm mostly working on the DOM core. This project is moving along quite rapidly, and I'm a little skeptical of saying that we're anywhere near ready for calling a release `stable.' For some reason, I doubt the choice will be mine! Regardless, it has been interesting working with acquisition for the first time -- it feels pretty fragile, though I think I understand why it's considered desirable in the context of Zope.

The continuation of the development of James Clark's Expat XML parser has been slow. Clark Cooper and I are working on maintaining Expat and developing it to provide more information via the API so that scripting languages can pick up a lot more (Clark maintains the Perl interface to Expat, and I'm extending the Python binding). We're also building the parser as a dynamically-linked library so that client code doesn't run into problems when multiple components link to the library -- this is a problem that has been noticed in the context of Apache modules, and can easily be found elsewhere as well.

The new Python bindings to Expat will offer much more information to the application builder. I'm using the enhanced bindings in the Parsed XML project and hope to merge them into PyXML and Python after resolving a few more issues.

Python 2.1a2 contains my implementation of weak references, using an approach to invalidation which we think works really well with reference counting. My next task for Python will be to convert the build process for the documentation to use a non-recursive Makefile, similar to what Neil did for the build of the interpreter. It wouldn't take long if I'd just sit down and do it!

Well, Guido sent out the announcement about the move to Digital Creations today, so we can all breathe a sigh of relief:


Checkin mail from SourceForge this week hasn't been working at all, and we haven't been able to get ahold of them to see how we can help. It's really hard to develop anything when we don't know what anyone else is actually getting done, and patches aren't getting reviewed. Hopefully this will get fixed soon, but their MTA seems to be botched. That's what happens when you run sendmail with more traffic than it can handle. ;-(

I have managed to get a lot of tedious stuff done in spite of the SourceForge mail problems and wrapping up talks with our new employer. Most of the Python standard library has been run through Tim Peters's script, which gets rid of hard tabs, trailing whitespace, blank lines at the end of the files, and, most importantly, converts everything to 4-space indents! Now we just need to write the style guide and include it in the standard Python documentation. The XML package documentation is starting to fall into place, and I've written a lot more of the "LaTeX Primer" for the Documenting Python manual.
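To illustrate the kind of cleanup involved, here is a minimal sketch of the whitespace part only. It is not Tim's script: `tidy_source` is a made-up name, it does not re-indent code to 4 spaces (that needs real knowledge of the block structure), and it would also expand tabs inside string literals, so treat it as an illustration rather than a tool.

```python
def tidy_source(text, tabsize=8):
    """Expand hard tabs, strip trailing whitespace, drop trailing blank lines."""
    lines = [line.expandtabs(tabsize).rstrip() for line in text.splitlines()]
    # Remove blank lines at the end of the file.
    while lines and not lines[-1]:
        lines.pop()
    # Every remaining line gets exactly one newline terminator.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    messy = "def f():\n\treturn 1   \n\n\n"
    print(repr(tidy_source(messy)))
```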

The people at iUniverse don't seem to be interested in updating their "on-demand" published Python manuals to recent versions of the documentation, so I'm interested in finding someone who wants to publish short runs or on-demand copies of the most interesting parts of the Python documentation. One of the problems is that few people want to buy printed copies, and the other is that there's a lot of it. Or are people really keen on printing their own?

(If you know of a publisher of open source documentation interested in short-run work, please let me know about them!)

I've started to receive comments from people who want to see the bzip2 version of the documentation kept around, with the suggestion that people would be more likely to switch if they knew how much shorter the download was. I've changed the documentation download page so the file sizes are available; we'll see if that makes a difference.

While between employers, I created a new project on SourceForge; it's called "GPath -- A C Library for Path Algebras." A neat idea; we'll have to see if it's really useful, though!

Enough chatter, I need to walk the dog.

11 Oct 2000 (updated 11 Oct 2000 at 02:56 UTC)

The Python 2.0 release candidate is out, and there's more than ever to do in the documentation. I've been culling through my accumulated email and making all the little changes that seem to be needed just after I've packaged the documentation, and expect to have a lot of the really small nits fixed in time for the final release.

I also really need to buckle down and work on the documentation for the XML package; there's a lot of new material there that needs to be written, but Martin & Paul are working on new text for some of the components most in need of documentation. I need to integrate some changes from them and get a new development copy up soon; I'll push one in an hour or so, but that won't have the new XML documentation in it.

The conversion tool to convert the LaTeX sources needs to be updated to the new XML code (xml.dom.minidom); the DOM object constructors have changed, and there may be some changes related to hackery used to do things like change the names of existing elements. --sigh--
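For context, one hack-free way to "rename" an element with `xml.dom.minidom` is to build a replacement and move everything across. This is a sketch assuming the minidom API in current Python, not the code in the conversion tool; `rename_element` and the element names in the usage example are made up.

```python
from xml.dom import minidom


def rename_element(element, new_name):
    """Replace element with a new element called new_name; return the new node."""
    doc = element.ownerDocument
    replacement = doc.createElement(new_name)

    # Copy the attributes over to the new element.
    attrs = element.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        replacement.setAttribute(attr.name, attr.value)

    # Move the children; appendChild detaches each one from its old parent.
    while element.firstChild is not None:
        replacement.appendChild(element.firstChild)

    element.parentNode.replaceChild(replacement, element)
    return replacement


if __name__ == "__main__":
    doc = minidom.parseString('<section><tt class="x">spam</tt></section>')
    rename_element(doc.getElementsByTagName("tt")[0], "code")
    print(doc.documentElement.toxml())
```

Any references still held to the old element now point at an empty, detached node, which is one reason renaming in place is tempting.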

Today's big accomplishment: I got my INBOX back below 2500 messages! Don't send me any new ones. ;)

Well, the documentation for Python 2.0 is mostly stable, and shows a lot of improvement over the 1.6 documentation. There are still a few things to do to work around LaTeX2HTML problems, but that seems to always be the case. Every LaTeX2HTML problem pushes me that much closer to getting back to the XML conversion.

I've set things up so that I can easily publish the HTML for what's in my working directory at our FTP server, so it'll be easier for people to review the state of the documentation without having to install all the tools needed to build it from the LaTeX sources.
