14 Aug 2003 kai   » (Journeyer)


The following quote is from a document describing a calendaring architecture, but I think it's a good rule for a much broader set of applications.
[Jan Grant] : The system's algorithms should be capable of operating in a mode where instant responses to requests are not required. That is, state transitions should not merge a request and a response.
Get used to writing programs this way. When half of the computers you're talking with are on Mars, it'll all be like this.

Synchronous, control-flow-based languages make this hard, though.
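
To make the rule concrete, here is a minimal sketch in Python of what "not merging a request and a response" could look like. The `Booking` class, its states, and the `outbox` list are all made up for illustration; they are not taken from Jan Grant's document. The point is only that sending the request and handling the reply are two separate state transitions, so nothing blocks in between.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    AWAITING_REPLY = auto()
    CONFIRMED = auto()


@dataclass
class Booking:
    """Hypothetical calendar booking; request and response are separate transitions."""
    state: State = State.IDLE
    outbox: list = field(default_factory=list)  # messages waiting to be delivered

    def request(self, slot):
        # Transition 1: queue the outgoing request and return immediately.
        self.outbox.append(("book", slot))
        self.state = State.AWAITING_REPLY

    def on_response(self, accepted):
        # Transition 2: runs whenever the reply arrives, minutes or days later.
        if self.state is not State.AWAITING_REPLY:
            return  # ignore stray or duplicate replies
        self.state = State.CONFIRMED if accepted else State.IDLE
```

A caller does `b = Booking(); b.request("Mon 10:00")` and carries on with other work; whatever transport delivers the reply later calls `b.on_response(True)`. A blocking `reply = send(request)` call would fuse the two transitions into one, which is exactly what the quote warns against.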


It's always better in the summer :°)


I'm looking for a tool to extract structured data from semi-structured web documents, identifying records (and hopefully fields) by learning extraction rules automatically or semi-automatically from multiple-record web pages. So far I have only found research papers. Can somebody point me to a Free Software tool to start with?

Google seems to become more PDF/CiteSeer-infested every day, but there is (almost) no competition left, sigh.


raph: Don't worry, marketing will coin a cool term, probably something containing "double", "ultra" or so ...


Linux Interface Project: Too hot for coding. Circular treemaps will have to wait.

