dajobe is currently certified at Master level.

Name: Dave Beckett
Member since: 2002-07-21 12:45:56
Last Login: 2009-09-04 06:28:36


Homepage: http://www.dajobe.org/


I'm a developer mostly working on metadata applications such as the Resource Description Framework (RDF) but I've also worked on PNG.

I strongly support open standards and freedom to create software for them. I scoff and taunt the fools who think software patents are a good idea.

My most recent project is Redland, a free software/open source RDF library in C with language interfaces in Perl, Python, Tcl, Java and Ruby, along with its related projects Raptor for RDF syntaxes and Rasqal for RDF querying.

In the past I've been involved in promoting PNG and developing tools for it, such as pngmeta.

For my current projects and software, see my home page or my journal where I bookmark things I'm looking at and doing.


Recent blog entries by dajobe


Redland and Raptor Bughunt

All praise valgrind! It finally managed to find where redland was crashing when used with PHP and one class. It was a combination of PHP and SWIG conspiring to do the wrong thing with a null string and NULL object pointer, a configuration error on my part.

This debugging was made much harder by the annoying things that are threads, which seem to be used more and more with useful shared libraries, causing debugging nightmares. Does anyone understand how to get gdb to do the right thing with this? I certainly don't, mostly facing a dead stack trace until the planets align right and it'll let me set a breakpoint in a shared-library that isn't loaded yet, let me run the code and stop at the breakpoint. Bah!

Anyway, onwards. Added some defensive code to try to catch this thing again and in the course of updating the debug code, found the __func__ pseudo-variable in C99 which is handy and can replace a lot of hand-coded bits.
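
As a rough illustration of how `__func__` can replace hand-coded function names in debug output, here is a minimal sketch. The macro and function names (`DEBUG_MSG`, `current_function_name`, `checked_length`) are hypothetical and are not taken from Redland's sources.

```c
#include <stdio.h>

/* Hypothetical debug macro (not Redland's actual code): prefixes each
 * message with the file, line and enclosing function name, so the name
 * no longer has to be typed by hand in every message. */
#define DEBUG_MSG(msg) \
  fprintf(stderr, "%s:%d:%s: %s\n", __FILE__, __LINE__, __func__, (msg))

/* Returns the name of this function, as provided by C99's __func__. */
const char *current_function_name(void)
{
  return __func__;
}

/* Hypothetical defensive check of the kind described above: report and
 * reject a NULL string instead of dereferencing it. */
int checked_length(const char *s)
{
  int n = 0;

  if (!s) {
    DEBUG_MSG("NULL string passed");
    return -1;
  }
  while (s[n])
    n++;
  return n;
}
```

Calling `checked_length(NULL)` writes a line to stderr naming `checked_length` as the reporting function and returns -1, without any function name being hard-coded in the message.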

Things seem to be building ok now, and the new MySQL backend for Redland is looking solid, so it's nearly time for another release. It always seems to take a week to do that, rather than sling the code out the door untested.

Planet RDF

Last week I helped build Planet RDF based on existing code my friends had, with me mostly providing the additional hacking for fixing the mess that is HTML in RSS and the glue to make the thing update. It's looking good.

1 Nov 2003 (updated 1 Nov 2003 at 15:30 UTC) »

Cairo graphics debian packaging

In a fit of enthusiasm last weekend, I made debian packages (debs) of the Cairo project sources from CVS in order to get it building for Mono. Cairo is a vector graphics library for cross-device output intended to be similar to PDF 1.4; it was once called Xr.

I mentioned this to the Cairo people last week and, ta da, I'm now maintaining them and have CVS access. The debs for the cairo snapshots are presently hosted on freedesktop.org before they move to the main site, after some server reorganising. So if you add this area to /etc/apt/sources.list with

deb http://freedesktop.org/~cairo/debian/ ./

you can get a one-line install of the Cairo libraries without dependency hell.


Pushing the stack back to my original goal, Mono is now building for me from CVS. I can't say I've tested the use of Cairo significantly, but Monodoc using GTK# is working. Building Mono from CVS is yet another story, though...

hacking life

And finally, as the #cairo channel on Freenode was discussing the hacker glider logo apparel (proceeds to EFF) I came up with the hacking life slogan for it. I think that works pretty well. The logos were, of course, made using Cairo.

20 Aug 2003 (updated 20 Aug 2003 at 19:48 UTC) »

Redland and Debian Packaging

Phew! My RDF/XML parser raptor (a C library, depends on libxml and cURL) has been in Debian sid/unstable for a few months now so it was time to attempt the big beast, redland. That's my main RDF system and as well as the C library, it has 6 other language interfaces - perl, python, ruby, java, tcl, php and I'm working on CLI/C#.

This list has felt a bit daunting for me to deal with, however after the raptor experience I was confident the C library part would at least be straightforward. Over the last 4 days I've added some of the languages slowly, while studying the mysteries of the Debian perl and python policies. To that I've added waiting for sid/unstable to be buildable so I can use pbuilder to check that I have all the correct dependencies, and decoding various packages to see how they did things.

So the current state is that I've got the C (and -dev), perl, python and ruby debian packages building without error in pbuilder, "lintian clean" and working once installed. The next step is to see about getting them into sid/unstable when my very busy sponsor edd has enough time to check them.

Raptor - a little tune up

I've been doing a bit of raptor tuning to get down the CPU and memory usage on large files. I was always afraid of premature optimisation and knew that things could be improved a lot if I did some profiling and cut down the big problems. The results on a 550,000 triple rdf/xml file went from 172.8s for Raptor 0.9.8 as released to 7.3s with the CVS sources - over 23x faster. The improvement was mostly due to:

  • A lot less strlen() on strings I already had the length for elsewhere
  • Removal of many short-lifetime malloc()/free() pairs (thanks to dmalloc)
  • Using a set for rdf:ID checking, rather than a list.
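
The first and last points above can be sketched together: a small hash set of IDs that takes counted strings, so a length the caller already has is never recomputed with strlen(), and duplicate checking does not walk a list of every ID seen so far. This is only an illustration under assumed names (`id_set`, `id_set_add`, `id_set_clear`); it is not Raptor's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical; this is not Raptor's implementation. */
#define ID_SET_BUCKETS 1024

typedef struct id_node {
  struct id_node *next;
  size_t len;   /* length stored alongside the string */
  char *id;
} id_node;

typedef struct {
  id_node *buckets[ID_SET_BUCKETS];
} id_set;

/* Hash a counted string (djb2-style); the length is passed in. */
static unsigned long id_hash(const char *s, size_t len)
{
  unsigned long h = 5381;
  size_t i;

  for (i = 0; i < len; i++)
    h = h * 33 + (unsigned char)s[i];
  return h;
}

/* Add an ID whose length the caller already knows.
 * Returns 1 if newly added, 0 if already present, -1 on allocation failure. */
int id_set_add(id_set *set, const char *id, size_t len)
{
  id_node **bucket = &set->buckets[id_hash(id, len) % ID_SET_BUCKETS];
  id_node *n;

  /* Only one bucket is scanned, not every ID seen so far. */
  for (n = *bucket; n; n = n->next)
    if (n->len == len && memcmp(n->id, id, len) == 0)
      return 0;

  n = malloc(sizeof(*n));
  if (!n)
    return -1;
  n->id = malloc(len + 1);
  if (!n->id) {
    free(n);
    return -1;
  }
  memcpy(n->id, id, len);
  n->id[len] = '\0';
  n->len = len;
  n->next = *bucket;
  *bucket = n;
  return 1;
}

/* Free every stored ID and reset the set to empty. */
void id_set_clear(id_set *set)
{
  size_t i;

  for (i = 0; i < ID_SET_BUCKETS; i++) {
    id_node *n = set->buckets[i];
    while (n) {
      id_node *next = n->next;
      free(n->id);
      free(n);
      n = next;
    }
    set->buckets[i] = NULL;
  }
}
```

A zero-initialised `id_set` is an empty set. A return value of 0 from `id_set_add()` signals a repeated ID, which is the case an rdf:ID check needs to detect.
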
18 Mar 2003 (updated 18 Mar 2003 at 23:31 UTC) »

Raptor and web libraries

Unlike in Java, Perl, Python and all those higher level languages, in C when you want to do something like retrieve a web page, there is a lot more to do. There aren't stdurl or stdweb libraries around that you can assume are always available. Since raptor is a parser for an XML language, libxml is one likely thing that is usable and it has a tiny HTTP implementation, sufficient for GET. There is the de facto portable web library libcURL, so I made that configurable as well, along with the W3C libwww, which is common but rather large. So problem solved.

Or so I thought. It turns out that all those APIs except for the W3C libwww are push: they take the thread of control from the caller and return data to it via callbacks. However, I wanted the more I/O stream-like pull model, i.e. the user application does while(...) { get stuff; do stuff }. You can wrap a push API around a pull one quite easily and efficiently, but not the other way around: you need to store all the pushed content, then deliver it pull-by-pull. So, I'm going to have to live with that and provide both, warning users that the pull interface will suck up memory.
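
The "store everything, then deliver it pull-by-pull" approach might look roughly like the sketch below. Every name in it (`chunk_handler`, `pull_buffer`, `fake_push_fetch`) is hypothetical: `fake_push_fetch()` merely stands in for a callback-driven fetcher and is not the API of libxml, libcURL or Raptor.

```c
#include <stdlib.h>
#include <string.h>

/* Callback type a push-style fetcher would call once per chunk.
 * Hypothetical; real libraries each define their own signature. */
typedef void (*chunk_handler)(void *user_data, const char *data, size_t len);

/* Holds all pushed content so it can be read back in pieces later. */
typedef struct {
  char *data;
  size_t len;   /* bytes stored */
  size_t cap;   /* bytes allocated */
  size_t pos;   /* next byte to hand out */
  int failed;   /* set if memory ran out while storing */
} pull_buffer;

/* Push side: append one chunk, growing the buffer as needed. */
void pull_buffer_on_chunk(void *user_data, const char *data, size_t len)
{
  pull_buffer *pb = user_data;

  if (pb->failed || len == 0)
    return;
  if (pb->len + len > pb->cap) {
    size_t cap = pb->cap ? pb->cap : 4096;
    char *p;

    while (cap < pb->len + len)
      cap *= 2;
    p = realloc(pb->data, cap);
    if (!p) {
      pb->failed = 1;
      return;
    }
    pb->data = p;
    pb->cap = cap;
  }
  memcpy(pb->data + pb->len, data, len);
  pb->len += len;
}

/* Pull side: copy up to size bytes into out; returns 0 at end of data. */
size_t pull_buffer_read(pull_buffer *pb, char *out, size_t size)
{
  size_t avail = pb->len - pb->pos;
  size_t n = size < avail ? size : avail;

  if (n)
    memcpy(out, pb->data + pb->pos, n);
  pb->pos += n;
  return n;
}

/* Stand-in for a push API: keeps control until every chunk is delivered. */
void fake_push_fetch(chunk_handler handler, void *user_data)
{
  handler(user_data, "<rdf:RDF>", 9);
  handler(user_data, "</rdf:RDF>", 10);
}
```

The application first runs the fetch with `pull_buffer_on_chunk` as the callback and a zero-initialised `pull_buffer` as the user data, then loops on `pull_buffer_read()` until it returns 0. The whole document sits in memory before the first read, which is exactly the cost noted above.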



dajobe certified others as follows:

  • dajobe certified edd as Master
  • dajobe certified ldodds as Journeyer
  • dajobe certified mbp as Master
  • dajobe certified tromey as Master
  • dajobe certified DV as Master
  • dajobe certified ndw as Master
  • dajobe certified gstein as Master
  • dajobe certified Uche as Master
  • dajobe certified vdv as Journeyer
  • dajobe certified wli as Master
  • dajobe certified meebey as Journeyer
  • dajobe certified mhausenblas as Apprentice

Others have certified dajobe as follows:

  • davb certified dajobe as Journeyer
  • ldodds certified dajobe as Master
  • fxn certified dajobe as Journeyer
  • vdv certified dajobe as Journeyer
  • edd certified dajobe as Master
  • sdodji certified dajobe as Master
  • mdupont certified dajobe as Master
  • Cardinal certified dajobe as Master
  • monkeyiq certified dajobe as Master
  • hackery certified dajobe as Master
  • connolly certified dajobe as Master
  • risi certified dajobe as Master
  • meebey certified dajobe as Master
  • ctrlsoft certified dajobe as Master
  • aradub16 certified dajobe as Master
  • pcburns certified dajobe as Master
  • softkid certified dajobe as Master
  • mhausenblas certified dajobe as Master
  • mnot certified dajobe as Master
