24 Mar 2003 dajobe   » (Master)

Raptor - a little tune up

I've been doing a bit of Raptor tuning to reduce its CPU and memory usage on large files. I had held off for fear of premature optimisation, but I knew that things could be improved a lot once I did some profiling and fixed the biggest problems. Parsing a 550,000-triple RDF/XML file went from 172.8s with Raptor 0.9.8 as released to 7.3s with the CVS sources - over 23x faster. The improvement was mostly due to:

  • Far fewer strlen() calls on strings whose length I already had elsewhere
  • Removing many short-lifetime malloc()/free() pairs (found thanks to dmalloc)
  • Using a set for rdf:ID checking, rather than a list
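
To make the last point concrete, here is a minimal sketch of a hash set for duplicate-ID checking. It is not Raptor's actual code: the names id_set, id_set_new(), id_set_add() and id_set_free(), the fixed bucket count and the choice of FNV-1a as the hash are all assumptions made for illustration. It also takes the string length as a parameter, so a caller that already knows the length never has to call strlen() again.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed bucket count; must be a power of two for the mask below. */
#define ID_SET_BUCKETS 1024

/* One stored ID. The length is kept next to the bytes so lookups can
 * compare lengths first and never need strlen(). */
typedef struct id_node {
  struct id_node *next;
  size_t len;
  char id[];            /* C99 flexible array: node and string share one malloc() */
} id_node;

typedef struct {
  id_node *buckets[ID_SET_BUCKETS];
} id_set;

/* 32-bit FNV-1a hash over a counted (not NUL-scanned) string. */
static uint32_t id_hash(const char *s, size_t len) {
  uint32_t h = 2166136261u;
  size_t i;
  for (i = 0; i < len; i++) {
    h ^= (unsigned char)s[i];
    h *= 16777619u;
  }
  return h;
}

/* Returns an empty set, or NULL on allocation failure. */
id_set *id_set_new(void) {
  return calloc(1, sizeof(id_set));
}

/* Returns 1 if the ID was newly added, 0 if it was already present
 * (a duplicate), or -1 on allocation failure. */
int id_set_add(id_set *set, const char *id, size_t len) {
  uint32_t b;
  id_node *n;

  assert(set != NULL && id != NULL);
  b = id_hash(id, len) & (ID_SET_BUCKETS - 1);

  /* Only the IDs in one bucket are scanned, not every ID seen so far. */
  for (n = set->buckets[b]; n != NULL; n = n->next) {
    if (n->len == len && memcmp(n->id, id, len) == 0)
      return 0;
  }

  n = malloc(sizeof(*n) + len + 1);
  if (n == NULL)
    return -1;
  memcpy(n->id, id, len);
  n->id[len] = '\0';
  n->len = len;
  n->next = set->buckets[b];
  set->buckets[b] = n;
  return 1;
}

/* Frees every stored ID and the set itself. */
void id_set_free(id_set *set) {
  size_t i;
  if (set == NULL)
    return;
  for (i = 0; i < ID_SET_BUCKETS; i++) {
    id_node *n = set->buckets[i];
    while (n != NULL) {
      id_node *next = n->next;
      free(n);
      n = next;
    }
  }
  free(set);
}
```

With a plain list, every new ID has to be compared against all the IDs seen before it, which is quadratic over a large file. With the set, each check only walks one short bucket chain, so a caller would do something like id_set_add(set, value, value_len) and report a duplicate rdf:ID when it returns 0.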
