Raptor - a little tune up
I've been doing a bit of Raptor tuning to bring down the CPU and memory usage on large files. I had always been wary of premature optimisation, but I knew things could be improved a lot once I did some profiling and cut down the biggest problems. Parsing a 550,000-triple RDF/XML file went from 172.8s with Raptor 0.9.8 as released to 7.3s with the CVS sources, which is over 23x faster. The improvement was mostly due to:
- Far fewer strlen() calls on strings whose length I already had elsewhere
- Removal of many short-lifetime malloc()/free() pairs (thanks to dmalloc)
- Using a set rather than a list for rdf:ID checking
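
To give a feel for the kind of change involved, here is a minimal sketch of a hashed set for duplicate-ID checking. It is not the actual Raptor code: every name in it (id_set, id_set_add and so on) is made up for illustration, and the fixed bucket count and FNV-1a hash are my own assumptions. It touches all three points: the caller passes the length it already knows, so nothing calls strlen(); each entry is one allocation rather than two; and a lookup only walks one short bucket chain instead of the whole list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not Raptor's API. */
#define ID_SET_BUCKETS 1024 /* assumed fixed size to keep the sketch short */

typedef struct id_node {
  struct id_node *next;
  size_t len;  /* length stored once, never recomputed */
  char id[];   /* C99 flexible array: node and string share one malloc() */
} id_node;

typedef struct {
  id_node *buckets[ID_SET_BUCKETS];
} id_set;

/* 32-bit FNV-1a over exactly len bytes; no strlen() needed. */
static uint32_t id_hash(const char *s, size_t len) {
  uint32_t h = 2166136261u;
  size_t i;
  for (i = 0; i < len; i++) {
    h ^= (unsigned char)s[i];
    h *= 16777619u;
  }
  return h;
}

/* Returns an empty set, or NULL if allocation fails. */
id_set *id_set_new(void) {
  return calloc(1, sizeof(id_set));
}

/* Returns 1 if id was added, 0 if it was already present,
 * -1 if memory could not be allocated. */
int id_set_add(id_set *set, const char *id, size_t len) {
  id_node **slot;
  id_node *n;

  assert(set != NULL && id != NULL);
  slot = &set->buckets[id_hash(id, len) % ID_SET_BUCKETS];

  /* Only this bucket's chain is searched; compare lengths first. */
  for (n = *slot; n != NULL; n = n->next) {
    if (n->len == len && memcmp(n->id, id, len) == 0)
      return 0;
  }

  n = malloc(sizeof(*n) + len + 1);
  if (n == NULL)
    return -1;
  memcpy(n->id, id, len);
  n->id[len] = '\0';
  n->len = len;
  n->next = *slot;
  *slot = n;
  return 1;
}

/* Frees every entry and the set itself; NULL is allowed. */
void id_set_free(id_set *set) {
  size_t i;
  if (set == NULL)
    return;
  for (i = 0; i < ID_SET_BUCKETS; i++) {
    id_node *n = set->buckets[i];
    while (n != NULL) {
      id_node *next = n->next;
      free(n);
      n = next;
    }
  }
  free(set);
}
```

A parser using something like this would call id_set_add() for each rdf:ID it sees and report an error when it returns 0. With a plain list, every check scans all the IDs seen so far, so the total cost grows roughly with the square of the number of IDs; with the set, each check stays close to constant as long as the chains stay short.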