28 Nov 2004

I have now been able to get the introspector Perl scripts to run on the output of rdfproc, which is part of Redland. All you need to use this now are the Redland libraries, and there are Debian packages for them. You can use many tools on this RDF; take a look at http://librdf.org for more information.

You are going to want these packages for Debian:

librdf-perl - Perl language bindings for the Redland RDF library
librdf0 - Redland RDF Application Framework
librdf0-dev - Redland RDF library development libraries and headers
libraptor1 - Raptor RDF Parser library
libraptor1-dev - Raptor RDF parser and serializer development libraries and headers

Here are some good example data files : c-dump ntriples rdfxml example

These are two forms of RDF, N-Triples and RDF/XML. You can use them with the introspector like this (the example uses the ntriples file):

1. gunzip the file

   gunzip c-dump.rdf.gz

2. make a Redland repository

   rdfproc Global parse ntriples file:/

   Global is the name of the repository; file:/ is the base address, which can be whatever URI you want.

That will create a repository in the current directory using Berkeley DB:

6.2M Global-po2s.db -- predicate-object index (used to find by field)
9.0M Global-so2p.db -- subject-object index (not used)
9.5M Global-sp2o.db -- subject-predicate index (graph traversal)
25M  total

So each index takes up to about 9 MB, 25 MB in total, for a 500k zipped ntriples file.
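As a rough picture of what the three index files are for, here is a minimal in-memory sketch in Python. It is not Redland's on-disk format, and the triples and the ex: names are made up for illustration; only the index names (sp2o, po2s, so2p) come from the files listed above.

```python
from collections import defaultdict

# Hypothetical triples; the "ex:" names are invented for this sketch.
TRIPLES = [
    ("ex:node1", "rdf:type", "ex:function_decl"),
    ("ex:node1", "ex:name", "ex:node2"),
    ("ex:node2", "rdf:type", "ex:identifier_node"),
    ("ex:node3", "rdf:type", "ex:function_decl"),
]


def build_indexes(triples):
    """Build three two-key indexes, each answering for the missing position."""
    sp2o = defaultdict(set)  # (subject, predicate) -> objects
    po2s = defaultdict(set)  # (predicate, object)  -> subjects
    so2p = defaultdict(set)  # (subject, object)    -> predicates
    for s, p, o in triples:
        sp2o[(s, p)].add(o)
        po2s[(p, o)].add(s)
        so2p[(s, o)].add(p)
    return sp2o, po2s, so2p


sp2o, po2s, so2p = build_indexes(TRIPLES)

# "Find by field": which subjects have rdf:type ex:function_decl?
function_decls = sorted(po2s[("rdf:type", "ex:function_decl")])

# "Graph traversal": where does ex:name lead from ex:node1?
name_targets = sorted(sp2o[("ex:node1", "ex:name")])

print(function_decls)
print(name_targets)
```

Every triple is stored three times, once per index, which is one reason the indexes end up larger than the ntriples file they were built from.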

The unpacked sizes are here:

13M  Nov 28 15:34 c-dump.rdf
4.7M Nov 28 15:34 c-dump.ntriples

wc (word count) on c-dump.ntriples gives 96,818 lines, 387,292 words and 4,846,776 chars.
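The wc figures work out to about four words per line, which is consistent with one triple per line: subject, predicate, object and a closing dot. A minimal Python sketch for counting triples per predicate follows. The parser is deliberately simplified (it does not validate URIs or unescape literals), and the sample lines use made-up example.org URIs rather than real introspector output.

```python
from collections import Counter


def parse_ntriple(line):
    """Split one N-Triples line into (subject, predicate, object).

    Returns None for blank lines and comments. Simplified: subject and
    predicate are taken as the first two whitespace-separated tokens,
    and everything up to the final dot is the object.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if not line.endswith("."):
        raise ValueError(f"missing terminating dot: {line!r}")
    body = line[:-1].rstrip()
    parts = body.split(None, 2)
    if len(parts) != 3:
        raise ValueError(f"expected three terms: {line!r}")
    return tuple(parts)


def count_predicates(lines):
    """Count how many triples use each predicate."""
    counts = Counter()
    for line in lines:
        triple = parse_ntriple(line)
        if triple is not None:
            counts[triple[1]] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    '<http://example.org/n1> <http://example.org/type> <http://example.org/function_decl> .',
    '<http://example.org/n1> <http://example.org/name> "main" .',
    '# a comment line',
    '',
    '<http://example.org/n2> <http://example.org/type> <http://example.org/function_decl> .',
]

counts = count_predicates(SAMPLE)
for predicate, n in counts.most_common():
    print(n, predicate)
```

To try it on the real data, pass an open file instead of SAMPLE, for example `count_predicates(open("c-dump.ntriples"))`.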

The original source file, c-dump.i (expanded with headers), has 13,270 lines, 27,221 words and 260,051 chars (254K from ls).

So we are talking about a 10x increase in size for indexing.
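Which ratio comes out depends on the baseline you pick. Here is a quick calculation from the sizes quoted above, assuming the M and K suffixes are the binary units that ls and du print:

```python
KB = 1024
MB = 1024 * KB

# Sizes as quoted in this post.
index_sizes = {
    "Global-po2s.db": 6.2 * MB,
    "Global-so2p.db": 9.0 * MB,
    "Global-sp2o.db": 9.5 * MB,
}
ntriples_size = 4.7 * MB   # c-dump.ntriples, unpacked
source_size = 254 * KB     # c-dump.i, from ls

index_total = sum(index_sizes.values())

ratios = {
    "indexes / ntriples": index_total / ntriples_size,
    "ntriples / source": ntriples_size / source_size,
    "indexes / source": index_total / source_size,
}

for label, value in ratios.items():
    print(f"{label}: {value:.1f}x")
```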

For example, I have installed the introspector into my home dir:

/home/mdupont/EXPERIMENTS/introspector/introspector-0.7

The CVS version is up to date. You can download the release from sf.net.

So, to use it, go to the directory containing the RDF database files and run:

perl -I/home/mdupont/EXPERIMENTS/introspector/introspector-0.7 ~/EXPERIMENTS/introspector/introspector-0.7/recurse5.pl node_types:function_decl file:/

node_types:function_decl is the node type that I am looking for; other interesting ones can be found in the Introspector/GCCTypes.pm file.
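The general idea of picking nodes by type and then following their edges can be sketched in a few lines of Python. This is only an illustration of that idea, not a description of what recurse5.pl actually does; the triples, the ex: names and the rdf:type predicate are all assumptions made up for the example.

```python
from collections import defaultdict, deque

# Hypothetical data: every name here is invented for this sketch.
TRIPLES = [
    ("ex:f1", "rdf:type", "node_types:function_decl"),
    ("ex:f1", "ex:name", "ex:id1"),
    ("ex:id1", "rdf:type", "node_types:identifier_node"),
    ("ex:f1", "ex:type", "ex:t1"),
    ("ex:t1", "ex:returns", "ex:t2"),
    ("ex:v1", "rdf:type", "node_types:var_decl"),
]


def reachable_from_type(triples, node_type, type_predicate="rdf:type"):
    """Return the edges reachable from all nodes of the given type.

    Start nodes are found by (predicate, object); the walk then follows
    outgoing edges breadth-first, visiting each node once.
    """
    outgoing = defaultdict(list)
    starts = []
    for s, p, o in triples:
        if p == type_predicate:
            if o == node_type:
                starts.append(s)
        else:
            outgoing[s].append((p, o))

    seen = set(starts)
    queue = deque(starts)
    edges = []
    while queue:
        node = queue.popleft()
        for p, o in outgoing[node]:
            edges.append((node, p, o))
            if o not in seen:
                seen.add(o)
                queue.append(o)
    return edges


edges = reachable_from_type(TRIPLES, "node_types:function_decl")
for s, p, o in edges:
    print(s, p, o)
```

The `seen` set keeps the walk from looping if the graph contains cycles.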

I hope that you take some time and play around with the introspector. It is not running perfectly, but it is fast!

