monkeyiq is currently certified at Journeyer level.

Name: Ben Martin
Member since: 2001-11-04 13:15:24
Last Login: N/A


Homepage: http://witme.sourceforge.net/libferris.web/

Notes:

Save Ferris!

libferris is a virtual filesystem / semantic data manager. If you want to mount XML, Evolution, Firefox or PostgreSQL as a filesystem then it's all been done before ;) Ego is the file manager / data interaction tool.

An interesting experiment (though truly a sad result): I have 90+ Freshmeat subscribers to the libferris project. Assuming that half of them are starving students on peasant-level income, and that I'm subscribed to the project as well, that leaves 40+ folks who could possibly make a $10-$20 donation to the libferris project.

Now surely that money is not much, and it is laughable compared to what an experienced C++ coder with similar qualifications to mine could earn commercially on closed source. But relatively speaking, the potential $500/yr that a small donation from half my subscriber base would bring would make a real difference to my current situation. It also gives some warm fuzziness that folks are getting what libferris is about.

/me waves money pan.

Projects

Articles Posted by monkeyiq

Recent blog entries by monkeyiq

Syndication: RSS 2.0

libferris and ego meet Google Earth. The new release 1.1.96 allows nice integration with Google Earth, letting you do desktop searches from within Google Earth as well as show where a file is located by clicking on it in ego.

See the top of this page for a screen animation.

Desktop search?

I noticed on Planet KDE this post on desktop search. It mentioned not using xattr for metadata because some filesystems don't support it. I'd say that most filesystems don't: iso filesystems, NFS (depending on setup), http, ftp and xemacs don't. The simple solution to all this in libferris is to virtualise xattr just like the filesystem itself is virtual: you store the xattr in RDF when the underlying filesystem doesn't allow it.
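
To make the idea concrete, here is a minimal sketch of that fallback pattern. This is not libferris' actual code: RdfSideStore and setVirtualXAttr are hypothetical names used only for illustration, and the real store would be an RDF model rather than a std::map. The point is just: try the kernel xattr call first and, when the filesystem refuses, persist the attribute out of band.

    #include <sys/xattr.h>
    #include <cerrno>
    #include <map>
    #include <string>
    #include <utility>

    // Stand-in for the RDF store: (path, attribute) -> value.
    struct RdfSideStore
    {
        std::map< std::pair<std::string,std::string>, std::string > triples;

        void set( const std::string& path, const std::string& name,
                  const std::string& value )
        {
            triples[ std::make_pair( path, name ) ] = value;
        }
    };

    bool setVirtualXAttr( RdfSideStore& store,
                          const std::string& path,
                          const std::string& name,
                          const std::string& value )
    {
        // Native attempt: fine on ext3/xfs with user_xattr, refused on
        // iso9660, many NFS setups, and anything that isn't a kernel fs.
        if( setxattr( path.c_str(), name.c_str(),
                      value.data(), value.size(), 0 ) == 0 )
            return true;

        if( errno == ENOTSUP || errno == EPERM )
        {
            // Filesystem can't hold the EA: keep it in the side store
            // instead, so the attribute survives regardless of the fs.
            store.set( path, name, value );
            return true;
        }
        return false;
    }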

I should also highlight that the tagging mentioned in the post that the above post references is already available and usable with libferris :) You can attach arbitrary metadata to virtual filesystem objects, index it, and search based on that metadata. Indexing can be done with many backends: Lucene, PostgreSQL, RDF using Redland (db4, SQLite, PostgreSQL), or an LDAP server.

For indexing see here. For EA stuff here.
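
Just as an illustration of "search based on that metadata", and again using the hypothetical side store from the sketch above rather than libferris' real indexing machinery (which goes through the Lucene/PostgreSQL/Redland/LDAP backends mentioned earlier), a query then amounts to walking the stored triples:

    // Hypothetical continuation of the sketch above: find every path whose
    // given attribute carries a given value.
    #include <vector>

    typedef std::map< std::pair<std::string,std::string>, std::string > TripleMap;

    std::vector<std::string> findByAttribute( const TripleMap& triples,
                                              const std::string& name,
                                              const std::string& value )
    {
        std::vector<std::string> hits;
        for( TripleMap::const_iterator iter = triples.begin();
             iter != triples.end(); ++iter )
        {
            if( iter->first.second == name && iter->second == value )
                hits.push_back( iter->first.first );   // the matching path
        }
        return hits;
    }

    // e.g. after setVirtualXAttr( store, "/media/cdrom/photo.jpg", "user.tag", "holiday" ):
    //   std::vector<std::string> holidayFiles =
    //       findByAttribute( store.triples, "user.tag", "holiday" );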

FUSE.sf.net meets libferris

Yes folks, it's true: you can now mount libferris through the kernel using FUSE. The goodness of your xemacs session becoming a kernel filesystem, mounting Firefox through ferris and FUSE... mmm, filesystems ease the pain.

I'm still trying to get a more advanced article about libferris usage out there. Things are starting to get rather interesting now: the stacked filesystems in libferris, and their ultimate exposure through FUSE, let you do some rather funky things with data that comes from (and returns to) many and varied places.

Fighto time.

Recently a question was posed to me to which I offered a reasonably off-the-cuff response. This led to an interesting debate about whether set<string> would be hugely slower than hash_set<string> for exactly the case where hash_set<> should whip an AVL tree's butt: direct lookups.

So without going into that conversation, I decided to benchmark the two containers from both libstdc++ and STLport 4.x. This is using gcc 4.0.2, which is shameful as I should have a more recent gcc; I'll likely rerun it on icc and 4.1.x as well.

The core of the code reads strings from cin and shoves them into a std::list. During the set<> parts I build a set from the list (which will have dups) and then iterate the list 50 times, looking up each entry (including dups again) in the built set<> or hash_set<>.

There is of course some cruft there to select the right container from libstdc++ and STLport, because hash_set is non-standard; a sketch of that setup is below, followed by the core loop.
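
For context, here is a rough sketch of that setup for the libstdc++ build. The names here (l_t, use_hash, LOOKUPS) are assumptions rather than the original program's exact ones, and the STLport build differs only in where hash_set comes from. The insert/lookup loops below drop straight into the marked spot.

    #include <iostream>
    #include <list>
    #include <set>
    #include <string>
    // hash_set never made it into the standard; with libstdc++ it lives in
    // __gnu_cxx, which only knows how to hash C strings, so supply a hash
    // for std::string ourselves.
    #include <ext/hash_set>
    using __gnu_cxx::hash_set;
    namespace __gnu_cxx
    {
        template<> struct hash<std::string>
        {
            size_t operator()( const std::string& s ) const
            { return hash<const char*>()( s.c_str() ); }
        };
    }

    typedef std::list<std::string> l_t;
    static const int LOOKUPS = 50;

    int main( int argc, char** )
    {
        bool use_hash = argc > 1;       // "./string_xset 1" selects the hash_set<> run

        l_t l;                          // every word from stdin, duplicates kept
        std::string word;
        while( std::cin >> word )
            l.push_back( word );
        std::cout << "l.size:" << l.size() << " use_hash:" << use_hash << std::endl;

        std::set<std::string> strset;
        hash_set<std::string> hstrset;

        // ... the insert/lookup loops shown below go here ...
        return 0;
    }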

    if( use_hash )
    {
        // Build the hash_set from the word list (dups collapse on insert),
        // then look every word up LOOKUPS times, duplicates included.
        l_t::iterator e = l.end();
        for( l_t::iterator iter = l.begin(); iter != e; ++iter )
            hstrset.insert( *iter );
        for( int i=0; i<LOOKUPS; ++i )
            for( l_t::iterator iter = l.begin(); iter != e; ++iter )
                hstrset.find( *iter );
    }
    else
    {
        // Same workload against the tree-based std::set<>.
        l_t::iterator e = l.end();
        for( l_t::iterator iter = l.begin(); iter != e; ++iter )
            strset.insert( *iter );
        for( int i=0; i<LOOKUPS; ++i )
            for( l_t::iterator iter = l.begin(); iter != e; ++iter )
                strset.find( *iter );
    }

So to the benchmarks, all compiled with -O9 (other gcc options don't seem to make any real difference). I created the input from Project Gutenberg files; l.size is the number of words read. The hash_set methods are quicker for the completely degenerate case of only doing direct lookups and doing each of them at least 50 times per unique word in the input.

Perhaps the most interesting point is the difference in speed between STLport and libstdc++ here. I am now very interested to see how STLport 5.x compares.


    # Using stdc++::set<>
    foo$ time cat /tmp/largetxt.txt | ./string_xset
    l.size:273435 use_hash:0

    real    0m16.980s
    user    0m16.493s
    sys     0m0.028s

    # Using stlport::set<>
    foo$ time cat /tmp/largetxt.txt | ./string_xset_stlport
    l.size:273435 use_hash:0

    real    0m10.184s
    user    0m9.821s
    sys     0m0.084s

    # Using stdc++::hash_set<>
    foo$ time cat /tmp/largetxt.txt | ./string_xset 1
    l.size:273435 use_hash:1

    real    0m4.061s
    user    0m3.868s
    sys     0m0.024s

    # Using stlport::hash_set<>
    foo$ time cat /tmp/largetxt.txt | ./string_xset_stlport 1
    l.size:273435 use_hash:1

    real    0m2.430s
    user    0m2.328s
    sys     0m0.012s

Moving back to blogging here for a while... It seems using university equipment for this is not such an optimal thing: loss of control over the network environment, and so on.

70 older entries...

 

monkeyiq certified others as follows:

  • monkeyiq certified monkeyiq as Master
  • monkeyiq certified edd as Master
  • monkeyiq certified dajobe as Master
  • monkeyiq certified ndw as Master
  • monkeyiq certified miguel as Master
  • monkeyiq certified jamesh as Master
  • monkeyiq certified fejj as Master
  • monkeyiq certified DV as Master
  • monkeyiq certified conrad as Master
  • monkeyiq certified bagder as Master
  • monkeyiq certified walken as Master
  • monkeyiq certified vektor as Master
  • monkeyiq certified campd as Journeyer

Others have certified monkeyiq as follows:

  • monkeyiq certified monkeyiq as Master
  • yakk certified monkeyiq as Journeyer
  • async certified monkeyiq as Journeyer
  • voltron certified monkeyiq as Journeyer
  • mjs certified monkeyiq as Apprentice
  • fxn certified monkeyiq as Journeyer
  • Uraeus certified monkeyiq as Journeyer
  • bjf certified monkeyiq as Journeyer
  • Chicago certified monkeyiq as Journeyer
  • mterry certified monkeyiq as Journeyer
  • dlc certified monkeyiq as Journeyer
  • ataridatacenter certified monkeyiq as Journeyer


