Older blog entries for eivind (starting at number 5)

[Updated with reasons why you would most likely want to use Compaq's data sets over Advogato's. This update came as a result of pvg's diary entry implying there was little reason to use Compaq's data sets.]

Ankh mentions using the correlation between evaluations from various people to find out what to present.

This is a field that has seen some study; it is known as "Collaborative Filtering" (or just "Personalization" when the marketeers have been there). Implementations are available from Firefly (now purchased by Microsoft, IIRC - their website is down, so I can't check), Net Perceptions (the commercialized aspect of GroupLens), and probably a couple of other companies.

If you are going to experiment with this, picking up a book on multivariate analysis is probably not a bad idea. You can get free datasets from Compaq for research use. These measure how well a bunch of people liked films (from EachMovie, while that still ran). They are probably better to use than the Master/Journeyer/Apprentice ratings from Advogato, as I expect "How well did you like that film?" to be more likely to measure the same criterion for each person than the various Advogato ratings. Apart from that, I expect the data sets to be much larger - EachMovie was mass-targeted and seemed quite popular.
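
For anyone who wants to play with the idea before downloading anything, here is a minimal sketch in Python of the correlation approach: compute the Pearson correlation between two people's ratings over the films both have rated, then predict an unseen rating from the correlated raters. The function names, the tiny `ratings` table, and the film and user names are all made up for illustration; this is not how Firefly, Net Perceptions, or EachMovie actually do it.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over items both users rated; 0.0 if undefined."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # one user rated everything the same; no signal
    return cov / sqrt(var_a * var_b)


def predict(ratings, user, item):
    """Predict user's rating of item from the other users who rated it.

    Each other user's deviation from their own mean rating is weighted
    by how strongly they correlate with `user`.
    """
    own = ratings[user]
    base = sum(own.values()) / len(own)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = pearson(own, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        num += weight * (theirs[item] - their_mean)
        den += abs(weight)
    return base if den == 0 else base + num / den


# Hypothetical ratings on a 1-5 scale (illustrative data only).
ratings = {
    "alice": {"film_a": 5, "film_b": 3, "film_c": 4},
    "bob": {"film_a": 4, "film_b": 2, "film_c": 3, "film_d": 5},
    "carol": {"film_a": 1, "film_b": 5, "film_c": 2, "film_d": 1},
}

print(pearson(ratings["alice"], ratings["bob"]))
print(predict(ratings, "alice", "film_d"))
```

Note that the prediction is not clamped to the rating scale, and that people who disagree with you consistently (negative correlation) are still informative. Whether ratings from different people measure the same thing is exactly the concern with the Advogato levels above.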

Not that this is terribly relevant as a form of diary for me, but given the restrictions of the communications medium... *grin*

Just saw another smart person release something under the GPL. I really, really need to get my act together and write up a single, coherent paper on how this hurts free software. I've started a couple of times (mostly in e-mail discussions), but never completed it.

Overall conclusion: The GPL only makes sense if you hate commercial software so much that you are willing to hurt free software to hurt commercial software, if you love gratifying your own ego more than you love free software, or if the building blocks you are using force you to use the GPL. There is a long buildup of economics to show this, something I was surprised at when I first discovered it.

Word to the wise: Do not pick up "The Turing Option" (Harry Harrison and Marvin Minsky, 1992) intending to read just the first 50 pages or so. I read the entire 500 pages in one stretch, procrastinating on the things I'd planned to do during those seven hours. That's almost a standard workday (if any such thing exists).

Oh, well, I enjoyed the book.

No new free software stuff today.

I've spent the day doing a bunch of 'environment maintenance' (asking customers to stop having their systems automatically bug me when they don't handle their own problems, fixing minor bugs, etc.) and working on the design of two new frameworks: one to do distributed transaction & message handling, and one to do customizable user interfaces (including the ability to plug new functionality into a user interface, and having multiple versions of a user interface that still use the same backend).

Hairy, but fun. Now: Take the weekend.

Adding deltas on branches in CVSFile didn't work after all. I've fixed it, and added code to let it manipulate branches and symbols. This is enough that it should be possible to start doing fun stuff with it - I'm going to try to produce a merged FreeBSD/NetBSD/OpenBSD CVS repository the next time I get some time to hack.

Still no docs beyond the source code, though.

On a more personal level, a new piano miraculously showed up here without the need to go through the go-out-and-blow-a-lot-of-money routine I'd originally expected to need to get one (my grandmother suddenly called and wanted to buy me one).

This is cool, though the timing is lousy, as I'm moving in a few days. Can't have it all, I guess.

Finally got around to fixing up my Perl module for manipulating RCS/CVS files enough that it might actually be useful.

It is now able to read and write RCS/CVS files, to retrieve the text for any version in the file, and to add new versions on an already existing branch. It is robust enough on the read/write that I couldn't find any errors in a full pass through the FreeBSD repository (reading in every single file, writing it out, and looking through the diffs between the original and the new copy.)
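
To give an idea of what "retrieve the text for any version" involves: an RCS file stores the full text of the head revision on the trunk, and every other revision as an edit script in `diff -n` format (`aN M` adds the following M lines after line N, `dN M` deletes M lines starting at line N, with line numbers referring to the text the script is applied to). Below is a minimal sketch of applying one such script, written in Python rather than Perl. It is not the code from the module; the function name is made up, and it works on lists of lines without worrying about trailing newlines or RCS string quoting.

```python
def apply_rcs_delta(lines, delta):
    """Apply a `diff -n` style edit script to a list of lines.

    `lines` is the source text, `delta` is the script: command lines
    ("a<line> <count>" or "d<line> <count>"), with each "a" command
    followed by <count> lines of text to insert. Returns a new list.
    """
    result = []
    pos = 0  # number of source lines already copied or deleted
    i = 0
    while i < len(delta):
        cmd = delta[i]
        i += 1
        op = cmd[0]
        start, count = (int(n) for n in cmd[1:].split())
        if op == "d":
            # Copy the untouched lines before the range, then skip it.
            result.extend(lines[pos:start - 1])
            pos = start - 1 + count
        elif op == "a":
            # Copy source lines up to and including `start`, then
            # append the new text that follows the command.
            result.extend(lines[pos:start])
            pos = start
            result.extend(delta[i:i + count])
            i += count
        else:
            raise ValueError(f"unknown edit command: {cmd!r}")
    # Whatever is left of the source is unchanged.
    result.extend(lines[pos:])
    return result


# Illustrative data: a head revision and the script leading to another one.
head = ["alpha", "beta", "gamma", "delta"]
script = ["d2 1", "a2 1", "BETA", "a4 1", "epsilon"]
print(apply_rcs_delta(head, script))
```

Getting an arbitrary version is then a matter of starting from the head text and applying the scripts along the path to the revision you want, one after the other.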

Forward steps:

  1. Check that adding deltas on a branch works correctly
  2. Extend the Invariant check to be closer to a complete check of the invariants for the object
  3. Add support for adding new branches
  4. Add more useful accessor methods
  5. Add documentation
  6. Declare project complete, upload it to CPAN, and go for SCCSFile.pm

Preliminary code is at http://www.freebsd.org/~eivind/CVSFile-0.1.tar.gz.
