11 Apr 2000 eivind   » (Master)

[Updated with reasons why you most likely would want to use Compaq's data sets over Advogato's. This update came as a result of pvg's diary entry implying there was little reason to use Compaq's data sets.]

Ankh mentions using the correlation between evaluations from various people to find out what to present.

This is a field that has seen some study; it is known as "Collaborative Filtering" (or just "Personalization" when the marketeers have been there). Implementations are available from Firefly (now purchased by Microsoft, IIRC - their website is down, so I can't check), Net Perceptions (the commercialized side of GroupLens), and probably a couple of other companies.
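For anyone who wants to see the idea in miniature, here is a rough sketch of one common flavour of collaborative filtering: weight other people's opinions by how well their past ratings correlate with yours. Everything in it is an assumption for illustration: the toy `ratings` table, the names `pearson` and `predict`, and the choice of Pearson correlation with mean-offset weighting. It is not how Firefly, Net Perceptions, or Advogato actually do it.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: score on a 1-5 scale}.
# These are made-up numbers, not EachMovie or Advogato ratings.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 3, "D": 5},
    "carol": {"A": 1, "B": 5, "C": 2, "D": 1},
}


def pearson(u, v):
    """Pearson correlation of two users over the items both have rated."""
    common = set(u) & set(v)
    if len(common) < 2:
        return 0.0  # too little overlap to say anything
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    cov = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    dev_u = sqrt(sum((u[i] - mean_u) ** 2 for i in common))
    dev_v = sqrt(sum((v[i] - mean_v) ** 2 for i in common))
    if dev_u == 0 or dev_v == 0:
        return 0.0  # one user rated everything the same
    return cov / (dev_u * dev_v)


def predict(all_ratings, user, item):
    """Guess user's score for item: the user's own mean rating, shifted by
    how far correlated users rated the item above or below their own means."""
    own = all_ratings[user]
    own_mean = sum(own.values()) / len(own)
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in all_ratings.items():
        if other == user or item not in theirs:
            continue
        weight = pearson(own, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        weighted_sum += weight * (theirs[item] - their_mean)
        weight_total += abs(weight)
    if weight_total == 0:
        return own_mean  # nobody comparable has rated it
    return own_mean + weighted_sum / weight_total


if __name__ == "__main__":
    print(pearson(ratings["alice"], ratings["bob"]))
    print(pearson(ratings["alice"], ratings["carol"]))
    print(predict(ratings, "alice", "D"))
```

In this toy table alice agrees with bob and disagrees with carol, so bob's liking of "D" and carol's dislike of it both push alice's predicted score up. The prediction is not clipped to the 1-5 scale; a real system would want that, plus some minimum overlap before trusting a correlation.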

If you are going to experiment with this, picking up a book on multivariate analysis is probably not a bad idea. You can get free data sets from Compaq for research use. These measure how well a bunch of people have liked films (from EachMovie, while that still ran). They are probably better to use than the Master/Journeyer/Apprentice ratings from Advogato, as I expect "How well did you like that film?" to be more likely to measure the same criterion for each person than the various Advogato ratings. Apart from that, I expect the data sets to be much larger - EachMovie was mass-targeted and seemed quite popular.

Not that this is terribly relevant as a form of diary for me, but given the restrictions of the communications medium... *grin*
