1 Mar 2004 awu   » (Journeyer)

New libre source for data mining on large data sets:

CloSpam -- closed frequent pattern mining, released under the University of Illinois/NCSA Open Source License.

Given a database of transactions, we would like to find the patterns that occur frequently in those transactions. For example, if we run a large retailer and maintain terabytes of customer transactions, we would like to find which items are often bought together. In the data mining community, this is known as "frequent pattern mining".
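To make the retailer example concrete, here is a minimal brute-force sketch in Python. The toy transactions, the `frequent_itemsets` name, and the `min_support` threshold are all made up for illustration; this is not CloSpam's code, and it counts every possible itemset, which only works on tiny inputs.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is the set of items in one purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets found in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    result = {}
    # Brute force: try every non-empty combination of items.
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = number of transactions containing every item of the candidate.
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[candidate] = support
    return result

patterns = frequent_itemsets(transactions, min_support=3)
for itemset, support in sorted(patterns.items()):
    print(itemset, support)
```

With a threshold of 3, the only multi-item pattern that survives in this toy data is bread and milk together.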

The number of candidate patterns grows exponentially with the number of distinct items in the database, so it becomes important to avoid checking all of them.
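One well-known way to avoid checking every pattern is the Apriori idea: an itemset can only be frequent if all of its subsets are frequent, so infrequent itemsets never need to be extended. The sketch below illustrates that pruning on made-up data. To be clear, this is a different and simpler technique than CLOSET, and it is not how CloSpam is implemented; the function name and the `checked` counter are just for illustration.

```python
from itertools import combinations

# Hypothetical toy database, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

def apriori(transactions, min_support):
    """Level-wise search that only extends itemsets known to be frequent.

    Returns (frequent, checked): the frequent itemsets with their supports,
    and how many candidate itemsets were actually counted.
    """
    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = sorted(set().union(*transactions))
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    checked = 0
    size = 1
    while candidates:
        checked += len(candidates)
        counts = {c: support(c) for c in candidates}
        level = {c for c, n in counts.items() if n >= min_support}
        frequent.update({c: counts[c] for c in level})
        size += 1
        # Join frequent itemsets of the previous size into larger candidates.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune: keep a candidate only if every subset one item smaller is frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent, checked

frequent, checked = apriori(transactions, min_support=3)
total = 2 ** 4 - 1  # all non-empty itemsets over the 4 distinct items
print(f"counted {checked} of {total} possible itemsets")
```

On four items the saving is small, but the gap widens quickly as the number of distinct items grows.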

CloSpam is a fast implementation of CLOSET, an algorithm described in a recent data mining research paper in SIGMOD by Jian Pei, Jiawei Han, and Runying Mao.
