New libre source for data mining on large data sets:
CloSpam -- closed frequent pattern mining, released under the University of Illinois/NCSA Open Source License.
Given a database of transactions, we would like to find the patterns that occur frequently in those transactions. For example, a large retailer that maintains terabytes of customer transactions would like to know which items are bought together. In the data mining community, this is known as "frequent pattern mining".
The number of possible patterns grows exponentially with the number of distinct items in the database, so it is important to avoid checking every candidate pattern.
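
To make the problem concrete, here is a minimal brute-force sketch in Python. It is not CloSpam's code: the toy transactions, the function name frequent_itemsets, and the min_support parameter are all assumptions made for illustration. It counts every candidate itemset directly, which is exactly the exponential work a real miner has to avoid.

```python
from itertools import combinations

# Hypothetical toy transaction database; each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets found in >= min_support transactions.

    Brute force: with n distinct items this tests all 2**n - 1 non-empty
    candidates, so it is only usable on tiny examples.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            # Support = number of transactions that contain every item.
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                result[candidate] = support
    return result

patterns = frequent_itemsets(transactions, min_support=3)
for itemset, support in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), support)
```

With four distinct items there are only 15 candidates to test, but every additional item doubles that count.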
CloSpam is a fast implementation of CLOSET, an algorithm for mining closed frequent patterns described in a recent SIGMOD data mining research paper by Jian Pei, Jiawei Han, and Runying Mao.
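
A frequent pattern is "closed" if no proper superset of it occurs in the same number of transactions. The sketch below illustrates that definition only. It is not the CLOSET algorithm and not CloSpam's code: it enumerates every frequent itemset by brute force and then filters, which is the expensive step that closed pattern miners are designed to avoid. The toy data and all function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database in which items a, b and c always occur together.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b", "c", "d"},
    {"d", "e"},
    {"d"},
]

def frequent_itemsets(transactions, min_support):
    """Brute-force {itemset: support} for itemsets meeting min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in map(frozenset, combinations(items, size)):
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                result[candidate] = support
    return result

def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support.

    Checking frequent supersets is enough: a superset with the same
    support is itself frequent.
    """
    return {
        p: s
        for p, s in frequent.items()
        if not any(p < q and s == t for q, t in frequent.items())
    }

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
print(len(frequent), "frequent itemsets,", len(closed), "closed")
for itemset, support in closed.items():
    print(sorted(itemset), support)
```

On this toy data the eight frequent itemsets collapse to two closed ones, {a, b, c} and {d}. Every subset of {a, b, c} has the same support as {a, b, c} itself, so reporting the closed pattern alone loses no information.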
