I just noticed that there's a programming contest at SIGMOD this year. The problem is relatively simple and tractable, although there are some interesting wrinkles:
- The data is inserted "online" by 50 concurrent threads, which means there is no opportunity to do offline bulk build/reorganization of the index.
- Solutions need to provide serializability, including avoiding the "phantom problem" (although that shouldn't be too hard: next-key locking should work).
- Solutions are also penalized when they fail to meet a response-time SLA, which makes good performance about more than merely maximizing throughput.