I'm looking for a free software solution to index a large (CD-sized) collection of HTML documents (articles from a monthly news publication spanning a few years). The plan is to ship pregenerated static indexes and all documents in plain HTML (so they are usable everywhere), and then to offer some additional software for boolean queries (by word, date, etc.) and for managing query results. The software must work on Linux, Win9x and above, and MacOS X. Maybe it's possible to develop, reuse or adapt a plugin for IE and Netscape.
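To make the plan a bit more concrete, here is a minimal sketch in Python of a pregenerated word index with boolean queries. It is only an illustration under stated assumptions: the function names (build_index, query_and, query_or), the choice of JSON as the static index format and the sample file names are all made up for this example and do not refer to an existing tool. Date queries and result management are left out.

```python
import json
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text of an HTML document, ignoring the tags.

    Simplification: text inside <script> and <style> is collected too.
    """

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def words_in_html(html):
    """Return the set of lowercase words found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return set(re.findall(r"[a-z0-9]+", text))


def build_index(documents):
    """Build an inverted index: word -> sorted list of file names.

    `documents` (hypothetical input) maps a file name to its HTML text.
    """
    index = {}
    for name, html in documents.items():
        for word in words_in_html(html):
            index.setdefault(word, set()).add(name)
    return {word: sorted(names) for word, names in index.items()}


def query_and(index, *words):
    """Documents containing every given word."""
    if not words:
        return []
    sets = [set(index.get(w.lower(), ())) for w in words]
    return sorted(set.intersection(*sets))


def query_or(index, *words):
    """Documents containing at least one of the given words."""
    result = set()
    for w in words:
        result |= set(index.get(w.lower(), ()))
    return sorted(result)


def save_index(index, path):
    """Write the index once, at CD mastering time, as a static JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(path):
    """Read the pregenerated index back on the user's machine."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

The index would be built once and shipped next to the HTML files; the query side only needs to read the static file, so it could be reimplemented in whatever runs on each target platform.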
If you have any links or ideas on how to achieve this, or know of better places to ask, please let me know either in your diary or by email at guerby@acm.org.
If there is interest, I'll post a front-page article. Thanks for any help!
