11 Jul 2011 caolan   » (Master)

regression testing libreoffice filters

For regression testing LibreOffice filters I’ve now arranged things so that each import filter’s cppunit test comprises of three data dirs, a pass dir, a fail dir and an indeterminate dir. Files in pass must parse without error, those in fail are expected to fail, but fail gracefully by returning an error or throwing an exception, i.e. a crash is a fail on a “fail” test, while “can’t parse” is the expected pass state.

The pass/fail dirs are typically pre-filled in the tree with a small sample of tricky documents which get tested at every build time to ensure they remain working.

indeterminate dirs on the other hand are expected to be empty in the tree, and the cppunit tests don’t care if their contents can be parsed or not, only that they don’t crash. This is really convenient for searching for crashes in a large document collection (horde), given that its an order of magnitude faster than using the full application to load and layout the results.

So I/we can just take a large document horde of e.g. docs and throw them in sw/qa/core/data/ww8/indeterminate and run make -sr in sw and sit back and wait to see if anything in there is a crasher at the parser level. For extra goodness export VALGRIND=memcheck to run the whole lot under valgrind.

FWIW, today anyway

  1. All 3721 attachments of (alleged) mime-type application/msword in openoffice.org’s bugzilla pass without crash when placed into ww8/indeterminate. To be re-run under valgrind later
  2. And all (ok, only 128) attachments of (alleged) mime-type application/msword in freedesktop.org’s bugzilla pass under VALGRIND=memcheck when placed into indeterminate.

I’ve got doc, rtf, qpro, wmf, emf, hwp, lwp and sxw organized and pre-seeded with a sample handful files so far. Plenty more filters than that of course, but .doc is my current focus as the richest vein of available had-bugs-reported documents.

Syndicated 2011-07-11 11:58:27 from Caolan McNamara

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!