13 May 2011 timj   » (Master)

Wikihtml2man Introduction (aka html2man, aka wiki2man)



What’s this?
Wikihtml2man (also referred to as the html2man or wiki2man converter) is an easy-to-use converter that parses HTML sources, normally originating from a MediaWiki page, and generates Unix manual page sources from them. It allows project documentation to be developed online, e.g. by collaborating in a wiki. It is released as free software under the GNU GPLv3. Technical details are given in its manual page: Wikihtml2man.1.

Why move documentation online?
Google turns up a few alternative implementations, but none seems to be designed as a general-purpose tool. With the ubiquitous presence of wikis on the web these days and the ease of content authoring they provide, we’ve decided to move manual page authoring online for the Beast project. With MediaWiki, manual pages turn out to be very easy to create in a wiki; all that’s then needed is a backend tool that can generate manual page sources from a wiki page. Wikihtml2man provides this functionality based on the HTML generated from wiki pages: it can convert a prerendered HTML file or download the wiki page from a specific URL. HTML was chosen as the input format to support arbitrary wiki features like page inclusion or macro expansion, and to potentially allow page generation from wikis other than MediaWiki. Since wikihtml2man is based purely on HTML input, it is of course also possible to write the manual page in raw HTML, using tags such as h1, strong, dt, dd, li, etc., but that’s much less convenient than using a regular wiki engine.
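
As a rough illustration of the kind of mapping involved, here is a minimal Python sketch that turns the tags mentioned above into roff man macros. It is not wikihtml2man’s actual code: the names ManSketch, BLOCK_MACROS and html_to_man are made up for this example, and the tag-to-macro mapping (h1 to .SH, p to .PP, dt to .TP, li to .IP, strong to bold) is an assumption about a reasonable conversion, not a description of what the tool emits.

```python
from html.parser import HTMLParser

# Hypothetical mapping: roff macro emitted for each block-level HTML tag.
BLOCK_MACROS = {
    "h1": ".SH",          # section heading, text follows on the same line
    "p": ".PP",           # new paragraph
    "dt": ".TP",          # tagged paragraph: the term goes on the next line
    "dd": None,           # description: plain text following the term
    "li": ".IP \\(bu 2",  # bulleted list item
}


class ManSketch(HTMLParser):
    """Toy HTML-to-roff converter (illustration only)."""

    def __init__(self):
        super().__init__()
        self.lines = []    # finished roff lines
        self.buf = []      # text collected for the current block
        self.block = None  # block-level tag currently open, if any

    def flush(self):
        # Collapse whitespace and emit the collected text for this block.
        text = " ".join("".join(self.buf).split())
        self.buf = []
        if not text:
            return
        if self.block == "h1":
            self.lines.append(".SH " + text)
            return
        macro = BLOCK_MACROS.get(self.block)
        if macro:
            self.lines.append(macro)
        # A text line starting with "." or "'" would be read as a macro.
        if text[0] in ".'":
            text = "\\&" + text
        self.lines.append(text)

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_MACROS:
            self.flush()
            self.block = tag
        elif tag == "strong":
            self.buf.append("\\fB")  # switch to bold font

    def handle_endtag(self, tag):
        if tag in BLOCK_MACROS:
            self.flush()
            self.block = None
        elif tag == "strong":
            self.buf.append("\\fR")  # back to roman font

    def handle_data(self, data):
        data = data.replace("\\", "\\e")  # escape literal backslashes
        if self.block == "h1":
            data = data.upper()  # man section titles are conventionally upper case
        self.buf.append(data)


def html_to_man(html, name="EXAMPLE", section=1):
    """Return roff source for the given HTML fragment."""
    parser = ManSketch()
    parser.feed(html)
    parser.close()
    parser.flush()  # emit any trailing text outside a block
    return "\n".join([".TH %s %s" % (name, section)] + parser.lines) + "\n"
```

Writing the result of html_to_man() to standard output and piping it through man -l - is enough to preview it. A real converter has to deal with much more (nested lists, links, preformatted text, wiki-specific markup), which this sketch leaves out.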

What are the benefits?
For Beast, the benefits of moving some project documentation into an online wiki are:

  • We increase editability by lifting review requirements.
  • We are getting quicker edit/view turnarounds, e.g. through use of page preview functionality in wikis.
  • We allow assimilation of user contributions from non-programmers for our documentation.
  • Easier editability may lead to richer, better and more frequently updated documentation.
  • Other projects also seem to make good progress by opening up some development parts to online web interfaces, like: Pootle translations, Transifex translations or PHP.net User Notes.

What are the downsides?
We have only recently moved our pages online and still need to gather some experience with the process. So far, the possible downsides we see are:

  • Sources and documentation can more easily get out of sync if they don’t reside in the same tree. We hope to mitigate this by updating the documentation more frequently.
  • Confusion about revision synchronization, with the source code using a different versioning system than the online wiki. We are currently pondering automated re-integration into the tree to counteract this problem.

How to use it?
Here’s wikihtml2man in action, converting its own manual page and rendering it through man(1):

  wikihtml2man.py http://testbit.eu/Wikihtml2man.1?action=render | man -l -
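
The ?action=render suffix in the command above asks MediaWiki for just the rendered page body, without the surrounding site skin. Here is a small sketch of adding it to an arbitrary page URL; render_url is a hypothetical helper written for this example and is not part of wikihtml2man:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def render_url(page_url):
    """Return page_url with its query set to request MediaWiki's action=render."""
    parts = urlsplit(page_url)
    # Keep existing query parameters, but drop any previous "action" value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "action"]
    query.append(("action", "render"))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The returned URL can then be passed to wikihtml2man.py in place of a hand-written one.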

Where to get it?
Release tarballs shipping wikihtml2man are kept here: http://dist.testbit.eu/testbit-tools/.
Our Tools page contains more details about the release tarballs.

Have feedback or questions?
If you can put wikihtml2man to good use, have problems running it, or have other ideas about it, feel free to drop me a line. Alternatively, you can add your feedback and any feature requests to the Feature Requests page (a forum will be created if there’s any actual demand).

What’s related?
We would also like to hear from other people involved in projects that are using or considering wikis to build production documentation online (e.g. in a manner similar to Wikipedia). So feel free to leave a comment about your project if you do something similar.

See Also

  1. New Beast Website – using html2wiki
  2. The Beast Documentation Quest – looking into documentation choices

Syndicated 2011-05-12 23:49:23 from Tim Janik
