15 Jun 2000 godoy   » (Master)

I've been working on some things related to DocBook and Jade. Dealing with the way JadeTeX works is not pleasant, so I was thinking about implementing a tool based on the ESIS output instead...

The ESIS output seems easier to parse, but I don't know whether handling this kind of information is the best approach. Its structure is simple:

(para Text )para

marks the beginning of a paragraph, the text within it, and the end of the paragraph. The format is simple, but so is this example: structures can get very complex when nested (e.g. a table with spanned rows and columns).
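To make the nesting concrete, here is a minimal sketch in Python of reading such a stream into a tree. It assumes the line-oriented form that nsgmls emits, where each line starts with a one-character command: "(" opens an element, ")" closes it, and "-" carries text. The names Element and parse_esis are made up for illustration, and attribute lines and escape sequences in real output are simply ignored here.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One element of the tree, with children kept in document order."""
    name: str
    children: list = field(default_factory=list)  # Element or str items


def parse_esis(lines):
    """Build a tree from ESIS-style lines (assumed format, see above)."""
    root = Element("#root")  # hypothetical container for the top-level element
    stack = [root]
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        command, rest = line[0], line[1:]
        if command == "(":
            # Start tag: attach a new node and descend into it.
            node = Element(rest)
            stack[-1].children.append(node)
            stack.append(node)
        elif command == ")":
            # End tag: must match the element currently open.
            if len(stack) == 1 or stack[-1].name != rest:
                raise ValueError(f"unexpected close of {rest!r}")
            stack.pop()
        elif command == "-":
            # Character data belongs to the element currently open.
            stack[-1].children.append(rest)
        # Any other command (attributes, etc.) is skipped in this sketch.
    if len(stack) != 1:
        raise ValueError(f"unclosed element {stack[-1].name!r}")
    return root
```

The stack is what copes with nesting: a table inside a section inside an article is just three entries deep, and each close command has to match the top of the stack.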

My problem, while writing this tool, would be writing the backends. They are the most important parts, since they must understand what's being parsed and convert this data to the correct output. I think I'll write two backends to start: DocBook to HTML (HTML has a very similar structure to DocBook, so things should be easy to do) and DocBook to LaTeX (this backend would solve _lots_ of troubles we've been facing here at Conectiva).

I know LaTeX better than I know HTML. I must set aside some time to write that backend. As I've written in prior entries, my working tool will be Perl.
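As a rough idea of what the simplest HTML backend could look like, here is a sketch in Python that walks the same kind of line-oriented stream and looks each element up in a table. The table HTML_TAGS and the function esis_to_html are hypothetical, the mapping is only a guess at reasonable equivalents, and the input format is the same assumption as before: "(" opens, ")" closes, "-" is text.

```python
from html import escape

# Hypothetical mapping from DocBook element names to HTML tags.
HTML_TAGS = {
    "ARTICLE": "div",
    "PARA": "p",
    "EMPHASIS": "em",
    "ITEMIZEDLIST": "ul",
    "LISTITEM": "li",
}


def esis_to_html(lines, tags=HTML_TAGS):
    """Translate ESIS-style lines to an HTML string, one event at a time."""
    out = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        command, rest = line[0], line[1:]
        if command == "(" and rest in tags:
            out.append(f"<{tags[rest]}>")
        elif command == ")" and rest in tags:
            out.append(f"</{tags[rest]}>")
        elif command == "-":
            # Text is always kept, with HTML special characters escaped.
            out.append(escape(rest))
        # Elements missing from the table lose their tags but keep their text.
    return "".join(out)
```

A one-to-one table like this only covers the easy cases. Tables with spanned rows and columns need the backend to keep state between events, and that is where the real work would be. A LaTeX backend could have the same shape with a different table and a different escaping function.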

Another thing that I've been doing is using Mailman and Postfix to manage some very small lists (5 subscribers on our development team). I've found Mailman very easy to use, and it can manage lists that don't require any complex validation of messages. It wouldn't work, for example, for a list that I administer with other people, which has a script to check for empty messages, HTML messages, excessive quotes, etc. As it is a GPL program, we can try to enhance it to allow running some scripts before accepting the post. The problem is that I don't know Python. Ok... There are other guys who could do that (Acme, maybe), or I could find some time to learn...
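For reference, the kind of checks that script makes could be sketched in Python like this. It is a standalone function using only the standard email module, not a Mailman hook, and the name rejection_reasons and the 0.7 quote threshold are made up for illustration.

```python
from email import message_from_string


def rejection_reasons(raw_message, max_quote_ratio=0.7):
    """Return reasons to reject a post; an empty list means accept it."""
    msg = message_from_string(raw_message)
    reasons = []

    # Any text/html part, top-level or nested, counts as an HTML message.
    if any(part.get_content_type() == "text/html" for part in msg.walk()):
        reasons.append("html")

    # Gather the lines of every text/plain part.
    body_lines = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            # Simplification: decode as UTF-8 regardless of declared charset.
            body_lines.extend(payload.decode("utf-8", errors="replace").splitlines())

    text_lines = [line for line in body_lines if line.strip()]
    if not text_lines:
        reasons.append("empty")
    else:
        # Hypothetical rule: too large a share of ">"-prefixed lines.
        quoted = sum(1 for line in text_lines if line.lstrip().startswith(">"))
        if quoted / len(text_lines) > max_quote_ratio:
            reasons.append("excessive-quotes")
    return reasons
```

Note that a message with only an HTML part gets both "html" and "empty", since it has no plain text at all.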

The best achievement I had today was seeing all my messages converted to Gnus NNML folders, with Gnus fetching and filtering them. It was cool!!!

Well, that's it. Time to go write other stuff... Maybe more poetry, who knows?
