Older blog entries for godoy (starting at number 3)

Ok... I've spent the day trying to fix some ugly problems in four of our books (even though only three of them really exist so far... the fourth should be ready on Monday). Jade is definitely not my friend. It took me the whole day to fix these, and even so I still have a huge problem to solve by Monday... I've hacked some Perl code too, but it wasn't really that difficult, so I don't think it deserves a mention here.

I've found another problem this week that deserves some special attention: image scaling. There should be a way to get the physical dimensions of an image and automatically find a scale factor for it. We found that we were using a badly generated EPS file. Now we've corrected all of our scripts and, amazingly, the books are 6 or 7 times smaller than before... :-) As I always say:

Give me enough resources and time, and I can give you exactly what you want

Unfortunately, time is something very scarce.

Back to the scaling problems... This tool should be able to read, at least, EPS2, JPEG and TIFF files and then scale them for printed media if necessary. Its output could be something like "50" for a 50% reduction or "150" for a 50% enlargement. I could take this information and fill in the needed fields with a script of some kind (if you know of any, please drop me a message!).
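A first sketch of such a tool in Perl might look like the following. It assumes ImageMagick's `identify' is available to report pixel widths; the 72 dpi resolution, the usable page width and the helper names are all my own assumptions, not measurements from our books.

```perl
#!/usr/bin/perl -w
# Sketch of the scaling helper described above.  Assumes
# ImageMagick's `identify' is installed; the 72 dpi resolution
# and the 120 mm usable width are assumptions.
use strict;

my $PX_PER_MM    = 72 / 25.4;   # pixels per millimetre at 72 dpi
my $MAX_WIDTH_MM = 120;         # assumed usable text width

# Return the integer percentage ("50" means reduce to half size)
# needed to fit an image of $px pixels into $max_mm millimetres.
sub scale_percent {
    my ($px, $max_mm) = @_;
    my $width_mm = $px / $PX_PER_MM;
    return 100 if $width_mm <= $max_mm;   # already fits
    return int(100 * $max_mm / $width_mm);
}

# Ask identify for the pixel width of an EPS/JPEG/TIFF file.
sub image_width {
    my ($file) = @_;
    my $out = `identify -format '%w' '$file'`;
    chomp $out;
    return $out;
}

if (@ARGV) {
    print scale_percent(image_width($ARGV[0]), $MAX_WIDTH_MM), "\n";
}
```

The output is a bare number, so it could be pasted straight into the scale attribute by another script.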

Another problem is that ImageMagick's convert is somewhat buggy... It doesn't preserve a picture's dimensions after converting it (we have some pictures created in The Gimp at one square centimeter that get very big after being converted to EPS). I should remember to write to its developers; maybe we can help them with some feedback.

No news in the Emacs area, just the upgrade to fix the security bug. I was also helping Andreas to correct our Emacs-related packages.

That's it by now...

I've been doing some stuff related to DocBook and Jade. Dealing with JadeTeX's way of working is not the best thing. I was thinking about implementing some stuff based on the ESIS output...

The ESIS output seems to be easier to parse, but I don't know if handling this kind of information is the best thing to do. The ESIS output has a simple structure to parse:

(para Text )para

means the beginning of a paragraph, the text within it, and the end of the paragraph. Although this example is simple, structures can become very complex when nested (e.g. a table with spanned rows and columns).
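Turning that stream into events is the easy part. In the ESIS that nsgmls emits there is one event per line: a leading `(` opens an element, `-` carries character data, and `)` closes it. A minimal Perl reader, with event names of my own choosing, could be:

```perl
#!/usr/bin/perl -w
# Minimal reader for nsgmls-style ESIS output: one event per line,
# "(GI" opens an element, "-data" is character data, ")GI" closes.
# The [type, value] event pairs are my own convention.
use strict;

# Turn a list of ESIS lines into (type, value) event pairs and
# check that start/end tags nest properly.
sub esis_events {
    my @events;
    my @stack;
    for my $line (@_) {
        my ($code, $rest) = (substr($line, 0, 1), substr($line, 1));
        if ($code eq '(') {
            push @stack, $rest;
            push @events, ['start', $rest];
        } elsif ($code eq ')') {
            my $open = pop @stack;
            die "mismatched $rest" unless defined $open && $open eq $rest;
            push @events, ['end', $rest];
        } elsif ($code eq '-') {
            push @events, ['data', $rest];
        }                    # other codes (attributes etc.) ignored here
    }
    die "unclosed elements" if @stack;
    return @events;
}
```

The nesting check alone already catches the kind of structural breakage that takes a whole day to find in Jade's output.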

My problem, while writing this tool, would be writing the backends. They are the most important parts, since they must understand what's being parsed and convert this data to the correct output. I think I'll start by writing two backends: DocBook to HTML (HTML has a structure very similar to DocBook's, so things should be easy to do) and DocBook to LaTeX (this backend would solve _lots_ of the troubles we've been facing here at Conectiva).
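For the simple cases, the HTML backend could be little more than a table mapping DocBook element names to HTML start and end strings. The three mappings below are an assumed, tiny subset just to show the shape of the idea; a real backend needs far more context (tables, cross-references, and so on):

```perl
#!/usr/bin/perl -w
# Sketch of the table-driven HTML backend idea.  The element
# mappings are an assumed subset, not a complete DocBook backend.
use strict;

my %html = (
    PARA     => ['<p>',  "</p>\n"],
    EMPHASIS => ['<em>', '</em>'],
    TITLE    => ['<h1>', "</h1>\n"],
);

# Take ESIS lines and emit HTML for the elements we know about.
sub to_html {
    my $out = '';
    for my $line (@_) {
        my ($code, $rest) = (substr($line, 0, 1), substr($line, 1));
        if ($code eq '(') {
            $out .= $html{$rest}[0] if exists $html{$rest};
        } elsif ($code eq ')') {
            $out .= $html{$rest}[1] if exists $html{$rest};
        } elsif ($code eq '-') {
            $out .= $rest;
        }
    }
    return $out;
}
```

A LaTeX backend would use the same loop with `\emph{`/`}` style pairs, which is exactly why I want the parser and the backends kept separate.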

My knowledge of LaTeX is better than my knowledge of HTML, so I must allocate some time to write that backend. As I've written in prior entries, my working tool will be Perl.

Another thing that I've been doing is using Mailman and Postfix to manage some very small lists (5 subscribers, on our development team). I've found Mailman very easy to use, and it can manage lists that don't require any complex validation of messages. It won't work, for example, for a list that I administer with other people, which has a script to check for empty messages, HTML messages, excessive quoting, etc. As it is a GPL program, we can try to enhance it to allow running some scripts before accepting a post. The problem is that I don't know Python. Ok... there are other guys to do that (Acme, maybe), or some time to learn...
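The checks themselves are simple; it's hooking them into Mailman (which is Python) that's the open question. Here is how I'd sketch them in Perl, with rule names and a quoting threshold that are purely my own assumptions:

```perl
#!/usr/bin/perl -w
# Sketch of the pre-posting checks mentioned above.  The rule
# names and the 75% quoting threshold are assumptions.
use strict;

# Return a list of complaints about a message body, or an empty
# list if the post looks acceptable.
sub check_body {
    my @lines = @_;
    my @problems;
    my $text = join '', @lines;
    push @problems, 'empty message' if $text !~ /\S/;
    push @problems, 'HTML message'  if $text =~ /<html/i;
    my $quoted = grep { /^\s*>/ } @lines;
    push @problems, 'excessive quoting'
        if @lines && $quoted / @lines > 0.75;   # assumed threshold
    return @problems;
}
```

A wrapper would bounce the post back to the sender with the list of complaints, instead of silently dropping it.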

The best achievement I had today was seeing all my messages converted to Gnus NNML folders, with Gnus fetching and filtering them. It was cool!!!

Well, that's it. Time to go to write other stuff... Maybe more poetry, who knows?

I'm currently working on (re)structuring my team. We've implemented some internal mailing lists (Mailman is cool, and it runs fine with Postfix as the SMTP daemon) and started defining a workflow. Things are moving better now.

The CVS implementation is ok too. I've started developing some scripts to validate SGML files and convert them to PostScript and HTML. These converted files are going to be made available via FTP and the web, so that authors can view their changes when they commit files. I still have to figure out how to deal with pictures... I have to convert them all each time a new picture is added (I don't keep a checked-out copy of every project, because that would require a lot of disk space).
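CVS can run a script from CVSROOT/loginfo on every commit, which is where I'd plug the conversion in. The sketch below only shows the part that decides which committed files trigger a rebuild; the extension list is an assumption:

```perl
#!/usr/bin/perl -w
# Sketch of a commit-time trigger, as CVSROOT/loginfo would run it.
# Only the filtering step is shown; the extension list is assumed.
use strict;

# Given file names from a commit, return those we should rebuild.
sub files_to_convert {
    return grep { /\.(sgml|eps|png|jpe?g)$/i } @_;
}

# In the real hook, each of these would be checked out and fed
# through the SGML -> PostScript/HTML conversion scripts.
for my $f (files_to_convert(@ARGV)) {
    print "convert: $f\n";
}
```

Keeping the filter as its own function means the same logic can be reused once the conversion moves to the dedicated machine.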

I've been thinking about using sockets with Perl to send a notice to another machine telling it to check out these sources and convert them. Our CVS server is going to be on a dedicated machine soon, and we'll need to keep this functionality. I could also work with ncftpput or curl...
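The socket side is only a few lines with IO::Socket::INET: one line over TCP telling the build machine which module to check out. The host, port and message format here are all assumptions; the message builder is kept separate so it can be tested without a network:

```perl
#!/usr/bin/perl -w
# Sketch of the notification idea: one line over a TCP socket.
# Host, port and message format are assumptions.
use strict;
use IO::Socket::INET;

# Build the one-line request; separate so it is testable offline.
sub notify_line {
    my ($module) = @_;
    return "CONVERT $module\n";
}

# Connect to the build machine and send the request.
sub notify {
    my ($host, $port, $module) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "can't reach $host:$port: $!";
    print $sock notify_line($module);
    close $sock;
}
```

A matching listener on the build machine would read one line, check out the module, and run the conversion scripts from the previous entry's commit hook.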

Days are going ok... I've reinstalled this Toshiba laptop (2100 CDT), and I can finally work on the same programs here at Conectiva and at home again.

With chaos' help, I'm doing lots of things in Perl. How could I live without it before? :-) Right now I'm migrating some shell scripts and Makefiles to it, and I now have a much more versatile tool.

I've restarted to work on Conectiva's DocBook packages. After all, I need them to have a sane environment here. :-)
