I'll admit it; I never really got the point of why anyone would use XML until very recently. Every experience I've ever had involving someone wanting to use XML generally involved some crazed XML zealot going on about how we really needed to use some twisted, bizarre functionality of XSL, and somehow this was supposed to be good design practice. I always regarded the people who love XML as being just the same as the people who love PHP.

Don't get me wrong here. You can use PHP to write some very wonderful things. In fact, I used to use PHP quite a bit back when PHP v2 was the in-vogue web technology, before I was indoctrinated into the world of Perl. What's wrong with PHP, you ask? Being a programmer by nature, I have a very hard time letting non-programmers close to any of the code I work on (and even certain other programmers at times). On a lot of the projects I work on, I'm expected to play nice with everyone else on the team, and that's just not possible if the joe-blow artist whose responsibility is to make a couple of jpeg images and cook up some HTML is breaking all of the code I've been working on. Code and content do not mix very well.

One of the truly great things about Perl is its data types. I'm not sure why other languages haven't embraced the concept of an associative array the way Perl has with its hash data type (with the possible exception of Python and its dictionaries). About 95% of the code I work on involves loops and hashes.
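To make that idiom concrete, here's a minimal sketch of the loop-plus-hash pattern I mean (the package data here is made up purely for illustration):

use strict;
use warnings;

# A hypothetical package-to-architecture map.
my %arch = (
    perl  => 'noarch',
    gcc   => 'i386',
    glibc => 'i386',
);

# Walk the hash and do something with each entry.
for my $pkg (sort keys %arch) {
    print "$pkg is built for $arch{$pkg}\n";
}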

So when designing the original Fez, I realized right away that something which would be really useful would be some way of scooping data out of a pile of RPMs and storing that information (along with the dependencies, individual files, versions, etc.) in a complex data structure which could be accessed quickly to pull out relevant information. Essentially the data looks something like:

key -> key -> key -> value

which is really nothing more than a complex hash. It started occurring to me, after the umpteenth time I ran into a similar data structure, that there was something to all this stuff. I was using a combination of Sleepycat's very nice Berkeley DB and MLDBM to store all of this data, but with everything stored as binary data, it's a real pain in the ass to actually edit anything. This is, of course, what text files are for.
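For the curious, the storage side looked roughly like this; a minimal sketch assuming MLDBM tied through DB_File with Storable as the serializer (the file name and package data are hypothetical, not from Fez itself):

use strict;
use warnings;
use Fcntl;
use MLDBM qw(DB_File Storable);   # tie via DB_File, serialize with Storable

tie my %packages, 'MLDBM', 'packages.db', O_CREAT | O_RDWR, 0644
    or die "can't tie packages.db: $!";

# MLDBM serializes the whole nested value on each top-level store,
# so you build the structure and assign it in one go.
$packages{perl} = {
    version => '5.6.0',
    deps    => { glibc => '2.1' },
    files   => [ '/usr/bin/perl' ],
};

# Reading back gives the key -> key -> value structure described above.
my $perl = $packages{perl};
print $perl->{version}, "\n";

One gotcha with this approach: nested values can't be modified in place through the tie; you have to fetch the top-level entry, change it, and assign the whole thing back.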

One of the other things I needed for Fez was a way of creating configuration parameters for the software because, as I was saying before, code and content don't mix. Originally I had just made a simple text file with statements like:

key = "value";

which I would read out as a simple hash, but this is a little limiting in that there are namespace problems (any given name should be able to belong to any number of unique sets). So I recently started writing a format which looked something like this:

[foo]
  key = "value";
  [foo2]
    key = "value";
  [/foo2]
[/foo]
and then have a function which populates a complex hash, making it easy to represent a whole lot of data in a plain text file.
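Such a function might look something like this; a minimal sketch assuming exactly the format shown above (the sub name and error handling are mine, not Fez's):

use strict;
use warnings;

sub read_config {
    my ($file) = @_;
    open my $fh, '<', $file or die "can't open $file: $!";

    my %config;
    my @stack = (\%config);                     # innermost open section on top
    while (my $line = <$fh>) {
        if ($line =~ m{^\s*\[/[^\]]+\]}) {      # [/foo] closes a section
            pop @stack;
        }
        elsif ($line =~ m{^\s*\[([^/\]]+)\]}) { # [foo] opens a nested section
            my $section = {};
            $stack[-1]{$1} = $section;
            push @stack, $section;
        }
        elsif ($line =~ m{^\s*(\w+)\s*=\s*"([^"]*)"}) {
            $stack[-1]{$1} = $2;                # key = "value";
        }
    }
    close $fh;
    return \%config;
}

Reading the example above with read_config() would then give you $config->{foo}{key} and $config->{foo}{foo2}{key}, both set to "value".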

The epiphany hit only a short time ago after a co-worker asked me whether or not I was using XML in some of the new code I was working on. He handed me a couple of books (pretty much just Java-and-XML books, of course) which I perused while trying to figure out what the hell he was getting at. The strength of XML doesn't seem to be in the DOM, or XSL, or any of the clutter which makes learning about XML so difficult. The strength is in being able to easily represent a complex data structure as:
<foo>
  <key>value</key>
  <foo2>
    <key>value</key>
  </foo2>
</foo>
which I could then represent in Perl as a hash of hashes. None of this is really an earth-shattering discovery, but what struck me as strange was that in the midst of all this I ran across an article on xml.com, written only a couple of weeks ago, entitled What's wrong with Perl and XML?. There are some 35 different modules in the XML directory on CPAN, some of which do really similar stuff to this (like XML::Dumper, XML::Config, XML::Grove, XML::Registry), but why hasn't any one module become the de facto way of dealing with XML data easily? I guess sometimes even the simplest of ideas can be the most elusive.
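As an illustration of how little code that hash-of-hashes mapping needs, here's a minimal sketch using XML::Simple (my choice of module for the example; it's not one of the modules named above):

use strict;
use warnings;
use XML::Simple;

my $xml = <<'END';
<foo>
  <key>value</key>
  <foo2>
    <key>value</key>
  </foo2>
</foo>
END

# XMLin() maps the document straight onto nested hashes.
my $data = XMLin($xml);
print $data->{key},       "\n";   # "value"
print $data->{foo2}{key}, "\n";   # "value"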

After about a year of neglect, I'm getting fairly close to releasing the new version of Fezbox. RedHat has been driving me insane over the past little while, and the new v7.0 is no exception to the rule. The original Fez was of course written to work with v6.0, but since v6.1 was such a disaster (Kickstart was broken entirely), I hadn't really done anything with it until taking a job at VA Linux and letting them use it for their 'BTOS' (Build to Order Software) system.

You can check out the pre-alpha at this address. Right now there is only a small set of packages from RedHat v6.2; however, now that I've broken my machine by installing v7.0 (where did /usr/dict/words go?), I'll probably index up a bunch of packages to work with it.
