26 Jan 2001 jmg   » (Master)

Read the Tabs vs. Spaces article that ariya posted. He is right about point 1 being the core of the argument, but the problem is that if you use spaces instead of tabs, it becomes hard for others to read your code. I personally use 8 space tabs because that is what the FreeBSD style(9) guidelines say. It may sound strange to adopt a project's style guide for your own code, but if we could all agree on a single style, then everyone would have fewer issues with this argument. Or you could always switch to Python, which forces style on you.

I personally agree that 8 space tab stops are good. If you ever get so deeply nested that you can't fit your code on one or two lines, then you need to create more functions for that piece of code. The general rule that if you tab in more than three or four times from the base of your function then you need to rethink the function is a good one. If you write with 2 space tab stops, then it's easy to write functions with loops nested about 20 deep (that only puts you halfway across the screen) without even thinking about it. With 8 space tab stops, you'll have issues going beyond 6 nested loops.

I wrote a MARC binary to ASCII conversion program last night, but I won't release it until I split it into functions, because the one big function goes a whole four indents in from the base of the function. For me this is too much, and writing more, smaller functions makes the code easier to read.

Oh well, just ranting a bit about coding style.

Hmmm, should I rant about the whole binary vs. XML for machine exchange thing? The reason systems are getting so bloody slow is that they decided to trade a format that is faster to read for one that is easier for humans to parse. If programmers continue to go for solutions like these, we will continue to need faster computers, but it doesn't have to be that way.

I was impressed with how easy the MARC format was to parse without giving up extra space and without dealing with endianness. To avoid endianness issues, they simply encoded the numbers in base 10 ASCII. Of course, with Python it was almost too easy to parse the "binary" MARC format into a list of dictionaries.
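To make the base 10 ASCII point concrete, here is a rough sketch of that kind of parser. It is not the program described in this entry. It assumes the usual MARC exchange layout (a 24 byte leader, then a directory of 12 byte entries, then the field data), and the names parse_marc_record, parse_marc and build_marc_record are made up for illustration. The builder exists only so the sketch has something to parse.

```python
# Control characters used by the MARC exchange format.
FIELD_END = b"\x1e"
RECORD_END = b"\x1d"


def parse_marc_record(raw):
    """Parse one record (bytes) into a dict mapping tag -> list of field bytes."""
    # Every number in the leader is plain base 10 ASCII, so int() is enough
    # and byte order never comes up.
    record_length = int(raw[0:5])
    base_address = int(raw[12:17])
    if record_length != len(raw):
        raise ValueError("record length in leader does not match data")

    # The directory sits between the leader and the terminator that
    # precedes the field data.
    directory = raw[24:base_address - 1]

    rec = {}
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])   # field length, terminator included
        start = int(entry[7:12])   # offset relative to base_address
        data = raw[base_address + start:base_address + start + length]
        rec.setdefault(tag, []).append(data.rstrip(FIELD_END))
    return rec


def parse_marc(data):
    """Parse concatenated records into a list of dictionaries."""
    records = []
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 5])
        records.append(parse_marc_record(data[pos:pos + length]))
        pos += length
    return records


def build_marc_record(fields):
    """Hypothetical helper: assemble a minimal record to feed the parser."""
    directory = b""
    body = b""
    for tag, data in fields:
        field = data + FIELD_END
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(field), len(body))
        body += field
    base_address = 24 + len(directory) + 1
    record_length = base_address + len(body) + 1
    leader = b"%05dnam  22%05d   4500" % (record_length, base_address)
    return leader + directory + FIELD_END + body + RECORD_END


record = build_marc_record([
    ("245", b"  \x1faTitle"),
    ("650", b" 0\x1faCats"),
    ("650", b" 0\x1faDogs"),
])
parsed = parse_marc(record + record)
print(parsed[0])
```

Subfields and indicators are left as raw bytes here. Splitting them on the 0x1f delimiter would be the next step.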

Now for a bit about Python. I always forget to use try/except instead of if statements when it's more appropriate. One example is when you are adding a data element to a dictionary and you may have duplicate tags. There are a few ways to deal with this: simply start out using lists for all your data elements (which is probably what I should do), or store a single value as-is and convert it to a list once you get more than one. An example of the convert-when-needed approach is:

try:
	rec[tag].append(data)
except KeyError:
	rec[tag] = data
except AttributeError:
	rec[tag] = [rec[tag], data]

The always-use-lists approach would be like:

try:
	rec[tag].append(data)
except KeyError:
	rec[tag] = [data]

Now the latter one in some ways makes more sense, since then you don't have to find out whether a value is a list or not and handle the two cases differently, but it also means a bit of extra work in the case that duplicate tags are the exception rather than the rule.
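To see the difference in what you end up with, here is a small runnable comparison. The function names and the sample tags are invented for the demo, and it assumes the data values are strings (which have no append() method, so the AttributeError branch fires on the first duplicate).

```python
def add_convert_on_demand(rec, tag, data):
    """Store a lone value bare; switch to a list on the first duplicate."""
    try:
        rec[tag].append(data)
    except KeyError:
        # First time we see this tag: store the bare value.
        rec[tag] = data
    except AttributeError:
        # Existing value is a bare string with no append(): wrap both.
        rec[tag] = [rec[tag], data]


def add_always_list(rec, tag, data):
    """Every value is a list, even when there is only one element."""
    try:
        rec[tag].append(data)
    except KeyError:
        rec[tag] = [data]


# Made-up sample data: one unique tag and one duplicated tag.
fields = [("245", "Title"), ("650", "Cats"), ("650", "Dogs")]

mixed, uniform = {}, {}
for tag, data in fields:
    add_convert_on_demand(mixed, tag, data)
    add_always_list(uniform, tag, data)

print(mixed)
print(uniform)
```

With the first function, whoever reads the dictionary has to check each value's type before looping over it. With the second, every value can be iterated the same way. In current Python, rec.setdefault(tag, []).append(data) does the same job as the second function in one line.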

Oh well, enough musing. Now hopefully the 45 GB IBM 75GXP drive I ordered will be waiting for me today when I get home. I was also lucky to get a couple of 128MB PC133 DIMMs for only about $40 each. They were generic stock, but with CAS2 timing. What luck! Of course, I only happen to be using them in PC100 capable hardware, but I'm debating ordering a couple more.
