13 Oct 2002 follower   » (Journeyer)

Anyone wanna fill in a form?
Wow, filling in forms is fun, wanna try? I'm filling in forms (3) to apply for a student exchange work visa to the USA. It seems that since certain "international incidents" there are now three times as many forms as normal. Although, interestingly enough, one of them you only have to fill in if you're a male aged 18-45.

Still, I'm looking forward to going. Oh, did I mention I'm heading to the USA from the end of November to mid-April? WooHo! I'm finally getting overseas again! Yay!

Oh, did I mention I'm looking for some employment while I'm there? :-) More on that later...

Jabber + Python + Bots
Haven't done too much on this lately. There's an ongoing issue with the translation side of things and Unicode encoding, which is a pain, but I haven't really looked too deeply at it yet. (Too busy filling in forms. :-) )

Oh, well, I mean apart from using my Jabber bot to post all this of course. (Which I'm currently using just because I can. Ooo, gratuitous use of technology...)

Ethereal
Okay, "cool tool|toy" award for this week definitely goes to Ethereal http://www.ethereal.com/ (in addition to the roll-over awards to Python + Jabber). I've been working on an issue at work where the firewall/proxy (MS ISA...) seems to be stopping a certain Java application from functioning correctly... And, of course, I can't get access to anything near the firewall to find out what's going on, so I'm using Ethereal to see what's going back and forth in the firewalled vs. non-firewalled situations. It's possibly overkill as the application is essentially HTTP+text based, but it's definitely helpful. I think I made a breakthrough on Saturday, but I'll probably have to wait until tomorrow to see if that's the case...

Matching text strings in packets with Ethereal
The one issue I hit with Ethereal was that it doesn't seem possible to search data payloads to find arbitrary text strings at arbitrary locations. Because I'm mostly tracking HTTP transactions, this was a bit of a handicap. So, after looking to see if anyone else had written something to do this (Perl pseudo-code was the only thing I found), I decided to put a Python script together to do the job.

The script searches the uncompressed pcap file that Ethereal creates and locates the string you're looking for (if present). Then, in order to tell you the sequence number of the packet, it searches backward to find the start of the IP packet, which is where it works out the sequence number from. I guess I could also have worked through the capture and treated it as packets all along, but this seemed... more expedient. :)
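The first step described above, finding every place the text occurs in the raw capture bytes, might look something like the sketch below. This is not the actual script (which hasn't been published); the function name `find_all` is made up for illustration, and it simply treats the whole capture file as one flat byte string, as the post describes.

```python
def find_all(data, needle):
    """Return every byte offset at which needle occurs in data.

    Overlapping matches are included. Both arguments are bytes objects.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        # Resume one byte further on so overlapping matches are not skipped.
        pos = data.find(needle, pos + 1)
    return offsets
```

To run it against a capture you would read the file in binary mode first, for example `data = open(path, "rb").read()`, where `path` is whatever capture file you saved from Ethereal.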

The issue is, of course, how can you locate the start of an IP packet starting at an arbitrary location within its payload? I suspect someone else has probably written a white-paper on it somewhere, but I figured that since I knew the two IP addresses involved in the transaction and they were located at a known offset within the packet header, I'd use them as my markers. (The advantage being that the likelihood of finding a false positive was pretty low.)
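A rough sketch of the address-marker idea, under a few assumptions: the packets are IPv4 (source address at header offset 12, destination at offset 16), "sequence number" means the TCP sequence number, and the function names `find_ip_header` and `tcp_sequence_number` are invented here rather than taken from the actual script. Since traffic flows both ways, it looks for the two addresses in either order.

```python
import socket
import struct

SRC_OFFSET = 12  # IPv4 source address offset; the destination follows at 16


def find_ip_header(data, payload_pos, addr_a, addr_b):
    """Search backward from payload_pos for the nearest IPv4 header that
    carries addr_a and addr_b (dotted-quad strings) as source/destination,
    in either order. Returns the header's start offset, or None."""
    a = socket.inet_aton(addr_a)
    b = socket.inet_aton(addr_b)
    best = -1
    for marker in (a + b, b + a):
        # rfind gives the last occurrence lying wholly before payload_pos.
        best = max(best, data.rfind(marker, 0, payload_pos))
    if best < SRC_OFFSET:
        return None
    return best - SRC_OFFSET


def tcp_sequence_number(data, ip_start):
    """Read the TCP sequence number of the packet whose IPv4 header
    starts at ip_start."""
    ihl = (data[ip_start] & 0x0F) * 4  # IP header length in bytes
    # The sequence number is a 32-bit big-endian field at TCP offset 4.
    (seq,) = struct.unpack_from("!I", data, ip_start + ihl + 4)
    return seq
```

An eight-byte marker made of two known addresses is unlikely to turn up by chance in the payload, which is the low false-positive property mentioned above.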

(Of course, the other option was to look for the byte '0x06' which signifies the protocol type as TCP in the IP header, and then work back from there to the offset where you expect the IP version type (0x4 in this case) and if you find it you've probably found the header. I suspect there may be greater risk of false positives doing it that way though, plus it'd probably be slower, maybe, not that that really matters. I can probably provide this method as an option tho.)
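For comparison, the protocol-byte alternative could be sketched roughly as follows. Again this assumes IPv4, where the protocol field sits at header offset 9 and the version is the high nibble of the first header byte; `find_ip_header_by_protocol` is a hypothetical name, and nothing like this is known to exist in the actual script.

```python
PROTO_OFFSET = 9   # offset of the protocol field within an IPv4 header
TCP_PROTO = 0x06   # protocol number for TCP


def find_ip_header_by_protocol(data, payload_pos):
    """Walk backward from payload_pos over every 0x06 byte and return the
    first candidate header start whose version nibble is 4, or None."""
    marker = bytes([TCP_PROTO])
    pos = data.rfind(marker, 0, payload_pos)
    while pos != -1:
        start = pos - PROTO_OFFSET
        if start >= 0 and data[start] >> 4 == 4:
            return start
        # Not a plausible header; try the next 0x06 further back.
        pos = data.rfind(marker, 0, pos)
    return None
```

With only one byte and one nibble to match, a stray 0x06 in a port number or checksum can line up with a byte in the 0x40-0x4F range nine bytes earlier, so the false-positive worry looks justified. Checking the header length or checksum as well would tighten it.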

So anyway, having located the sequence number it displays it. Then, when it's found all occurrences it prints out an Ethereal display filter which will display only those packets which contain the matched text string. A copy and paste later, and Ethereal's back to helping again...
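The last step, turning the collected sequence numbers into something to paste into Ethereal, could be as small as the sketch below. It assumes the numbers are TCP sequence numbers and uses the `tcp.seq` display filter field; the exact filter the real script prints isn't shown in this post, and `build_display_filter` is a made-up name.

```python
def build_display_filter(sequence_numbers):
    """Join TCP sequence numbers into one display filter expression,
    dropping duplicates and sorting for a stable result."""
    unique = sorted(set(sequence_numbers))
    return " or ".join("tcp.seq == %d" % n for n in unique)
```

For example, `build_display_filter([200, 100, 200])` gives `tcp.seq == 100 or tcp.seq == 200`.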

Well, it worked for me anyway. :-)

Geeky, moi?
All up, it was a fun exercise, poking through the IP & TCP header specs and semi-reverse-engineering things. (Although that kinda came more later.) Gee, how geeky am I?

Heh, of course, I'm just waiting for the response "Oh, such and such does that, it's here: < insert site here>" or "Crappy method for finding the sequence number, you could just do: < insert one line of Python code here>". :-) But, amongst other things it showed that maybe I did learn something in my networking paper last year after all. And, it helped me find what I'm probably looking for (certain POST requests seem to be getting their responses dropped) at the time I needed to find it.

Oh, yes, my intention is to throw the short script on my site, but since I wrote it on my employer's time I'm going to check with them that they don't mind. (Don't see why they would, small price to pay in exchange for being able to use Ethereal...)

Wow...
That was kinda long, wasn't it? You don't really notice when you're sending it one message at a time... :-)

User info updated
I've also updated my Advogato user info to include my PjBot and NoteTaker projects. (Oh, and I just compiled what I think is my first ever C++ program...)

And, of course, hello to ressu. I think his random url for today is http://www.google.com :)

@20021014
