Older blog entries for jamesh (starting at number 295)

Packaging Python programs as runnable ZIP files

One feature in recent versions of Python I hadn’t played around with until recently is the ability to package up a multi-module program into a ZIP file that can be run directly by the Python interpreter.  I didn’t find much information about it, so I thought I’d describe what’s necessary here.

Python has had the ability to add ZIP files to the module search path since PEP 273 was implemented in Python 2.3.  That can let you package up most of your program into a single file, but doesn’t help with the main entry point.

Things improved a bit when PEP 338 was implemented in Python 2.5, which allows any module that can be located on the Python search path to be executed as a script.  So if you have a ZIP file foo.zip containing a module foo.py, you could run it as:

PYTHONPATH=foo.zip python -m foo

This is a bit cumbersome to type though, so Python 2.6 lets you run directories and zip files directly.  So if you run

python foo.zip

it is roughly equivalent to:

PYTHONPATH=foo.zip python -m __main__

So if you place a file called __main__.py inside your ZIP file (or directory), it will be treated as the entry point to your program.  This gives us something that is as convenient to distribute and run as a single file script, but with the better maintainability of a multi-module program.
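
As a rough illustration of that layout, here is a minimal sketch that builds such an archive with the standard zipfile module and then runs it with the current interpreter.  The file names (hello.zip, greeting.py) and the helpers build_runnable_zip and run_zip are made up for the example; only the __main__.py name is significant.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def build_runnable_zip(path):
    """Write a small two-module program to `path` (hypothetical names)."""
    with zipfile.ZipFile(path, "w") as zf:
        # An ordinary module that lives alongside the entry point.
        zf.writestr(
            "greeting.py",
            "def greet(name):\n    return 'Hello, %s!' % name\n")
        # __main__.py is what the interpreter runs for "python hello.zip".
        zf.writestr(
            "__main__.py",
            "import greeting\nprint(greeting.greet('world'))\n")


def run_zip(path):
    """Run the archive with the current interpreter and return its output."""
    return subprocess.check_output([sys.executable, path]).decode().strip()


if __name__ == "__main__":
    zip_path = os.path.join(tempfile.mkdtemp(), "hello.zip")
    build_runnable_zip(zip_path)
    print(run_zip(zip_path))
```

The import of greeting works because the interpreter puts the ZIP file itself at the front of the module search path before running __main__.py.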

If your program has dependencies that you don’t expect to find present on the target systems, you can easily include them in the zip file alongside your program.  If you need to provide some data files alongside your program, you could use the pkg_resources module from setuptools or distribute.

There are still a few warts with this setup though:

  • If your program fails, the traceback will not include lines of source code.  This is a general problem for modules loaded from zip files.
  • You can’t package extension modules into a zip file.  Of course, if you’re in a position where the target platforms are locked down tight enough that you could reliably provide compiled code that would run on them, you’d probably be better off using the platform’s package manager.
  • There is no way to tell whether a ZIP file can be executed directly with Python without inspecting its contents.  Perhaps this could be addressed by defining a new file extension to identify such files.

Syndicated 2012-05-21 07:31:05 from James Henstridge

NBN talk at PLUG

Earlier in the week, I attended a PLUG discussion panel about the National Broadband Network.  While I had been following some of the high level information about the project, it was interesting to hear some of the more technical information.

The evening started with a presentation by Chris Roberts from NBN Co, and was followed by a panel discussion with Gavin Tweedie from iiNet and Warrick Mitchel from AARNet.

James Bromberger introducing the panel: Chris Roberts (NBN Co), Gavin Tweedie (iiNet) and Warrick Mitchel (AARNet)

One question I had was when they’ll get round to building out the network where I live.  There is a rollout map on the NBN Co site, but it currently only shows plans for works that will commence within a year.  Apparently they plan to release details on the three year plan by the end of this month, so hopefully my suburb will appear in that list.

The NBN is being built on top of three methods of connection: GPON fibre for built-up areas, fixed LTE wireless (non-roaming) for the smaller towns where it is not economical to provide fibre, and satellite broadband for the really remote areas.  All three connection methods provide a common interface to service providers, so companies that provide services over the network are not required to treat the three methods differently.  The wireless and satellite connections will initially run at 12Mb/s down and 1Mb/s up, while fibre connections can range from 25/5 to 100/40 Mb/s (with the higher connection speeds incurring higher wholesale prices).  It should be quite an improvement over the upload speed I’m currently getting on ADSL2.

Chris brought in some sample “User Network Interface” (UNI) boxes that would be used on premises with a fibre connection.  Each provides 4 gigabit Ethernet ports and 2 telephony ports.

The inside of a current generation NBN interface box

Rather than the 4 Ethernet ports being part of a single network as you’d expect for similar looking routers, each port represents a separate service.  So the single box can support connections to 4 retail ISPs, or for any other services delivered over the network (e.g. a cable TV service).  You would still need a router to provide firewall, NAT and wifi services, but since it only requires Ethernet for the WAN port there should be a bit more choice in routers than if you limit yourself to ones with ADSL modems built in.  In particular, it should be easier to find a router capable of running an open firmware like OpenWRT or CeroWRT.

The box also acts as a SIP ATA, where each of the two telephony ports can be configured to talk to the servers of different service providers.

It is also possible for NBN Co to remotely monitor the UNI boxes in people’s houses, so they can tell when they drop off the network.  This means that they have the ability to detect and respond to faults without relying on customer complaint calls like we do for the current Telstra copper network.

Since the NBN is supposed to provide a service equivalent to the current copper telephone network, the UNI box is paired with a battery pack to keep the telephony ports active during blackouts, similar to how a wired telephone draws power from the exchange.  This battery pack is somewhat larger than the UNI box, holding a 7.2 Ah lead acid battery.  At 10W, this can keep the box running for around 8 hours.  The battery pack will automatically cut power before it is completely drained, but has an emergency switch to deliver the remaining energy at the expense of ruining the battery.
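
The 8 hour figure can be sanity checked with a little arithmetic.  The sketch below assumes a 12 V nominal battery voltage, which was not stated in the talk (only the 7.2 Ah capacity and 10W load were), and ignores conversion losses:

```python
def runtime_hours(capacity_ah, voltage_v, load_w):
    """Ideal runtime: stored energy in watt-hours divided by the load."""
    return capacity_ah * voltage_v / load_w


# 12 V is an assumed nominal voltage; 7.2 Ah and 10 W come from the talk.
hours = runtime_hours(7.2, 12.0, 10.0)
print(round(hours, 2))
```

That gives a little over eight and a half hours for a full discharge, so a cut-off some way before the battery is empty is consistent with "around 8 hours".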

Next PLUG Event

If you’re in Perth, why not come down to the next PLUG event on March 26th?  It is an open source themed pub quiz at the Moon & Sixpence.  Last year’s quiz was a lot of fun, and I expect this one will be the same.

Syndicated 2012-03-17 14:47:37 from James Henstridge

pygpgme 0.3

This week I put out a new release of pygpgme: a Python extension that lets you perform various tasks with OpenPGP keys via the GPGME library.  The new release is available from both Launchpad and PyPI.

There aren’t any major new extensions to the API, but this is the first release to support Python 3 (Python 2.x is still supported though).  The main hurdle was ensuring that the module correctly handled text vs. binary data.  The split I ended up on was to treat most things as text (including textual representations of binary data such as key IDs and fingerprints), and treat the data being passed into or returned from the encryption, decryption, signing and verification commands as binary data.  I haven’t done a huge amount with the Python 3 version of the module yet, so I’d appreciate bug reports if you find issues.

So now you’ve got one less reason not to try Python 3 if you were previously using pygpgme in your project.

Syndicated 2012-03-11 15:04:20 from James Henstridge

JavaScript Mandelbrot Set Fractal Renderer

While at linux.conf.au earlier this year, I started hacking on a Mandelbrot Set fractal renderer implemented in JavaScript as a way to polish my JS skills.  In particular, I wanted to get to know the HTML5 Canvas and Worker APIs.

The results turned out pretty well.  Click on the image below to try it out:

Mandelbrot Set Renderer

Clicking anywhere on the fractal will zoom in.  You’ll need to reload the page to zoom out.  Zooming in while the fractal is still being rendered will interrupt the previous rendering job.

All the calculations are done via web workers, so they should not block the UI.  The algorithms used to calculate these types of fractals are easy to parallelise, so it was not particularly difficult to add more workers.  One side effect of this is that the lines of the fractal don’t always get rendered in order.
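
To show why this parallelises so easily, here is a small sketch of the same structure in Python rather than JavaScript.  It is not the renderer's actual code: it uses a thread pool in place of web workers (so it illustrates the structure, not the speed-up), and the function names, image size and coordinate ranges are all chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def escape_count(c, max_iter=50):
    """Iterations of z -> z*z + c before |z| exceeds 2 (max_iter if never)."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter


def render_row(y, width, height):
    """Compute one scanline; no row depends on any other row."""
    im = -1.5 + 3.0 * y / (height - 1)
    row = [escape_count(complex(-2.0 + 3.0 * x / (width - 1), im))
           for x in range(width)]
    return y, row


def render(width=40, height=20, workers=4):
    """Farm the rows out to a pool and collect them as they finish."""
    image = [None] * height
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = [pool.submit(render_row, y, width, height)
                for y in range(height)]
        # Rows are stored in completion order, which need not be top to bottom.
        for job in as_completed(jobs):
            y, row = job.result()
            image[y] = row
    return image
```

Because each job carries its own row number, the result is the same regardless of the order in which the rows come back.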

With Chromium, this maxes out all six cores on my desktop system.  In contrast, Firefox only keeps three cores busy.  As workers are not directly tied to operating system threads, this may just mean that Firefox allocates fewer threads for running workers.  I haven’t tested any other browsers.

Browser technology certainly has progressed quite a bit in the last few years.

Syndicated 2011-03-08 14:15:27 from James Henstridge

Using Mozmill to Test Firefox Extensions

Recently I’ve been working on a Firefox extension, and needed a way to test the code.  While testing code is always important, it is particularly important for dynamic languages where code that hasn’t been run is more likely to be buggy.

I had no experience in how to do this for Firefox extensions, so Eric suggested I try out Mozmill, which has been quite helpful so far.  There were no Ubuntu packages for it, so I’ve put some together in my PPA for anyone interested.

The packages are not quite up to the standard needed to go into Ubuntu yet (among other things, there are no man pages for the various commands), but they do work and shouldn’t eat your system.

Running mozmill tests is pretty easy, and can be done with a command like the following:

mozmill --addons=$PATH_TO_YOUR_EXTENSION \
    --show-errors --test=$PATH_TO_YOUR_TESTS

This will launch an instance of Firefox using a temporary scratch profile that loads your extension, and then run your tests.  The tests will run inside the Firefox instance with the results fed back to the mozmill utility.  When the tests complete, the Firefox instance will exit and the scratch profile will be deleted.

While many of the mozmill tests that Mozilla has written are relatively high level, essentially treating it as a user input automation system, you have full access to Mozilla’s component architecture, so the framework seems well suited to lower level unit testing and functional tests.

Tests are structured as simple JavaScript modules, and use conventions similar (although not identical) to many other xUnit frameworks.  Any function whose name starts with “test” is a test.  If the module contains “setupTest” or “teardownTest” functions, they will be called before and after each test respectively.  If the module contains “setupModule” or “teardownModule” functions, they will be called before and after all the tests in the module run, respectively.

There is a “jumlib” module that you can import into your tests that provides familiar helpers like assertEquals(), etc.  One difference in their behaviour to what I am used to is that they don’t interrupt the test on failure.  On the plus side, if you’ve got a bunch of unrelated assertions at the end of your test, you will see all the failures rather than just the first.  On the down side, you don’t get a stack trace with the failure so it can be difficult to tell which assertion failed unless you’ve provided a comment to go with each assertion.

The framework seems to do the job pretty well, although the output is a little cluttered.  It has the facility to publish its test results to a special dashboard web application, but I’d prefer something easier to manage on the command line.

Syndicated 2011-02-20 01:30:01 from James Henstridge

linux.conf.au 2011

I’ve just got through the first one and a half days of LCA2011 in Brisbane. The organisers have done a great job, especially considering the flooding they have had to deal with.

Due to the venue change the accommodation I booked is no longer within walking distance of the conference, but the public transport is pretty good.  A bit more concerning was the following change to the wiki made between the time I left Perth and the time I checked in:

BYO Toilet Paper

I’ve been impressed with the conference talks I’ve been to so far. In particular, I liked Silvia Pfeiffer’s talk on audio/video processing with HTML5 – I’ll have to have a play with some of this. Today’s keynote was by Vint Cerf about the history of internet protocols and what the challenges will be in the future (e.g. InterPlaNet).

There was a talk today about Redis: it sounded like interesting technology, but the talk didn’t really give enough information to say when you’d choose it over other systems.

Syndicated 2011-01-25 03:06:08 from James Henstridge

Bagels

I made some bagels last night.  It was my second time using the recipe, so things went pretty well.  The boiling process gives the crust an interesting chewy texture I haven’t seen with other bread recipes I’ve tried.

I used this recipe (half wholemeal flour, half white), but made 12 slightly larger bagels rather than the 18 the recipe suggested.  I increased the boiling and baking time a bit to compensate.  They weren’t particularly difficult to make, but the boiling process was fairly time consuming, since I could only fit three at a time into the pot.

Bagels

Syndicated 2009-12-04 04:01:20 from James Henstridge

Launchpad code scanned by Ohloh

Today Ohloh finished importing the Launchpad source code and produced the first source code analysis report.  There seems to be something fishy about the reported line counts (e.g. -3,291 lines of SQL), but the commit counts and contributor list look about right.  If you’re interested in what sort of effort goes into producing an application like Launchpad, then it is worth a look.

Syndicated 2009-10-27 08:48:18 from James Henstridge

Seeking in Transcoded Streams with Rygel

When looking at various UPnP media servers, one of the features I wanted was the ability to play back my music collection through my PlayStation 3.  The complicating factor is that most of my collection is encoded in Vorbis format, which is not yet supported by the PS3 (at this point, it doesn’t seem likely that it ever will).

Both MediaTomb and Rygel could handle this to an extent, transcoding the audio to raw LPCM data to send over the network.  This doesn’t require much CPU power on the server side, and only requires 1.4 Mbit/s of bandwidth, which is manageable on most home networks.  Unfortunately the only playback controls enabled in this mode are play and stop: if you want to pause, fast forward or rewind then you’re out of luck.

Given that Rygel has a fairly simple code base, I thought I’d have a go at fixing this.  The first solution I tried was the one I’ve mentioned a few times before: with uncompressed PCM data, file offsets can be easily converted to sample numbers, so if the source format allows time based seeking, we can easily satisfy byte range requests.
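
The conversion involved is simple enough to sketch.  The constants below assume CD-quality audio (44.1 kHz, 16-bit, stereo), which is what gives the 1.4 Mbit/s figure; the function names are made up for the example and are not taken from Rygel.

```python
SAMPLE_RATE = 44100        # frames per second (assumed CD quality)
CHANNELS = 2               # stereo (assumed)
BYTES_PER_SAMPLE = 2       # 16-bit samples (assumed)
BYTES_PER_FRAME = CHANNELS * BYTES_PER_SAMPLE


def bitrate_bits_per_second():
    """Raw LPCM bandwidth: 44100 * 4 * 8 = 1411200 bits per second."""
    return SAMPLE_RATE * BYTES_PER_FRAME * 8


def byte_offset_to_seconds(offset):
    """Map a byte offset in the LPCM stream to a time in the source file."""
    frame = offset // BYTES_PER_FRAME   # round down to a whole frame
    return float(frame) / SAMPLE_RATE


def seconds_to_byte_offset(seconds):
    """Inverse mapping, aligned to the start of a frame."""
    return int(seconds * SAMPLE_RATE) * BYTES_PER_FRAME
```

A byte range request starting at offset 176400 would therefore correspond to seeking one second into the source track.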

I got a basic implementation of this working, but it was a little bit jumpy and not as stable as I’d like.  Before fully debugging it, I started looking at the mysterious DLNA options I’d copied over to get things working.  One of those was the “DLNA operation”, which was set to “range” mode.  Looking at the GUPnP header files, I noticed there was another value named “timeseek”.  When I picked this option, the HTTP requests from the PS3 changed:

GET /... HTTP/1.1
Host: ...
User-Agent: PLAYSTATION 3
Connection: Keep-Alive
Accept-Encoding: identity
TimeSeekRange.dlna.org: npt=0.00-
transferMode.dlna.org: Streaming

The pause, rewind and fast forward controls were now active, although only the pause control actually worked properly. After fast forwarding or rewinding, the PS3 would issue another HTTP request with the TimeSeekRange.dlna.org header specifying the new offset, but the playback position would reset to the start of the track when the operation completed. After a little more experimentation, I found that the playback position didn’t reset if I included TimeSeekRange.dlna.org in the response headers. Of course, I was still sending back the beginning of the track at this point but the PS3 acted as though it was playing from the new point in the song.

It wasn’t much more work to update the GStreamer calls to seek to the requested offset before playback and things worked pretty much as well as for non-transcoded files.  And since this solution didn’t involve byte offsets, it also worked for Rygel’s other transcoders.  It even worked to an extent with video files, but the delay before playback was a bit too high to make it usable — fixing that would probably require caching the GStreamer pipeline between HTTP requests.
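
For illustration, the header handling can be sketched as follows.  This is not Rygel's code, and since the DLNA specification is not public it only handles the "npt=<seconds>-" form seen in the request above.  Echoing the request value back unchanged in the response is an assumption made for the example; the function names are hypothetical.

```python
TIME_SEEK_HEADER = "TimeSeekRange.dlna.org"


def parse_npt_start(header_value):
    """Return the start offset in seconds from a value like 'npt=12.50-'."""
    if not header_value.startswith("npt="):
        raise ValueError("unsupported value: %r" % (header_value,))
    start, sep, _end = header_value[len("npt="):].partition("-")
    if not sep:
        raise ValueError("missing range separator: %r" % (header_value,))
    return float(start)


def seek_response_headers(request_headers):
    """Extra headers for the reply (echoing the value is an assumption)."""
    value = request_headers.get(TIME_SEEK_HEADER)
    if value is None:
        return {}
    return {TIME_SEEK_HEADER: value}
```

The parsed start offset is what would be handed to the media pipeline as the seek position before streaming begins.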

Thoughts on DLNA

While it can be fun to reverse engineer things like this, it was a bit annoying to only be able to find out about the feature by reading header files written by people with access to the specification.  I can understand having interoperability and certification requirements to use the DLNA logo, but that does not require that the specifications be private.

As well as keeping the specification private, it feels like some aspects have been intentionally obfuscated, using bit fields represented in both binary and hexadecimal string representations inside the resource’s protocol info.  This might seem reasonable if it was designed for easy parsing, but you need to go through two levels of XML processing (the SOAP envelope and then the DIDL payload) to get to these flags.  Furthermore, the attributes inherited from the UPnP MediaServer specifications are all human readable so it doesn’t seem like an arbitrary choice.

On the bright side, I suppose we’re lucky they didn’t use cryptographic signatures to lock things down like Apple has with some of their protocols and file formats.

Syndicated 2009-07-24 08:46:18 from James Henstridge

Watching iView with Rygel

One of the features of Rygel that I found most interesting was the external media server support.  It looked like an easy way to publish information on the network without implementing a full UPnP/DLNA media server (i.e. handling the UPnP multicast traffic, transcoding to a format that the remote system can handle, etc).

As a small test, I put together a server that exposes the ABC’s iView service to UPnP media renderers.  The result is a bit rough around the edges, but the basic functionality works.  The source can be grabbed using Bazaar:

bzr branch lp:~jamesh/+junk/rygel-iview

It needs Python, Twisted, the Python bindings for D-Bus and rtmpdump to run.  The program exports the guide via D-Bus, and uses rtmpdump to stream the shows via HTTP.  Rygel then publishes the guide via the UPnP media server protocol and provides MPEG2 versions of the streams if clients need them.

There are still a few rough edges though.  The video from iView comes as 640×480 with a 16:9 aspect ratio so it has a 4:3 pixel aspect ratio, but there is nothing in the video file to indicate this (I am not sure if Flash video supports this metadata).
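
The pixel aspect ratio follows from dividing the display aspect ratio by the aspect ratio of the stored frame.  A quick check, with a helper name made up for the example:

```python
from fractions import Fraction


def pixel_aspect_ratio(width, height, display_aspect):
    """Display aspect ratio divided by the stored frame's aspect ratio."""
    storage_aspect = Fraction(width, height)
    return display_aspect / storage_aspect


par = pixel_aspect_ratio(640, 480, Fraction(16, 9))
print(par)  # 4/3
```

So a renderer that assumes square pixels will show the picture squashed horizontally.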

Getting Twisted and D-Bus to cooperate

Since I’d decided to use Twisted, I needed to get it to cooperate with the D-Bus bindings for Python.  The first step here was to get both libraries using the same event loop.  This can be achieved by setting Twisted to use the glib2 reactor, and enabling the glib mainloop integration in the D-Bus bindings.

Next was enabling asynchronous D-Bus method implementations.  There is support for this in the D-Bus bindings, but it has quite a different (and less convenient) API compared to Twisted.  A small decorator was enough to overcome this impedance mismatch:

from functools import wraps

import dbus.service
from twisted.internet import defer

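def dbus_deferred_method(*args, **kwargs):
    """Like dbus.service.method, but the method may return a Deferred."""
    def decorator(function):
        # Let the D-Bus bindings record their metadata on the function.
        function = dbus.service.method(*args, **kwargs)(function)
        @wraps(function)
        def wrapper(*args, **kwargs):
            # The bindings pass the reply callbacks in as keyword arguments.
            dbus_callback = kwargs.pop('_dbus_callback')
            dbus_errback = kwargs.pop('_dbus_errback')
            # maybeDeferred also copes with plain (non-Deferred) results.
            d = defer.maybeDeferred(function, *args, **kwargs)
            d.addCallbacks(
                dbus_callback, lambda failure: dbus_errback(failure.value))
        # Tell the bindings which keyword arguments carry the callbacks.
        wrapper._dbus_async_callbacks = ('_dbus_callback', '_dbus_errback')
        return wrapper
    return decorator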

This decorator could then be applied to methods in the same way as the @dbus.service.method decorator, but it would correctly handle the case where the method returns a Deferred. Unfortunately it can’t be used in conjunction with @defer.inlineCallbacks, since the D-Bus bindings don’t handle varargs functions properly. You can of course call another function or method that uses @defer.inlineCallbacks though.

The iView Guide

After coding this, it became pretty obvious why it takes so long to load up the iView flash player: it splits the guide data over almost 300 XML files.  This might make sense if it relied on most of these files remaining unchanged and stored in cache, however it also uses a cache-busting technique when requesting them (adding a random query component to the URL).

Most of these files are series description files (some for finished series with no published programs).  These files contain a title, a short description, the URL for a thumbnail image and the IDs for the programs belonging to the series.  To find out about those programs, you need to load all the channel guide XML files until you find which one contains the program.  Going in the other direction, if you’ve got a program description from the channel guide and want to know about the series it belongs to (e.g. to get the thumbnail), you need to load each series description XML file until you find the one that contains the program.  So there aren’t many opportunities to delay loading of parts of the guide.

The startup time would be a lot shorter if this information was collapsed down to a smaller number of larger XML files.

Syndicated 2009-07-06 08:50:45 from James Henstridge
