robertc is currently certified at Master level.

Name: Robert Collins
Member since: 2003-07-13 13:41:43
Last Login: 2009-10-02 07:37:50

FOAF RDF Share This

Homepage: http://www.robertcollins.net/

Notes:

So advogato seems to have stuck....

I spend most of my time hacking OSS/FS, with the primary projects being:bzr
ubuntu

I've created a couple of utility projects:
libgetopt++ (c++ getopt adapter)
freegen (a freeswan config generator)

A few other projects I've created shall remain nameless, as they have been obsoleted or are otherwise currently non-viable.

And projects contributed to (non-complete, not always source, top of head list..):
libtool
automake
arch
Cygwin XFree86
libxml
libxslt
Squid (core developer but mostly idle)
Cygwin (ex core developer - idling for 4-5 years so I can't really claim core status :P (yay much time on linux :})
Cygwin setup (ex project maintainer)

Projects

Recent blog entries by robertc

Syndication: RSS 2.0

subunit version 2 progress

Subunit V2 is coming along very well.

Current status:

  • I have a complete implementation of the StreamResult API up as a patch for testtools. Thats 2K LOC including comeprehensive tests.
  • Similarly, I have an implementation of a StreamResult parser and emitter for subunit. Thats 1K new LOC including comprehensive tests, and another 500 lines of churn where I migrate all the subunit filters to v2.
  • pdb debugging works through subunit v2, permitting dropping into a debugger to work. Yay.

Remaining things to do:

  • Update the other language bindings – the C library in particular.
  • Teach testrepository to expect v2 input (and probably still store v1 for a while)
  • Teach testrepository to use pipes for the stdin of test runner backends, and some control mechanism to switch input between different backends.
  • Discuss the in-Python API with more folk.
  • Get code merged :)

Syndicated 2013-03-04 09:43:29 from Code happens

El cheapo 10Gbps networking

I’ve been hitting the limits of gigabit ethernet at home for quite a while now, and as I spend more time working with cloud technologies this started to frustrate me.

I’d heard of other folk getting good results with second hand Infiniband cards and decided to give it a go myself.

I bought two Voltaire dual-port Infiniband adapters – a 4X SDR PCI-E x4 card. And in a 2 metre 8470 cable, and we’re in business.

There are other, more comprehensive guides around to setting this up – e.g. http://davidhunt.ie/wp/?p=2291 or http://pkg-ofed.alioth.debian.org/howto/infiniband-howto-4.html

On ubuntu the hardware was autodetected; all I needed to do was:

modprobe ib_ipoib
sudo apt-get install opensm # on one machine

And configure /etc/network/interfaces – e.g.:

iface ib1 inet static
address 192.168.2.3
netmask 255.255.255.0
network 192.168.2.0
up echo connected >`find /sys -name mode | grep ib1`
up echo 65520 >`find /sys -name mtu | grep ib1`

With no further tuning I was able to get 2Gbps doing linear file copies via Samba, which I suspect is rather pushing the limits of my circa 2007 home server – I’ll investigate futher to identify where the bottlenecks are, but the networking itself I suspect is ok – netperf got me 6.7Gbps in a trivial test.


Syndicated 2013-02-25 01:40:43 from Code happens

Simpler is better – a single event type for StreamResult

StreamResult, covered in my last few blog posts, has panned out pretty well.

Until that is, that I sat down to do a serialised version of it. It became fairly clear that the wire protocol can be very simple – just one event type that has a bunch of optional fields – test ids, routing code, file data, mime-type etc. It is up to the recipient at the far end of a stream to derive semantic meaning, which means that encoding a lot of rules (such as a data packet can have either a test status or file data) into the wire protocol isn’t called for.

If the wire protocol doesn’t have those rules, Python parsers that convert a bytestream into StreamResult API calls will have to manually split packets that have both status() and file() data in them… this means it would be impossible to create many legitimate bytestreams via the normal StreamResult API.

That seems to be an unnecessary restriction, and thinking about it, having a very simple ‘here is an event about a test run’ API that carries any information we have and maps down a very simple wire protocol should be about as easy to work with as the current file or status API.

Most combinations of file+status parameters is trivially interpretable, but there is one that had no prior definition – a test_status with no test id specified. Files with no testid are easily considered as ‘global scope’ for their source, so perhaps test_status should be treated the same way? [Feedback in comments or email please]. For now I’m going to leave the meaning undefined and unconstrained.

So I’m preparing a change to my patchset for StreamResult to:

  • Drop the file() method altogether.
  • Add file_bytes, mime_type and eof parameters to status().
  • Make the test_id and test_status parameters to status() optional.

This will make the API trivially serialisable (both to JSON or protobufs or whatever, or to the custom binary format I’m considering for subunit), and equally trivially parsable, which I think is a good thing.


Syndicated 2013-02-23 08:47:02 from Code happens

First experience implementing StreamResult

My last two blog posts were largely about the needs of subunit, but a key result of any protocol is how easy working with it in a high level language is.

In the weekend and evenings I’ve done an implementation of a new set of classes – StreamResult and friends – that provides:

  • Adaption to and from the existing TestResult APIs (the 2.6 and below API, 2.7 API, and the testtools extended API).
  • Multiplexing multiple streams together.
  • Adding timing data to a stream if it is absent.
  • Summarising a stream.
  • Copying a stream to multiple outputs
  • A split out API for instructing a test run to stop.
  • A simple test-at-a-time stream processor that makes it easy to just deal with tests rather than the innate complexities of an event based interface.

So far the code has been uniformly simple to write. I started with an API that included an ‘estimate’ function, which I’ve since removed – I don’t believe the complexity is justified; enumeration is not significantly more expensive than counting, and runners that want to be efficient can either not enumerate or remember the enumeration from prior runs.

The documentation in the linked pull request is a good place to start to get a handle on the API; I’d love feedback.

Next steps for me are to do a subunit protocol revision that maps to the Python API, both parser and generator and see how it feels. One wrinkle there is that the reason for doing this is to fix intrinsic limits in the existing protocol – so doing forward and backward wire protocol compatibility would defeat the point. However… we can make the output side explicitly choose a protocol version, and if we can autodetect the protocol version in the parser, even if we cannot handle mixed streams we can get the benefits of the new protocol once data has been detected. That said, I think we can start without autodetection during prototyping, and add it later. Without autodetection, programs like TestRepository will need configuration options to control what protocol variant to expect. This could be done by requiring this new protocol and providing a stream filter that can be deployed when needed.


Syndicated 2013-02-19 10:52:57 from Code happens

More subunit needs

Of course, as happens sadly often, the scope creeps..

Additional pain points

Zope’s test runner runs things that are not tests, but which users want to know about – ‘layers’. At the moment these are reported as individual tests, but this is problematic in a couple of ways. Firstly, the same ‘test’ runs on multiple backend runners, so timing and stats get more complex. Secondly, if a layer fails to setup or teardown, tools like testrepository that have watched the stream will think a test failed, and on the next run try to explicitly run that ‘test’ – but that test doesn’t really exist, so it won’t run [unless an actual test that needs the layer is being run].

Openstack uses python coverage to gather coverage statistics during test runs. Each worker running tests needs to gather and return such statistics. The current subunit protocol has no space to hand this around, without it pretending to be a test [see a pattern here?]. And that has the same negative side effect – test runners like testrepository will try to run that ‘test’. While testrepository doesn’t want to know about coverage itself, it would be nice to be able to pass everything around and have a local hook handle the aggregation of that data.

The way TAP is reflected into subunit today is to mangle each tap ‘test’ into a subunit ‘test’, but for full benefits subunit tests have a higher bar – they are individually addressable and runnable. So a TAP test script is much more equivalent to a subunit test. A similar concept is landing in Python’s unittest soon – ‘subtests’ – which will give very lightweight additional assertions within a larger test concept. Many C test runners that emit individual tests as simple assertions have this property as well – there may be 5 or 10 executables each with dozens of assertions, but only the executables are individually addressable – there is no way to run just one assertion from an executable as a ‘test’. It would be nice to avoid the friction that currently exists when dealing with that situation.

Minimum requirements to support these

Layers can be supported via timestamped stdout output, or fake tests. Neither is compelling, as the former requires special casing in subunit processors to data mine it, and the latter confuses test runners.  A way to record something that is structured like a test (has an id – the layer, an outcome – in progress / ok / failed, and attachment data for showing failure details) but isn’t a test would allow the data to flow around without causing confusion in the system.

TAP support could change to just show the entire output as progress on one test and then fail or not at the end. This would result in a cognitive mismatch for folk from the TAP world, as TAP runners report each assertion as a ‘test’, and this would be hidden from subunit. Having a way to record something that is associated with an actual test, and has a name, status, attachment content for the TAP comments field – that would let subunit processors report both the addressable tests (each TAP script) and the individual items, but know that only the overall scripts are runnable.

Python subtests could use a unique test for each subtest, but that has the same issue has layers. Python will ensure a top level test errors if a subtest errors, so strictly speaking we probably don’t need an associated-with concept, but we do need to be able to say that a test-like thing happened that isn’t actually addressable.

Coverage information could be about a single test, or even a subtest, or it could be about the entire work undertaken by the test process. I don’t think we need a single standardised format for Coverage data (though that might be an excellent project for someone to undertake).  It is also possible to overthink things :) . We have the idea of arbitrary attachments for tests. Perhaps arbitrary attachments outside of test scope would be better than specifying stdout/stderr as specific things. On the other hand stdout and stderr are well known things.

Proposal version 2

A packetised length prefixed binary protocol, with each packet containing a small signature, length, routing code, a binary timestamp in UTC, a set of UTF8 tags (active only, no negative tags), a content tag – one of (estimate + number, stdin, stdout, stderr, file, test), test-id, runnable, test-status (one of exists/inprogress/xfail/xsuccess/success/fail/skip), an attachment name, mime type, a last-block marker and a block of bytes.

The std/stdout/stderr content tags are gone, replaced with file. The names stdin,stdout,stderr can be placed in the attachment name field to signal those well known files, and any other files that the test process wants to hand over can be simply embedded. Processors that don’t expect them can just pass them on.

Runnable is a boolean, indicating whether this packet is describing a test that can be executed deliberately (vs an individual TAP assertion, Python sub-test etc). This permits describing things like zope layers which are top level test-like things (they start, stop and can error) though they cannot be run.. and it doesn’t explicitly model the setup/teardown aspect that they have. Should we do that?

Testid is for identifying tests. With the runnable flag to indicate whether a test really is a test, subtests can just be namespaced by the generator – reporters can choose whether to be naive and report every ‘test’, or whether to use simple string prefix-with-non-character-seperator to infer child elements.

Impact on Python API

If we change the API to:

class TestInfo(object):
    id = unicode
    status = ('exists', 'inprogress', 'xfail', 'xsuccess', 'success', 'fail', 'error', 'skip')
    runnable = boolean

class StreamingResult(object):
    def startTestRun(self):
        pass
    def stopTestRun(self):
        passs
    def estimate(self, count, route_code=None, timestamp=None):
        pass
    def file(self, name, bytes, eof=False, mime=None, test_info=None, route_code=None, timestamp=None):
        """Inform the result about the contents of an attachment."""
    def status(self, test_info, route_code=None, timestamp=None):
        """Inform the result about a test status with no attached data."""

This would permit the full semantics of a subunit stream to be represented I think, while being a narrow interface that should be easy to implement.

Please provide feedback! I’ll probably start implementing this soon.


Syndicated 2013-02-15 01:31:09 from Code happens

170 older entries...

 

robertc certified others as follows:

  • robertc certified hypatia as Journeyer
  • robertc certified ctd as Apprentice
  • robertc certified MJ as Journeyer
  • robertc certified tseaver as Master
  • robertc certified thom as Journeyer
  • robertc certified spiv as Master
  • robertc certified k as Journeyer
  • robertc certified jaq as Journeyer

Others have certified robertc as follows:

  • hypatia certified robertc as Master
  • ctd certified robertc as Master
  • spiv certified robertc as Journeyer
  • thom certified robertc as Journeyer
  • jdub certified robertc as Journeyer
  • fxn certified robertc as Master
  • jaq certified robertc as Journeyer
  • thaytan certified robertc as Journeyer
  • ncm certified robertc as Master
  • daniels certified robertc as Master
  • brouhaha certified robertc as Master
  • dtucker certified robertc as Master
  • ChrisMcDonough certified robertc as Master
  • atai certified robertc as Master
  • mrd certified robertc as Master
  • nconway certified robertc as Master
  • metaur certified robertc as Master
  • pvanhoof certified robertc as Journeyer
  • jamesh certified robertc as Master
  • dragotown certified robertc as Master
  • jblack certified robertc as Master
  • ctrlsoft certified robertc as Master
  • okaratas certified robertc as Master
  • ianclatworthy certified robertc as Master
  • bcully certified robertc as Master
  • janneke certified robertc as Master

[ Certification disabled because you're not logged in. ]

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!

X
Share this page