Older blog entries for dan (starting at number 164)

Something old, something new

Sorry for the silence. This has been a scheduling thing mostly: I managed to schedule ‘breaking my arm’ for about three days after ‘starting a new job’, and together with the ongoing project of ‘being a new Dad’ (I plan to pivot that one into just ‘being a Dad’ if and when I finally feel like I know what I’m doing – so maybe in about 20 years or so) this has left not much time for discretionary writing.

Something old: a couple of months ago Vsevolod Dyomkin emailed me to ask if I wanted to be featured in his series of interviews with Lisp hackers (or, in my case, former Lisp hackers). After keeping him waiting for unsnsionably long – partly due, of course, to the aforementioned bone breakage – finally I have sent him my answers and he has posted them

(I can’t decide if I’m actually serious about writing something justifying the existence of ASDF, or if I’m just going to let it lie. It can probably be summarised as “it made sense at the time”)

Something new:

  • New job: I’m now working at Simply Business, nominally as a Ruby programmer but in practice so far mostly doing Puppet and Vagrant and general devops-y deployment-y stuff. It’s slightly odd having a commute further than the spare bedroom (which is no longer actually spare now, please note) but I seem to have adjusted mostly to having to get up in the morning, and the walk to work is podcast time. Except when it’s Spotify time. Or when I take a bike, but for a 2km journey that often seems a bit like overkill.
  • New consumer toys. After dropping the Desire S and breaking the screen on it twice, I gave up and sniped a second hand Galaxy Nexus on Ebay. It’s a bit like running an Android phone, except that it has no stupid skins or preloaded apps, and it gets timely updates to the newest OS version. Seriously, there is a certain irony in the fact that a phone I chose because it was open and easy to reflash or root turns out to be the first Android phone I’ve owned that I haven’t needed to reflash or root. Flushed with that success (and, for the first time in a while, flush also with some ready cash), I bagged a tablet too. Nexus 7, Jelly Bean again, it’s basically like a big version of the phone except that it doesn’t make phone calls.

Syndicated 2012-10-08 18:30:47 from diary at Telent Netowrks

Debian on the Samsung Series 9 NP900X3B

There are other guides to getting Linux going on the Samsung Series 9 NP900X3B, but both that I’ve found are for Fedora. Mostly it’s the same in Debian, but here are some of the things I’ve spotted.

Install media

Recent versions of Debian install media images are created as hybrid ISO images, which means you can download them and dd them directly to a USB stick. This is what I did, with the Squeeze netinst image. The computer’s BIOS settings needed changing to look at USB before the internal SSD, but that’s not hard. I deselected all packages which resulted in a very minimal basic install, then added xfce4 and a few essential utilities using apt-get


The wired network works out of the box.

The wireless networking is courtesy of an Intel 6230 adapter “Intel Corporation Centrino Advanced-N 6230 [8086:0091] (rev 34)” (apparently this also does Bluetooth, but I haven’t tried that yet). This is not supported in Squeeze’s default kernel, but is available in Wheezy. After much swearing at backports I decided to do the apt-get upgrade dance


Touchpad handling worked in Squeeze and partially broke when I upgraded to get working wifi. Pointer movement worked fine, but tapping (for the uninitiated, “tapping” on the touchpad is simulating button presses by briefly touching the pressure-sensitive area instead of the hardware buttons below it) didn’t. On this system tapping is infinitely preferable to the hardware buttons, because it appears impossible to move the pointer while one of the hardware buttons is pressed – this makes window placement pretty tricky. The fix here is

synclient TapButton1=1 
synclient TapButton2=2 
synclient TapButton2=3 
synclient PalmDetect=1

which means you can use one finger to simulate button 1, two fingers simultaneously (note: not double-clicking, as I foolishly initially thought) to simulate button 2, and three for button 3. It also turns on palm detection, so that accidentally brushing the pad as you type won’t send your cursor off into the wild blue yonder.

This affect the current session only. To make it permanent you need to edit files: copy /usr/share/X11/xorg.conf.d/50-synaptics.conf to /etc/X11/xorg.conf.d and add the lines

        Option "TapButton1" "1"
        Option "TapButton2" "2"
        Option "TapButton3" "3"
        Option "PalmDetect" "1"

in the first Section "InputClass" stanza

Suspend and hibernate

I had the same problem with suspend as John Teslade : it appears not to resume but in fact it works perfectly except for the display backlight. However, I had it harder because my Fn-F2 and Fn-F3 keys didn’t do anything. After determining with acpi_listen that Linux is listening to those keys (they send ACPI events video/brightnessdown and video/brightnessup respectively) and is capable of controlling the brightness (try e.g. xrandr --output eDP1 --set BACKLIGHT 1) I decided this must clearly be a 90% solved problem and that the missing link was probably somewhere in the Debian package archive. It was, it was xfce4-power-manager

After that, suspend and hibernate are both usable.

Battery life

Pretty poor right now (looks like about 3 hours), but I’ve just installed laptop-mode-tools which has turned most of the PowerTOP tunables from “Bad” to “Good”, so I will be disappointed if that doesn’t make a significant difference. We’ll see …

Syndicated 2012-05-28 10:40:28 from diary at Telent Netowrks

Changing sort order with ActiveRecord find_in_batches

The subject is Googlebait pure and simple, because the short version is that you can’t. The very slightly longer and significantly more defensible version is that you may be able to but they don’t want you to apparently because you might get the wrong answer if you mutate the data as you’re traversing it (and because it’s fast in MySql).

Personally I think the answer there is Well Don’t Do That Then (and who cares about MySql) but that’s just my opinion. If you want to order by, say, created_at descending, and perhaps you want to paginate the results, the only sensible conclusion to draw is that find_in_batches is just not intended for this use.

But it’s not an unreasonable use. So I wrote ar-as-batches which lets you do

Users.where(country_id: 44).order(:joined_at).offset(200).as_batches do |user|

and it all works as you’d expect. I should of course caution you that

  • you might get the wrong answer if you mutate the data as you’re traversing it (so Don’t Do That Then), and that
  • ordering by something other than id ascending may be slower in MySql.

I don’t know whether to be proud or ashamed of the tests , which check the generated queries by assigning a StringIO logger to ActiveRecord::Base.logger and then matching regexps in it after each test runs. There ought to be a better way. Perhaps there is a better way. Don’t know what though.

Syndicated 2012-05-04 06:29:34 from diary at Telent Netowrks

Listening to a Polar Bluetooth HRM in Linux

My new toy, as of last Friday, is a Polar WearLink®+ transmitter with Bluetooth® because I wanted to track my heart rate from Android. Absent some initial glitches which turned out to be due to the battery it was shipped with having almost no charge left, it works pretty well with the open source Google My Tracks application.

But, but. A significant part of my exercise regime consists of riding a stationary bicycle until I feel sick. I do this in the same room as my computer: not only are GPS traces rather uninformative for this activity, but getting satellite coverage in the first place is tricky while indoors. So I thought it would be potentially useful and at least slightly interesting to work out how to access it directly from my desktop.

My first port of call was the source code for My Tracks. Digging into src/com/google/android/apps/mytracks/services/sensors/PolarMessageParser.java we find a helpful comment revealing that, notwithstanding Polar’s ridiculous stance on giving out development info (they don’t, is the summary) the Wearlink packet format is actually quite simple.

 *  Polar Bluetooth Wearlink packet example;
 *   Hdr Len Chk Seq Status HeartRate RRInterval_16-bits
 *    FE  08  F7  06   F1      48          03 64
 *   where; 
 *      Hdr always = 254 (0xFE), 
 *      Chk = 255 - Len
 *      Seq range 0 to 15
 *      Status = Upper nibble may be battery voltage
 *               bit 0 is Beat Detection flag.

While we’re looking at Android for clues, we also find the very useful information in the API docs for BluetoothSocket that “The most common type of Bluetooth socket is RFCOMM, which is the type supported by the Android APIs. RFCOMM is a connection-oriented, streaming transport over Bluetooth. It is also known as the Serial Port Profile (SPP)”. So, all we need to do is figure out how to do the same in Linux

Doing anything with Bluetooth in Linux inevitably turns into an exercise in yak epilation, especially for the kind of retrocomputing grouch (that’s me) who doesn’t have a full GNOME or KDE desktop with all the D buses and applets and stuff that come with it. In this case, I found that XFCE and the Debian blueman package were sufficient to get my bluetooth dongle registered, and to find and pair with the HRM. It also included a natty wizard thing which claimed to be able to create an rfcomm connection in /dev/rfcomm0. I say “claimed” not because it didn’t – it did, so … – but because for no readily apparent reason I could never get more than a single packet from this device without disconnecting, unpairing and repairing. Perhaps there was weird flow control stuff going on or perhaps it was something else, I don’t know, but in any case this is not ideal at 180bpm.

So, time for an alternative approach: thanks to Albert Huang, we find that apparently you can work with rfcomm sockets using actual, y’know, sockets . The rfcomm-client.c example on that we page worked perfectly, modulo the obvious point that sending data to a heart rate monitor strap is a peculiarly pointless endeavour, but really we want to write our code in Ruby not in C. This turns out to be easier than we might expect. Ruby’s socket library wraps the C socket interface sufficently closely that we can use pack to forge sockaddr structures for any protocol the kernel supports, if we know the layout in memory and the values of the constants.

How do we find “the layout in memory and the values of the constants”? With gdb. First we start it

:; gdb rfcomm-client
(gdb) break 21
Breakpoint 1 at 0x804865e: file rfcomm-client.c, line 21.
(gdb) run
Starting program: /home/dan/rfcomm-client

Breakpoint 1, main (argc=1, argv=0xbffff954) at rfcomm-client.c:22
22          status = connect(s, (struct sockaddr *)&addr, sizeof(addr));

then we check the values of the things

(gdb) print sizeof addr
$2 = 10
(gdb) print addr.rc_family 
$3 = 31
(gdb) p/x addr.rc_bdaddr
$4 = {b = {0xab, 0x89, 0x67, 0x45, 0x23, 0x1}}

then we look at the offsets

(gdb) p/x &addr
$5 = 0xbffff88e
(gdb) p/x &(addr.rc_family)
$6 = 0xbffff88e
(gdb) p/x &(addr.rc_bdaddr)
$7 = 0xbffff890
(gdb) p/x &(addr.rc_channel)
$8 = 0xbffff896

So, overall length 10, rc_family is at offset 0, rc_bdaddr at 2, and rc_channel at 8. And the undocumented (as far as I can see) str2ba function results in the octets of the bluetooth address going right-to-left into memory locations, so that should be easy to replicate in Ruby.

  def connect_bt address_str,channel=1
    bytes=address_str.split(/:/).map {|x| x.to_i(16) }
    sockaddr=[AF_BLUETOOTH,0, *bytes.reverse, channel,0 ].pack("C*")

The only thing left to do is the actual decoding. Considerations here are that we need to deal with short reads and that the start of a packet may not be at the start of the buffer we get – so we keep reading buffers and catenating them until decode says it’s found a packet, then we start again from where decode says the end of the packet should be. Because this logic is slightly complicated we wrap it in an Enumerator so that our caller gets one packet only each and every time they call Enumerator#next

The complete example code is at https://gist.github.com/2500413 and the licence is “do what you like with it”. What I will like to do with it is (1) log the data, (2) put up a window in the middle of the display showing my instantaneous heart rate and zone so that I know if I’m trying, (3) later, do some interesting graphing and analysis stuff. But your mileage may vary.

Syndicated 2012-05-03 21:10:29 from diary at Telent Netowrks

Sharpening the sawfish

My son is two weeks old today. I don’t usually go a bundle on putting personal info on the public web – I keep that for Facebook, where they at least pretend to keep it private for me - but I mention this to explain why I’m using my laptop a lot more than my desktop lately.

The problem with my laptop is the mouse pointer. It’s one of those pointing stick devices technically known (apparently) as an isometric joystick and more commonly known as a nipple , and when the room is warm the little rubber cap gets slippery very quickly. So I decided to invest a little time in a few keyboard shortcuts.

As an Emacs user I know I’m supposed to like tiling window managers, but I don’t. My editor windows are windows onto text files that may be any size and shape but in which it’s a fairly safe bet (see “locality of reference”) that the spot I want to edit next is usually spatially close to the spot I’m currently looking at. The other ‘windows’ on my screen are things like web browsers and GUI programs where there’s no such guarantee, and the only way to make them work is to allow them to take the size and shape that their authors wanted them to have. So after a brief experiment with awesome I punted it and went looking for a programable window manager that was designed for overlapping windows.

And ended up back with Sawfish, which I used to use back when it was fashionable. Sawfish customization is a two-phase process: first you write commands in Lisp, then you use the sawfish-ui program to assign them to keystrokes. A bit like Emacs, really, and perhaps not surprisingly.

First I needed some shortcuts to focus particular windows (Emacs, Firefox, xterms). Happily, someone has done the work for this already: I just had to download the Gimme script and set up bindings for it

Then I needed something to chuck windows around the screen. The requirement is pretty simple here: every window on my screen is aligned against an edge, so I just need commands to pan a window to each edge. Here is the finished script in which the points I would like to draw attention to are

  • I use focus-follows-mouse mode, or whatever it’s called these days. This means that if I move a window under the pointer I need to move the pointer too otherwise it will go out of focus. The warp-cursor-to-window function does this: I needed to calculate the pointer position relative to window, which for some reason isn’t a builtin.
  • window-frame-dimensions is window-dimensions plus the decorations. We need these dimenstions for throwing windows rightwards or downwards, otherwise they end up slightly offscreen.
  • define-command is the magic that makes our new functions show up in the sawfish-ui dialog. The "%f" sigil means to pass the current window into the function.

And that’s about it. Put the file somewhere that sawfish will find it – for me, ~/.sawfish/lisp seems to be a good place – add the lines

(require 'gimme)
(setq warp-to-window-enabled t)
(require 'throw-window)

to .sawfishrc, and then set up your keys in sawfish-ui. I assigned them to Windows-key shortcuts: at last, I have a use for the Windows key.

If you hadn’t spotted in amongst all that, I have githubbed my dotfiles. More for my convenience than for your edification, but feel free to rummage. If you are one of the three other remaining XTerm users, have a look at the XTerm*VT100*translations in my .Xdefaults - I stole that “press Shift+RET to background the command” trick from Malcolm Beattie nearly twenty years ago and have been using it ever since.

Syndicated 2012-02-22 20:25:20 from diary at Telent Netowrks

ANN Twitling: a Twitter link digest tool

Problem: I can’t keep up with the Internet

I often check Twitter on my phone. When I see tweets with links in them I tend to skip over them intending to return later when I’m on a computer with a full-size screen, and then forget about them either because I find something else to look at or I can’t be bothered with scrolling all the way down again. And looking through old tweets is nearly as bad on the full-size twitter web site as it is in a mobile client.

Proposed solution: I need a computer program to read the Internet for me

Thus, Twitling: a small script consisting of Ruby and Sinatra and OmniAuth and the Twitter gem and Typhoeus to grab links in parallel, the function of which is to read one’s timeline and display the resolved URL, the title and an excerpt from the text of each link that was posted. Source code is on Github.

I haven’t really used it myself yet in anger: the first thing I notice while testing it is that there are a whole lot more links in my feed than I thought there were, and the original plan to produce a 24 hour digest might become very unwieldy.

Possible further development ideas include

  • speed it up, by prefetching, better caching, or fetching the links asynchronously and client-side
  • an “older” link at the bottom of the page
  • Atom/RSS output so it gets fed to me every so often and I don’t have to remember to check it
  • email output (for the same reason)
  • some css gradient fills just to make it look modern (hey, I already used text-shadow, what do you want, round button borders?)
  • your suggestion here: email dan@telent.net or open an issue on Github. Bug reports too.

Try not to break it, please.

Syndicated 2012-02-01 12:54:57 from diary at Telent Netowrks

backbone.js 1 0 jQuery

I’ve spent a few hours over the last couple of days figuring out how to use backbone.js, and so far I’m impressed by it: it solves a real problem elegantly and doesn’t seem to have an entire religion bolted onto the side of it.

5 minute summary: it introduces models (and collections of them) and views to client-side javascript, and connects them with a publish/subscribe event notifier system so that when you make changes to a model all the views of it update without your having to remember to do anything to them.

A Model is an object that knows how to update itself from a Rails-y “REST” server (scare quotes, because as we all know these days REST isn’t what you think it is ), and publishes its attributes using the methods set and get.

	var m=find_me_a_model();
	var selected= (m.has('selected')) ? m.get('selected') : false;
	m.set({selected:  !selected});

Calling set will, if the value has changed, trigger a changed event handler to be called in all objects which have bound to it. These objects are usually Views.

A View is an object with a render method and an el attribute, and in which calling the former creates a piece of DOM tree in the latter, which you can then attach to your document somewhere

    initialize: function() {
    // ... this is not working code - I missed out some important bits ...
    events: {
	"click li" : "do_select",
    do_select: function(e) { ... },
    render: function() {
	var ul=$(this.el);
        return this;

jQuery(document).ready(function() {
     var myView=new MyApp.Views.ThingView();

Collections are provided too. They come with a large number of iteration functions (map, filter, reduce, all that stuff) which makes them really rather useful, and you can build Views of them in much the same way as you can build views of models.

(To complete the completion, there’s also a Router, which is an interface for monkeying around with the URL so you can build bookmarkable client-side apps. But I haven’t had to use that yet)

Anyway. As you see in the example above, the view can also take a hash of events which is registered with jQuery using its delegate method. In this case we’re asking to have do_select called whenever a click event is received on any li element inside it. Great!

Not so great when it unexpectedly doesn’t work, though. Specifically, jQuery drag/drop events don’t work with jQuery’s delegate method, and there’s nothing in the documentation on either page to stop you wasting an afternoon finding this out. Way to go. For more details on just how much hysterical raisins mess is involved with jQuery event handlers, see the pages for on and live – look upon these works, ye mighty, and despair.

backbone.js is here. There’s a Ruby gem for using it with Rails: add rails-backbone to your Gemfile, and you get a handy set of generators which write Coffeescript for you. (A brief inspection of the result says that this is a good thing because there’s no way on earth I’d want to write that stuff myself. But I concede, significant whitespace is a valid personal preference, just not one of mine.)

Syndicated 2012-01-29 22:04:06 from diary at Telent Netowrks

Making reload! work in Pry with Rails 3.2

As of Pry (the current version according to Bundler at the time I write this), the setup instructions for replacing irb with Pry when you run rails console no longer work fully in Rails 3.2. Specifically, the Rails team have changed the way they create irb commands like reload!: where in earlier version they were added to Object, now they are added to IRB::ExtendCommandBundle to avoid polluting the global namespace. Here’s the github pull request where the change is described.

(The cynic here will say “Rails? Namespace pollution? Lost cause, mate”, but hey, let’s not be down on attempts to make it better)

IRB already knows how to look in IRB::ExtendCommandBundle; Pry doesn’t, so what will happen if you have installed Pry in the usually recommended way by assigning IRB=Pry is you’ll get an error that Pry::ExtendCommandBundle doesn’t exist.

(I’ve seen ‘fixes’ for this bug that assign Pry::ExtendCommandBundle=Pry. This will make rail console start, but it still doesn’t make the commands accessible. Less than entirely useful then)

So, let’s make it. Here’s the relevant bit of my .pryrc: feel free to use as inspiration, but don’t blame me if copy/paste doesn’t work

if Kernel.const_defined?(“Rails”) then
  require File.join(Rails.root,“config”,“environment”)
  require ‘rails/console/app’
  require ‘rails/console/helpers’
  Pry::RailsCommands.instance_methods.each do |name| 
    Pry::Commands.command name.to_s do 

If you are using a newer version of Pry than me – well, first off, they may have fixed this already and if so you can ignore this whole post. But if they haven’t, and if Pry::Commands.command is giving you trouble, note that the unreleased Pry 0.9.8 is set to include a new way of defining custom commands and you may need to rewrite this using the new Pry::Commands.block_command construct instead.


Syndicated 2012-01-23 14:08:41 from diary at Telent Netowrks

Micro setup for minitest in rails

I think this is the bare minimum setup for being able to write Minitest::Spec tests in a Rails 3.1 app, and certainly a lot simpler than all that faffage with minitest-rails and stuff

  • add the line require 'minitest/spec' somewhere near the top of test/test_helper.rb
  • write tests that look something like this:
    require ‘test_helper’
    require ‘edition’
    describe Edition do
      it “ex nihilo nihil fit” do
  • we don’t create generators, but really, why do you need a generator to add one line of code? To disable the builtin Test::Unit generators – which you may as well because in this context they’re useless, add
    config.generators do |g|
      g.test_framework nil

inside YourApp::Application in config/application.rb

This is all pretty vanilla – it doesn’t do spork or any of the faster-testing-through-not-loading-the-framework stuff, but with those three simple steps you can run rake test:units just as you would with the default Test::Unit stuff. test:foo for other values of foo appears also to work, but I don’t have any integration tests in this project yet so don’t take my word for it.

Double Trouble

I can see no way in minitest to create a partial mock: viz. a real object with some but not all methods mocked. In fact I can see no documented way in minitest to do any kind of mocking at all. As the new fashionable Double Ruby (a.k.a. rr ) library scores highly on both these counts, I decided to use that too.

This is a simple matter of adding “rr” to the Gemfile, then amending test/test_helper.rb to include the lines

require ‘rr’
class MiniTest::Unit::TestCase
  include RR::Adapters::MiniTest

Syndicated 2012-01-21 18:27:16 from diary at Telent Netowrks

Objections on Rails

It seems to be pick-on-rails week/month/quarter again. I’m not sure that 3 months and two projects really should qualify me to have an opinion on it, but this is the internet, so here we go anyway.

Recent influences which have provoked these thoughts:

  • And Steve Klabnik’s blog in which he identifies the problem as ActiveRecord and/ or ActionController and/or ActionView and/or the whole concept of MVC. MVC. Webdevs, you keep using that word. I do not think it means what you think it means.
  • The veneer of delegating-domain-objects I created in my current $dayjob project, inspired by the writings above and about which this post was originally going to be until I realised just how much I was writing even explaining the problem. Look out for Part Two

I’m not going to lay into the V and C of Rails here: to be honest, although they seem rather unpretty (my controller’s instance variables appear magically in the view? ew) they’re perfectly up to the task as long as you don’t try to do the heavy lifting in either of those layers. Which is a bad idea anyway. And actually, after writing an entire site including the layouts in Erector (if you don’t know Erector, think Markaby), there is a certain pleasure in being able to write the mostly static HTML bits in … HTML.

No, instead of generalizing that MVC is the problem, I an going to confine myself to the ActiveRecord arena and generalize that M, specifically M built on ORM, is the problem. The two problems.

Here is the first problem. Object-orientated modelling is about defining the responsibilities (or behaviours, if you prefer) of the objects in your system. Object-relational modelling is about specifying their attributes. Except in the trivially simple cases (“an Elf is responsible for knowing what its name is”) the two are not the same: you define an Elf with a method which tells him to don his pointy shoes, not with direct access to his feet so you can do it yourself. So that’s the first problem: the objects you end up with when you design with ActiveRecord have accessors where their instance variables would be in a sensible universe.

(Compounding this they also have all the AR methods like #find wich in no way correspond to domain requirements, but really I think that’s a secondary issue: forcing you to inherit baggage from your ancestors is one thing, but this pattern actively encourages you to create more of it yourself. This is the kind of thing that drives Philip Larkin to poetry)

Here is the second problem. We’re wimping out on relations. For the benefit of readers who equate RDMBS with SQL with punishment visited on us by our forebears in 1960s mainframe data processing departments, I’m going to expound briefly on the nature of relational algebra: why it’s cool, and what the “object-relational impedance mismatch” really is. It’s not about the difference between String and VARCHAR

Digression: a brief guide to relational algebra

A relation is a collection of tuples. A tuple is a collection of named attributes.

(You can map these to SQL database terminology if you put your tuples in a grid with one column per attribute name. Then attribute=column, row=tuple, and relation=table. Approximately, at least)

An operation takes relation(s) as arguments and returns a relation as result. Operators are things like

  • select (a.k.a. restrict), which selects tuples from a relation according to some criteria and forms a new relation containing those selected. If you view the relation as a grid, this operation makes the grid shorter
  • project, which selects attributes from each tuple by name (make the grid narrower)
  • rename, which renames one or more attributes in each tuple (change the column titles)
  • set difference and set intersection
  • some kind of join: for example the cross join, which takes two relations A (m rows tall) and B (n rows tall) and returns an m*n row relation R in which for each row Ai in A there are n rows each consisting of all attributes in Ai plus all attributes in some row Bj in B. Usually followed by some kind of selection which picks out the rows where primary and foreign key values match, otherwise usually done accidentally.

Here’s an example to illustrate for SQL folk: when you write

select a,b,c from foo f join bar b on b.foo_id=f.id where a>1

this is mathematically a cross join of foo with bar, followed by a selection of the rows where b.foo_id=f.id, followed by a projection down to attributes a,b,c, followed by a selection of rows where a>1.

Now here’s the important bit:

the tuple isn’t in itself a representation of some real-world object: it’s an assertion that some object with the given attributes exists.

Why is this important? It makes a difference when we look at operations that throw away data. If Santa has a relation with rows representing two elves with the same name but different shoe sizes, and he projects this relation to remove shoe_size, he doesn’t say “oh shit, we can’t differentiate those two elves any more, how do we know which is which?”, because he doesn’t have records of two elves and has never had records of two elves – he has two assertions that at least one elf of that name exists. There might be one or two or n different elves with that name and we’ve thrown away the information that previously let us deduce there were at least two of them, but we haven’t broken our database – we’ve just deleted data from it. Relational systems fundamentally don’t and can’t have object identity, because they don’t contain objects. They record facts about objects that have an existence of their own. If you delete some of those facts your database is not screwed. You might be screwed, if you needed to know those facts, but your convention that a relation row uniquely identifies a real-world object is your convention, not the database’s rule.

(Aside: the relational algebra says we can’t have two identical rows: SQL says we can. I say it makes no difference either way because both rows represent the same truth and you have to violate the abstraction using internal row identifiers to differentiate between them)

Back in the room

The reason I’ve spent this expended all those words explaining the relational model instead of just saying “ActiveRecord has poor support for sticking arbitrary bits of SQL into the code” is to impress on you that it’s a beautiful, valuable, and legitimate way to look at the data. And that by imposing the requirement that the resulting relation has to be turned back into an object, we limit ourselves. Consider

  • As a present fulfillment agent, Santa wants a list of delivery postcodes so that he can put them in his satnav. Do you (a) select all the children and iterate over them, or (b) select distinct postcode from children where nice (he does the coal lumps in a separate pass)?
  • As a financial controller, Mrs Claus wants to know the total cost of presents in each of 2011, 2010 and 2009, broken down by year and by country of recipient, so that she can submit her tax returns on time.

We wave #select, #map and #inject around on our in-memory Ruby arrays like a Timelord looking for something to use his sonic screwdriver on. When it comes to doing the same thing for our persistent data: performing set operations on collections instead of iterating over them like some kind of VB programmer, why do we get a sense of shame from “going behind” the object layer and “dropping into” SQL? It’s not an efficiency hack, we’re using the relational model how it was intended.

And although we can do this in Rails (in fairness, it gets a lot easier now we have Arel and Sequel), I think we need a little bit of infrastructure support (for example, conventions for putting relations into views, or for adding presenters/decorators to them) to legitimise it.

Wrapping up

Summary: (1) our ORM-derived objects expose their internal state, and this is bad. (2) we don’t have good conventions for looking at our state except by bundling up small parcels of it and creating objects from them, and this is limiting us because sometimes we want to see a summary composed of parts of several objects. Summary of the summary: (1) exposing state is bad; (2) we can’t see all the state in the combinations we’d like.

Yes, I realise the apparent contradiction here, and no, I’m not sure how it resolves. I think there’s a distinction to be drawn between the parts of the sytem that allow mutation according to business requirements, and the “reporting” parts that just let us view information in different ways. I also think we’re putting behaviour in the wrong places, but that’s a topic for Part Three

If you have read all the way to the end, Part Two, “Objective in Rails”, will be a run through of my current progress (with code! honest!) for coping with with the first problem and parts of the second. Part Three will be a probably-quite-handwavey look at DCI and how it might provide a way of looking at things which makes both problems go away.

Syndicated 2012-01-05 18:01:28 from diary at Telent Netowrks

155 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!