Older blog entries for danstowell (starting at number 9)

9 Feb 2012 (updated 9 Feb 2012 at 12:41 UTC) »

What is a musical work? What is a performance of it?

Yesterday I went to a philosophy talk by Margaret Moore, on timbre and the ontology of music. I'd better say up front that I'm not a philosopher and I don't know the literature she was referring to. But I found it a frustrating talk - she was considering a position she calls "timbral sonicism" attributed to Julian Dodd, and asserting what she held to be problems with adding timbre (as well as pitch and duration) into the account of what a musical work can be, in terms of it being a normative description which a particular performance might or might not match.

I thought her argument had a couple of weird components in it: the dodgy assertion that there can never be a synthesiser whose sound was indistinguishable from that of a real instrument (unless the synth actually was functionally equivalent), and the requirement that a performance would have to match all dimensions of timbre (rather than just, say, the brightness dimension) before Dodd's inclusion of timbre as normative could make sense. But those problems are irrelevant for me because this "timbral sonicist" view is part of the "aesthetic empiricist" approach in which you have to claim that our evaluation of a music performance must only be done in terms of the sonic content of that performance. This is so clearly misguided that I don't see the point of talking about it: this is the main reason I was frustrated. Music performances are so many and varied, and many other criteria come into our assessments - not only assessments of whether it was a good performance, but more importantly of whether it was indeed a performance of a particular work. We judge based on our own background and cultural expectations, we judge based upon what we see, on what we believe (e.g. whether the performers are humans or holograms).

But there are some interesting things in this philosophical consideration of the ontology of music, and it led me to think, so let me address one issue in my own way (with an uninformed disregard for any literature on the topic!):

This question is one that was floating about: What is a musical work? and more pertinently How do we judge whether a particular performance is indeed an instantiation of a particular musical work?

For me there are two really important components to answer this:

  1. The concept of "a musical work" only has meaning in some musical traditions, e.g. Western classical or Western pop. In other traditions (e.g. free improv, raga, and I think gamelan) the abstract structures that give form to a musical act have different granularities, and are brought to bear in different combinations.

  2. As Moore said, a musical work can be described as an abstract "sound structure" or a "normative type". The latter is Moore's preferred, and I think she draws some difference between those two, though I can't be sure what the exact differences are. I think the idea of a musical work as a normative type is a useful one, and it reminds me strongly of the idea of an abstract class or abstract type in object-oriented programming: a composer might specify a particular series of notes, for example, and not bother to specify every note's timbre, or not bother to specify which instrument must be used, so we consider it an incomplete specification. The specification is fuzzy as well as incomplete: a composer might specify "getting faster" but not exactly how much.

So in my way of thinking, putting these two points together, a musical work is not special: other abstract things that can be instantiated in a performance (genres, cliches, keys) are the same kind of normative type, and they don't have to sit in a hierarchical relationship to each other. Musical works don't have special status in general, but are a bundle of normative constraints which have a particular granularity that we are used to in Western music.

To say a musical performance is an instance of a particular musical work, then, we check if the constraints are satisfied. We'd need to allow for errors (a few constraints not met, a few constraints sort-of-met) - our tolerance depends on our expectations (maybe we tolerate timbre deviations more readily than pitch deviations, in a particular tradition; maybe we tolerate wider deviations in a school band than a professional orchestra). Criteria should also depend on context in the form of the background corpus - are enough constraints met that we can positively say this is a performance of work A and not of another work B?
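
As a rough illustration of "checking whether the constraints are satisfied", here is a minimal Python sketch. Everything in it is a made-up assumption for illustration (the names `match_score` and `near`, the weights, the tolerances, the toy performances); it is not taken from Moore, Dodd or any MIR system. Each constraint is fuzzy (it returns a degree of satisfaction rather than yes/no), the weights stand in for how strictly a tradition treats each dimension, and anything the work doesn't mention is simply left free.

```python
# Hypothetical sketch: a "normative type" as a bundle of fuzzy, weighted
# constraints. All names and numbers here are invented for illustration.

def match_score(constraints, performance):
    """Weighted fraction of constraints satisfied, in the range 0..1.

    constraints: list of (weight, check) pairs; each check maps a performance
    (a plain dict) to a degree of satisfaction between 0.0 and 1.0.
    Dimensions that no constraint mentions are left unconstrained.
    """
    total = sum(weight for weight, _ in constraints)
    met = sum(weight * check(performance) for weight, check in constraints)
    return met / total


def near(key, target, tolerance):
    """Fuzzy constraint: 1.0 on target, falling linearly to 0.0 at +/- tolerance."""
    def check(performance):
        if key not in performance:
            return 0.0
        deviation = abs(performance[key] - target)
        return max(0.0, 1.0 - deviation / tolerance)
    return check


# A toy "work": pitch is tightly specified and heavily weighted, tempo loosely.
work_a = [
    (3.0, near("first_pitch_hz", 440.0, 10.0)),
    (1.0, near("tempo_bpm", 120.0, 40.0)),
]

# The instrument is not mentioned by the work, so it does not affect the score.
school_band = {"first_pitch_hz": 445.0, "tempo_bpm": 100.0, "instrument": "kazoo"}
other_piece = {"first_pitch_hz": 330.0, "tempo_bpm": 120.0}

score_band = match_score(work_a, school_band)
score_other = match_score(work_a, other_piece)
```

The slightly-off school band still scores higher against the work than the piece that only shares the tempo. Choosing the threshold at which we say "yes, that was work A" is exactly the context-dependent part that a sketch like this cannot settle.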

But again, to describe it as work A vs work B is only really relevant in the Western idea of a "musical work", in which the piece (e.g. the sequence of notes) is so tightly specified that it's generally only ever a realisation of one work. In other situations, a performer might simultaneously be performing two traditional Irish tunes, woven in and out of each other, and that's the way these tunes are expected to be treated: the result is not a bastardised new work but a simultaneous realisation of two known normative types.

I must also state explicitly that I don't believe for a second that such normative types must only ever include acoustic or psychoacoustic properties (which is the line Moore was sticking to in her talk - whether to criticise it from within, or whether she believes it, I don't know). In some traditions it may be explicit or implicit that a work can only be played on a piano and not on a synthesiser: that's a constraint about the means of production, not about the sound that is produced. Our choice of how strongly to attend to that part of the specification affects our judgment of whether a particular performance counts as an instantiation of a particular work. But there is no a priori way to know what balance of judgments is correct: constraints are always fuzzy (was that definitely a C#, or was it slightly flat?) and pretty much any normative description of musical structure is under-specified.

In this view, pitch, timbre, rhythm, duration, instrumentation, lyrics, and potentially other stuff such as the performer's clothing all have the same status: they are examples of things that in the Western tradition are specified to a greater or lesser extent at the level of a "musical work". (Note that there's not much limit to what might be specified: in raga, the time of day is specified, though that idea might be a surprise to many Western listeners.) And musical works have the same status as genres, cliches, motifs etc, as bundles of constraints which I hope fit Moore's term "normative types". These constraints are brought to bear in what a performer chooses to do in a given performance, and also brought to bear by observers in deciding if it really was "a good/faithful rendition of the piece" or "a trad jazz show".

So is there a use for this? I can't speak for the philosophers, but in Music Information Retrieval I'm reminded of the task of "cover song identification", i.e. determining automatically if a recording is an instantiation of a particular piece (which might be represented as score, or might be represented as a reference recording). All too often, this task is reduced depressingly quickly to the question of whether the melody or chord sequence matches sufficiently. This is an impoverished idea of the "cover song" and fails badly for many widespread genres - an obvious one is hip-hop, but also much club music.

If it were possible, I'd like to imagine a system which does something like "cover song identification" by identifying from a wide number of potential dimensions the specific constraints that a musical work represents, over and above the constraints of any assumed background such as genre or common corpus of known works. It would then use these constraints to identify matching instances. In order to do this usefully, it would need to identify enough constraints that distinguish a work from other candidate works, but would need to leave enough dimensions free (or loosely specified) to allow interpretative variation. What can be held fixed, and what can be allowed to vary, clearly depends on musical tradition, so the context for such an inference would need to be aware not just of a corpus of musical work but probably some cultural parameters that couldn't be inferred directly from audio, no matter how much audio is available.

Syndicated 2012-02-09 04:36:07 (Updated 2012-02-09 07:00:37) from Dan Stowell

1 Feb 2012 (updated 1 Feb 2012 at 11:54 UTC) »

New musical instrument design: ownership and hackability

We had an interesting conversation here yesterday about designing new musical instruments. We're interested in new instruments and interfaces, and there's quite a vogue for "user-centred design", "experience design" and the like. But Andrew McPherson pointed out this paper by Johan Redstrom with an interesting critique of this move, essentially describing it as "over-specifying" the user. If we focus too much on design for a particular modelled user experience, we run the risk of creating tools that are tailored for one use but aren't repurposable or don't lend themselves to whole "new" forms of musical expression.

The twentieth century alone is littered with examples of how it's only by repurposing existing technologies that new music technology practices come about. Here's a quick list:

  • The Hammond organ was meant to be used in churches as a cheap pipe-organ alternative, but it really took off when used in R&B, rock and so on.
  • The mixing desk is widely used as intended, of course, but it unexpectedly became a musical instrument in the hands of dub reggae people like King Tubby and Lee Scratch Perry.
  • The saxophone (I didn't know this) was apparently intended to have a consistent timbre over a wide pitch range - it wasn't intended for the throaty sounds we often recognise it for these days, and which earned it a firm position in jazz. (OOPS the sax was pre-20th century, my mistake - it doesn't strictly belong on this list.)
  • The vinyl turntable famously wasn't designed to be scratched, and we all know what happened with that in hip-hop and beyond.
  • The development of the electric guitar was clearly driven by the desire simply to make a normal guitar, but amplified. Hendrix and others of course took that as a starting point and went a long way from the acoustic sound.
  • The TB-303 was supposed to be a synth that sounded like a bass guitar. Turn its knobs to high-gain and you get those tearing filter sounds that made acid house. (Indeed it was discontinued before it got really famous, showing just how unexpected that was...)
  • The microphone led to a number of changes in vocal performance style (for example, it allowed vocalists to sing quietly to large audiences rather than belting). The most obvious repurposing is the sophisticated set of mic techniques that beatboxers use to recreate drum/bass/etc sounds.
  • 1980s home computers had simple sound-chips only capable of single sounds. But pioneers like Rob Hubbard broke through these constraints by inventing tricks like the "wobbly-chord", and created a rich genre of 8-bit (and 16-bit) music whose influence keeps spreading.
  • AutoTune was supposed to subtly make your voice sound more in-tune. But ever since the Cher effect, T-Pain et al, many vocalists push it to its limits for a deliberately noticeable effect.

The only successful twentieth-century musical instrument I can think of, that was successful through being used as the designer intended, is the Theremin! (Any others? Don't bother with recent things like the ReacTable or the Tenori-On, they're not widespread and might well be forgotten in a few years.)

So, given this rich history of unexpected repurposing (kinda reminiscent of the fact that you can't predict the impact of science) - if we are designing some new music interface/instrument, what can we do? Do we go back to designing intuitively and for ourselves, since all this user-centred stuff is likely to miss the point? Do we just try building and selling things, and seeing what takes off?


One important factor is hackability. There's quite a telling contrast (mentioned in the Redstrom paper) between the "consumer" record player and the "consumer" CD player - in the latter, the mechanisms are quite deliberately hidden away and all you have is a few buttons. The nature and size of vinyl makes that a bit difficult, so most record players have the mechanism exposed, and it's this exposed mechanism that got repurposed by scratch DJs.

(There are people doing weird things with CD players, and hacked CD players are relevant to the glitch aesthetic in digital music. But maybe if the mechanism was more exposed, more people would have come up with more and crazier things to do with them? Who can say.)

But it's not necessarily a good thing to expose all the mechanism. In digital technology this could end up leading to too-many-sliders and just poor usability.

(Another relevant paper on this topic: Thor Magnusson's "Affordances and constraints" paper, considering how users approach music technologies and their constraints.)

In a paper I wrote with Alex McLean (extended version coming soon, as a book chapter), we argue that the rich composability of grammatical interfaces (such as programming languages) is one way to enable this kind of unbounded hackability without killing usability. Programming languages might not seem like the best example of an approachable musical environment that musicians can fiddle around with, but the basic principle is there, and recent work is making engaging interfaces out of things that we might secretly call programming (e.g. Scratch or the ReacTable).


Another factor which is perhaps more subtle is ownership - people need to take ownership of a technology before they invest creative effort in taking it to new places. There was some interesting discussion around this but I personally haven't quite pinned this idea down, though it's obvious that it's important.

For inventors of instruments/interfaces this is quite a tricky factor. Often new interfaces are associated with their inventor, and the inventor generally likes this... Also it's rare that the instrument gets turned into a form (e.g. a simple commercial product) that people can easily take home, live with, take to gigs, etc etc, all without reference to the original inventor or the process of refining original designs etc.

I don't even think I've really pinpointed the ownership issue in this little description... but I think there is something to it.

Syndicated 2012-02-01 05:36:55 (Updated 2012-02-01 06:36:10) from Dan Stowell

Fish and chorizo stew with pasta

Fish and chorizo is a lovely combination and this stew with pasta shells was simple but rich. Serves 2-3, takes about 40 minutes.

  • 320g fish pie mix (mine had haddock, salmon, pollack)
  • 55g chorizo
  • 2 spring onions
  • Lemon zest (about 1/4 of a lemon's worth)
  • Handful parsley
  • Small handful conchiglie (pasta shells)

Chop the spring onions up. Keep the whiter bits separate from the green leafy bits. Also chop the chorizo into little bite-sized pieces.

In a deep pan that has a well-fitting lid, warm up some marge/oil and start the white bits of the spring onion frying gently. Once the chorizo is chopped up add that too.

Once the chorizo and spring onion have softened a bit nicely, add the fish pie mix to the pot and stir it around. Add the green bits from the spring onions, and the lemon zest, then enough boiling water to only just cover. Bring to the boil, put the lid on, turn the heat right down and let it simmer very gently for 25-35 minutes.

Halfway through the stew's bubbling time, get the pasta going. Half-cook it (parboil it) for 5 minutes in boiling water, then drain it and add it into the stew.

Just near the end, wash and chop up the parsley, add it into the pot, and stir everything around. Give it another minute or two for the parsley to get involved, then serve.

Syndicated 2012-01-28 08:59:17 from Dan Stowell

Learning prolog, eight queens

I'm following the "7 languages in 7 weeks" book. This week, PROLOG! However, I'm failing on this task: solve the eight queens puzzle in prolog. Why does this fail:

    queens(List) :-
            List = [Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8],

    valid([Head|Tail]) :-

    validone(One,[Head|[]]) :-
            pairok(One, Head).
    validone(One,[Head|Tail]) :-
            pairok(One, Head),
            validone(One, Tail).

    pairok((X1, Y1), (X2, Y2)) :-
            Range = [1,2,3,4,5,6,7,8],
            member(X1, Range),
            member(Y1, Range),
            member(X2, Range),
            member(Y2, Range),
            (X1 =\= X2),
            (Y1 =\= Y2),
            (X1+Y1 =\= X2+Y2),
            (X1-Y1 =\= X2-Y2).

I load it in gprolog using


then I ask it to find me the eight unknowns (A through to H) by executing this:


What it should do (I think) is suggest a set of values that the unknowns can take. What it does instead is say:


(which means it thinks there are no possible solutions.) Anyone spot my error?
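
For comparison, the check that `pairok` expresses can be sketched in Python. This is not a diagnosis of the Prolog above: the function names simply mirror the predicates, and the search strategy (one queen per column, with backtracking) is an assumption made for this sketch.

```python
# Python sketch of the same pairwise check, for comparison only.
# The names mirror the Prolog predicates; the search strategy is assumed.

def pairok(a, b):
    """True if queens at a=(x1, y1) and b=(x2, y2) do not attack each other."""
    (x1, y1), (x2, y2) = a, b
    return (x1 != x2 and y1 != y2
            and x1 + y1 != x2 + y2      # not on the same anti-diagonal
            and x1 - y1 != x2 - y2)     # not on the same diagonal


def queens(n=8, placed=()):
    """Return a list of n mutually non-attacking (x, y) pairs, or None."""
    x = len(placed) + 1                 # next column to fill, counting from 1
    if x > n:
        return list(placed)
    for y in range(1, n + 1):
        candidate = (x, y)
        if all(pairok(candidate, other) for other in placed):
            solution = queens(n, placed + (candidate,))
            if solution is not None:
                return solution
    return None                         # no row works in this column: backtrack


solution = queens()
```

Unlike the Prolog, this fixes each queen's column up front, so only the rows are searched.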

Syndicated 2012-01-19 17:57:30 from Dan Stowell

isobar python pattern library

One of the nicest things about the SuperCollider language is the Patterns library, which is a very elegant way of doing generative music and other stuff where you need to generate event-patterns.

Dan Jones made a kind of copy of the Patterns library but for Python, called "isobar", and I've been meaning to try it out. So here are some initial notes from me trying it for the first time - there may be more blog articles to come, this is just first impressions.

OK so here's one difference straight away: in SuperCollider a Pattern is not a thing that generates values, it's a thing that generates Streams, which then generate values. In isobar, it's not like that: you create a pattern such as a PSeq (e.g. one to yield a sequence of values 6, 8, 7, 9, ...) and immediately you can call .next on it to return the values. Fine, cutting out the middle-man, but I'm not sure what we're meant to do if we want to generate multiple similar streams of data all coming from the same "cookie cutter".

For example in SuperCollider:

    p = Pseq([4, 5, 6, 7]);
    q = p.asStream;
    r = p.asStream;
    r.next;  // outputs 4
    r.next;  // outputs 5
    q.next;  // outputs 4
    q.next;  // outputs 5

and in isobar it looks like we'd have to do:

    q = PSeq([4, 5, 6, 7])
    r = PSeq([4, 5, 6, 7])
    r.next()  # outputs 4
    r.next()  # outputs 5
    q.next()  # outputs 4
    q.next()  # outputs 5

Note how I have to instantiate two "parent" patterns. (I could have cached the list in a variable, of course.) It looks pointless with such a simple example, who cares which of the two we do. But I wonder if this will inhibit the pattern-composition fun in isobar, that you can do in SuperCollider by putting patterns in patterns in patterns... who can say. Will dabble.
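
To make the "cookie cutter" distinction concrete, here is a minimal plain-Python sketch of the SuperCollider arrangement, where the pattern is a recipe and each stream is an independent playback of it. `SeqPattern` and `as_stream` are hypothetical names invented here; they are not part of isobar.

```python
# Plain-Python sketch of the SuperCollider pattern/stream split.
# SeqPattern and as_stream are hypothetical names, not part of isobar.

class SeqPattern:
    """A recipe: it holds the values but no playback position."""

    def __init__(self, values):
        self.values = list(values)

    def as_stream(self):
        """Return a fresh, independent iterator over the values."""
        return iter(self.values)


p = SeqPattern([4, 5, 6, 7])
q = p.as_stream()
r = p.as_stream()
first_r = next(r)    # 4
second_r = next(r)   # 5
first_q = next(q)    # 4: q keeps its own position, unaffected by r
```

The point is only that the position lives in the stream, not in the pattern, so one pattern can feed any number of streams.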

The other thing that was missing is Pbind, the bit of magic that constructs SuperCollider's "Event"s (similar to Python "dict"s).

As a quick test of whether I understood Dan's code I added a PDict class. It seems to work:

    from isobar import *
    p = PDict({'parp': PSeq([4,5,6,7]), 'prep': PSeq(['a','b'])})

    p.next()   # outputs {'prep': 'a', 'parp': 4}
    p.next()   # outputs {'prep': 'b', 'parp': 5}
    p.next()   # outputs {'prep': 'a', 'parp': 6}
    p.next()   # outputs {'prep': 'b', 'parp': 7}
    p.next()   # outputs {'prep': 'a', 'parp': 4}

This should make things go further - as in SuperCollider, you should be able to use this to construct sequences with various parameters (pitch, filter cutoff, duration) all changing together, according to whatever patterns you give them.
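
For readers without isobar to hand, the behaviour shown above can be imitated in a few lines of standard-library Python. This is a sketch of the idea only, not the PDict code from the branch: `DictPattern` is a made-up name, and it assumes every entry is a plain sequence that repeats forever.

```python
from itertools import cycle

# Hypothetical stand-in for the PDict idea; not the actual isobar code.
class DictPattern:
    """Yield dicts whose values step through several repeating sequences."""

    def __init__(self, sources):
        # One endlessly repeating iterator per key.
        self.streams = {key: cycle(values) for key, values in sources.items()}

    def next(self):
        # Advance every stream by one step and bundle the results.
        return {key: next(stream) for key, stream in self.streams.items()}


p = DictPattern({'parp': [4, 5, 6, 7], 'prep': ['a', 'b']})
events = [p.next() for _ in range(5)]
```

Because the two sequences have different lengths, they drift against each other, which is where most of the fun in this kind of pattern comes from.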

There's loads of stuff not done; for example in SuperCollider there's Pkey() which lets you cross the beams - you can use the current value of 'prep' to decide the value of 'parp' by looking up its current value in the dict, whereas here I'm not sure if that's even going to be possible.
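
One way such a lookup might work, sketched in plain Python rather than isobar: let an entry be a function of the event built so far. `KeyedDictPattern` is a hypothetical name, the pitch numbers are arbitrary, and the sketch relies on dicts preserving insertion order (Python 3.7+), so 'prep' has to be listed before the entry that reads it.

```python
from itertools import cycle

# Hypothetical sketch of a Pkey-like lookup; not isobar or SuperCollider code.
class KeyedDictPattern:
    """Dict of repeating sequences, where an entry may instead be a function
    of the partially built event (entries are evaluated in insertion order)."""

    def __init__(self, sources):
        self.sources = {
            key: source if callable(source) else cycle(source)
            for key, source in sources.items()
        }

    def next(self):
        event = {}
        for key, source in self.sources.items():
            if callable(source):
                event[key] = source(event)   # may read earlier keys
            else:
                event[key] = next(source)
        return event


p = KeyedDictPattern({
    'prep': ['a', 'b'],
    'parp': lambda event: 60 if event['prep'] == 'a' else 67,
})
first = p.next()
second = p.next()
```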

Anyway my fork of Dan's code, specifically the branch with PDict added, is at:


Syndicated 2012-01-08 16:20:12 (Updated 2012-01-08 16:25:02) from Dan Stowell

Four Alls Inn, Higham

The Four Alls Inn, the pub in Higham where I grew up, has just re-opened with a bit more of a focus on food than before. So I thought it worth giving the food a bit of a write-up.

Photo of the Four Alls
Photo CC-BY-SA Neil Clifton

The photo above doesn't show it since the refurbishment but it does show the original sign that illustrates what the "Four Alls" actually means (see closeup here). The original sign is preserved of course for historic interest.

Inside, they've got some tasteful new upholstery and carpet, but of course it's still a fairly small place with about 7 tables in the main area (apparently it's sometimes been difficult to get a table booking - everyone's been trying it since the relaunch). They've still got decent local ale on tap (Moorhouse's Pride of Pendle, recommended) and an open fire.

Of course I had to try the black pudding starter. A single slice of black pudding but perfectly done and served with a poached egg and some delicious mustard mash. The mustard mash was excellent, and the poached egg was cooked just right (though it had cooled a bit by the time it got to me).

(My dad thought one slice of black pud wasn't enough, but in combination with the mustard mash and the egg I think it's the right balance. If there's one thing that a food place can do to disappoint me, it's cock up the black pudding starter! So I'm glad to report they've done a good job with it...)

For main course, I was definitely tempted by the butternut and ricotta ravioli but one of my sisters ordered that, so instead I had the steak and ale pie, and snaffled a taste of the ravioli. The pie was great, really tender meat; and the ravioli was also lovely - the pasta perhaps a little thick, and perhaps swimming in a bit much sauce, but the filling was very nicely flavoured, and overall my sis said it was lovely. My other sister had the cheese and onion pie and grandma had the chicken, both of which were apparently good.

For afters, the sticky toffee pudding was fine, as it should be; and the cheesecake was "alright" apparently (not very strongly flavoured - not always a bad thing IMHO, but then I didn't actually sample the cheesecake).

Everyone in this area knows that the Fence Gate just down the road has claimed a massive slice of the gastropub territory round here. (And justifiably so, it has some really good food.) So it's nice to report that the Four Alls has good food worth the mention. There's no reason that all pubs should be gastropubs, of course, but the Four Alls was having trouble staying open as it was, so it'd be good to see it develop in this slightly different direction. Since there's a whole new set of commuter-village houses being built next door to it, it seems like a canny move. Oh and just so you know, they've still got the pool table in the little room.

Syndicated 2011-12-28 12:34:35 (Updated 2011-12-28 12:41:22) from Dan Stowell

The Impact agenda, and public engagement

I was at a meeting recently, going through research proposal documents, and I realised that the previous government's "impact agenda" might be having an unintended effect on public engagement:

One of the things that has happened in research in the past few years is that the government now demands that we state what kind of "impact" our research will have. Now, the problem is that impact is notoriously and demonstrably unpredictable - we don't know if we're going to discover anything world-changing, until we actually try it, and even then we might not realise the impact for decades - but the previous government wanted to try and pin it down somehow.

So every proposal now (in the UK) has to have a two-page "Pathways to Impact" summary. If you're doing applied research it's pretty easy - you say things like "We're going to study the resilience of welded grommets under pressure, which means the grommet industry will produce more reliable grommets and there will be fewer grommet-related fatalities." If you're doing theoretical or basic research, in principle you still have a story to tell: you say something like "Our research will lead to a greater understanding of the number five, which is widely used in the natural sciences, industry and the financial sector. Future researchers will be able to build on these theoretical advances to develop new techniques for counting grommets or whatever."

So, in theory every research project has something they can say about this. (And they don't have to fill up the two pages, if they don't have much to say.) But that's not what happens.

Here's a very rough transcript of a conversation that went on in the meeting:

P: "Your proposal is good, Q, but there's not really anything about impact. The reviewers will have to rate you on impact so you need to say something here."

Q: "Oh blooming heck, but it's basic research, you can't really say what the impact is. I suppose I'll have to stick a schools talk in or something?"

R: "I know a couple of schools, I can arrange for you to do a talk, put that in."

Q: "Yeah OK."

Now I want to emphasise, this was not the end of the conversation. But I'm in favour of public engagement - perhaps a little more imagination is needed than just some generic schools talk, but it's interesting to see that this criterion is pushing people towards that little bit more public engagement.

Also: this is not a particularly unusual approach to filling in those impact pages. Impact is not supposed to be the tail that wags the dog, research excellence is supposed to be the number one criterion. But there are two whole pages which we have to use to say something about impact. And we know that the reviewers have got to read those pages, and rate us in terms of how strong or weak our pathways to impact are.

As I've said, impact is unpredictable. So what can you write, to make a reviewer say, "Yep, that's credible"? Your biggest impact might be to invent a whole new type of science, or to change the way we all think about the universe, but that won't happen for decades and it depends on a whole vague network of people taking your research and running with it. Can you talk about that? You could do, and that might be the truth about the likely impact of the research. But we know we'll get a bigger tick if we have something demonstrable that we can actually propose to do - even if it's not really connected with the research's biggest likely impact on society. A schools talk is a good thing to do, but is it the biggest impact your research will have on society in general? I hope not!

So, it happens quite often that people conflate public engagement with impact. A schools talk is not impact. An article in a newspaper is not impact. They might be tools that help spread research out of the university into the wider world, and they might facilitate impact, but they're not really the point of the hurdle that the government set for us.

Unfortunately, in science - unlike in politics - we formally review each other's work, and we can't hide behind woolly generalities. The strange thing is that regarding impact, the woolly generalities are the truth.

Syndicated 2011-11-22 13:59:00 (Updated 2011-11-22 14:06:07) from Dan Stowell

Baked leeks with lardons and feta

A lovely warming autumn dish. You'll need a casserole dish big enough that the leeks (chopped into a couple of pieces each) can all sit flat. Serves two as a main course (or 3--4 as a side).

  • 3 medium leeks
  • 80g lardons (or you can probably use any bacon, preferably thick and cut into cubes with scissors)
  • 1/2 pt (300ml) milk
  • small amount of feta cheese (60--100g?)
  • 40g plain flour
  • 40g margarine (or butter)

Preheat the oven to 180 degrees. Melt the marge, half each in two separate pans. One of them will be for making the white sauce. Put the lardons into the other one, on a low heat, just so they warm up and fry a tiny bit and flavour the marge.

Wash the leeks and prepare them for your casserole dish. Chop them into two or three pieces, as needed, and tile them into the bottom of the casserole dish so they form a single layer.

In the other pan, on a medium heat, start to make the white sauce. Sprinkle about half of the flour into the pan, and whisk it until smooth. The lardons in the other pan should have had a few minutes to warm up - turn the heat off for them, and pour the juices from the pan into the one where you're making the sauce. The idea is to get some of the bacony flavour into the sauce.

Put the lardons to one side. Put the rest of the flour into the sauce pan, and whisk again until smooth. Now continue to cook this "roux" for a couple of minutes, so the flour is cooked, then gradually add the milk (with whisking) and continue to cook for another couple of minutes.

Now assemble. You've already got the leeks in the bottom of the casserole dish; sprinkle the lardons over them, then gently pour the sauce evenly over. Finally crumble the feta on top. Cook in the preheated oven for about 40--45 minutes, until nicely browned on top. Serve with salad.

Syndicated 2011-11-13 08:07:57 from Dan Stowell

Roast pumpkin and aubergine spaghetti

This is a nice way to use pumpkin, a spicy and warming pumpkin pasta dish. These quantities serve 2; takes about 45 minutes in total, with some spare time in the middle.

  • 1/2 a pumpkin
  • 1 medium orange chilli
  • 1 tsp paprika
  • 1/2 tsp turmeric
  • Plenty of olive oil
  • 1/2 an aubergine
  • 2 tomatoes
  • Spaghetti

Put the oven on hot, about 210--220 C. Peel and deseed the pumpkin, and slice it into slices about 1/2 cm thick and 4 or 5 cm long - no need to be exact, but we want thinnish pieces. Chop the chilli up into rings too.

In a roasting tin, put a good glug of olive oil, then the pumpkin and chilli. Sprinkle over the paprika and turmeric, then toss to mix. Put this in the oven and let it roast for about 40 minutes, preparing the aubergine and pasta in the mean time.

The aubergine needs to be cut into pieces of similar size and shape to the pumpkin. The tomatoes, leave them whole but cut out the stalky bit. Halfway through the pumpkin's cooking time, add the aubergine, another glug of olive oil, toss briefly to mix, and sit the tomatoes in the middle somewhere, then put it all back in the oven.

Cook the spaghetti according to the packet instructions (e.g. boil for 15 minutes). Drain it, and get the other stuff out of the oven. In the pan that you used for the pasta (or a new pan), put the two roasted tomatoes and bash them with a serving spoon so they fall apart and become a nice lumpy paste. Add the pasta to them and mix. Then add the other roast vegetables, and mix all together, but gently this time so you don't mush the veg.

Serve with some parmesan perhaps.

Syndicated 2011-10-30 15:14:49 from Dan Stowell

ISMIR 2011: the year of bigness

I'm blogging from the ISMIR 2011 conference, about music information retrieval. One of the interesting trends is how a lot of people are focusing on how to scale things up, to handle millions of audio files (or users, or scores) rather than just hundreds or thousands. Why? Well, in real-world applications it's often important: big music services like Spotify and iTunes have about 15 million tracks, Facebook has millions of users, etc. In ISMIR one of the stars of the show is the Million Song Dataset, just released, which should help many many researchers to develop and test on a big scale. Here I'm going to note some of the talks/posters I've seen with interesting approaches to scalability:

Brian McFee described a simple tweak to the kd-tree data structure called "spill tree" which improves approximate search. Basically, when you split the data in two you allow some of the data points to spill over and fall on both sides. Simple but apparently effective.
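
Going only on the description above (not on the paper or any code of Brian's), a single spill-tree split might look something like this in Python. The function name `spill_split`, the choice of the median as the split point and the `overlap` parameter are all assumptions made for the sketch.

```python
# Hypothetical sketch of one spill-tree split, based only on the description
# above. Points are tuples of coordinates.

def spill_split(points, axis, overlap):
    """Split points at the median along `axis`. Points within `overlap` of
    the split value "spill" and are placed in both halves."""
    values = sorted(point[axis] for point in points)
    median = values[len(values) // 2]
    left = [p for p in points if p[axis] < median + overlap]
    right = [p for p in points if p[axis] >= median - overlap]
    return median, left, right


points = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0), (5.0, 3.0)]
median, left, right = spill_split(points, axis=0, overlap=1.0)
shared = set(left) & set(right)   # the points that fell on both sides
```

With `overlap=0.0` this reduces to an ordinary kd-tree split. The overlap costs some duplicated storage, but a query that lands near the boundary only needs to descend one side.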

Dominik Schnitzer introduced a nice way to smooth out a search space and reduce the problem of hub-ness. One way to do it could be to use a minimum spanning tree, for example, but that involves a whole-dataset analysis so it might not scale well. In Dominik's approach, for each data point X you find an estimate of what he calls "mutual proximity": randomly sample 100 data points from your dataset and measure their distance to X, then fit a gaussian to those distances. Then to find the "mutual proximity" between two data points X and Y, you just evaluate X's gaussian at Y's location to get a kind of "probability of being a near neighbour". He also makes this a symmetric measure by combining the X->Y measure with the Y->X measure, but I'd imagine you don't always need to do that, depending on your purpose. The end result is a distance measure that pretty much eliminates hubs.
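
Here is how I'd sketch that recipe in Python, with the caveat that it is reconstructed from the talk description and is not Dominik's code. In particular, reading "evaluate X's gaussian at Y's location" as the Gaussian tail probability (the chance that a random point lies farther from X than Y does), and combining the two directions by multiplying them, are assumptions made for this sketch. All function names are made up.

```python
import math
import random

# Hypothetical sketch of the "mutual proximity" recipe described above.

def distance(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def fit_gaussian(x, dataset, n_samples=100, rng=random):
    """Mean and standard deviation of distances from x to a random sample."""
    sample = rng.sample(dataset, min(n_samples, len(dataset)))
    dists = [distance(x, other) for other in sample]
    mean = sum(dists) / len(dists)
    variance = sum((d - mean) ** 2 for d in dists) / len(dists)
    return mean, math.sqrt(variance)


def prob_farther(d, mean, std):
    """P(a random point lies farther away than d) under the fitted Gaussian."""
    if std == 0.0:
        return 1.0 if d < mean else 0.0
    z = (d - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)          # equals 1 - CDF(d)


def mutual_proximity(x, y, dataset, rng=random):
    """Symmetric score in 0..1: high when y is unusually close to x AND
    x is unusually close to y (assumed combination: a simple product)."""
    d = distance(x, y)
    from_x = prob_farther(d, *fit_gaussian(x, dataset, rng=rng))
    from_y = prob_farther(d, *fit_gaussian(y, dataset, rng=rng))
    return from_x * from_y


# Toy 1-D dataset: fifty evenly spaced points.
dataset = [(float(i),) for i in range(50)]
near = mutual_proximity((10.0,), (11.0,), dataset)
far = mutual_proximity((10.0,), (40.0,), dataset)
```

Only a fixed-size sample is touched per point, which is what keeps this cheap on large collections.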

Shazam's music recognition algorithm, described in this 2006 paper, is one of the commercial success stories of scalable audio MIR. Sebastien Fenet tweaked it to be robust to pitch-shifting, essentially by using a log-frequency spectrogram and using delta-log-frequency rather than frequency in the fingerprints.
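
The reason delta-log-frequency helps can be shown with a toy example. This is only an illustration of the invariance, not Sebastien's algorithm or Shazam's: the `pair_hash` function, its quantisation to 12 bins per octave and the example peaks are all made up, and it assumes a pitch shift that multiplies every frequency by the same factor while leaving timing alone.

```python
import math

# Toy illustration of why a log-frequency difference survives pitch-shifting.
# Hypothetical hash; not the actual fingerprint from either system.

def pair_hash(peak_a, peak_b, bins_per_octave=12):
    """Hash a pair of spectrogram peaks, each given as (time_s, freq_hz).

    Uses the difference in log-frequency rather than the frequencies
    themselves, so scaling both frequencies by a common factor cancels out.
    """
    (t1, f1), (t2, f2) = peak_a, peak_b
    delta_log_f = round(bins_per_octave * math.log2(f2 / f1))
    delta_t = round(t2 - t1, 2)
    return (delta_log_f, delta_t)


original = [(0.10, 440.0), (0.35, 660.0)]
factor = 2 ** (3 / 12)                      # shift everything up 3 semitones
shifted = [(t, f * factor) for t, f in original]

hash_original = pair_hash(*original)
hash_shifted = pair_hash(*shifted)
```

A hash built from the raw frequencies would differ between the two versions, whereas this one does not.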

A small note from the presentation of the Million Song Dataset: apparently if you want a good online linear-predictor that is fast for large data, try out Vowpal Wabbit.

Also, Thierry mentioned that he was a fan of using Amazon's cloud storage/processing - if you store data with Amazon you can run MapReduce jobs over it easily, apparently. Mark Levy of last.fm is also a fan of MapReduce, having done a lot of work using Hadoop (the open-source implementation of MapReduce, much of it developed at Yahoo) for big data-crunching jobs.

Mikael Henaff presented a technique for learning a sparse spectrum-derived feature set, similar in spirit to KSVD. The thing I found interesting was how he then made a fast way of decomposing a new signal (once you've derived your feature basis from some training data). Ordinarily you'd have to do an optimisation - the dictionary is overcomplete so it can't be done as easily as an orthogonal transform. But you don't want to do that on a lot of data. Instead, he first trains a nonlinear projection which approximates that decomposition (it's a matrix rotation followed by a shrinkage nonlinearity, really simple mathematically). So you have to train that, but then when you want to analyse new data there's no optimisation needed, you just apply the simple transform.
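
The "simple transform" at analysis time can be sketched in a few lines of Python. This shows only the shape of the computation (one matrix multiply, then an elementwise shrinkage), under the assumption that the shrinkage is a soft threshold. The weights below are made up for illustration, whereas in the real method they would be learned so that the output approximates the optimised sparse code.

```python
# Sketch of a feed-forward sparse encoder: matrix multiply, then shrinkage.
# The weights and threshold are invented; in practice they are trained.

def shrink(value, threshold):
    """Soft threshold: pull value towards zero by `threshold`, clamping at 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def encode(signal, weights, threshold):
    """Approximate sparse code for `signal`, with no optimisation loop."""
    return [
        shrink(sum(w * s for w, s in zip(row, signal)), threshold)
        for row in weights
    ]


# Three output features from a two-dimensional input, i.e. overcomplete.
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
code = encode([2.0, 0.2], weights, threshold=0.5)
```

The shrinkage is what produces exact zeros in the output, so the code stays sparse even though nothing is being optimised per signal.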

There's been plenty of interesting stuff here at ISMIR that isn't about bigness, and it was good of Douglas Eck (of Google) to emphasise that there are still lots of interesting and important problems in MIR that don't need scalability and don't even benefit from it. But there are interesting developments in this area, hence this note.

Syndicated 2011-10-27 23:06:26 (Updated 2011-10-28 12:58:43) from Dan Stowell
