# Older blog entries for danstowell (starting at number 27)

13 Jul 2012 (updated 17 Jul 2012 at 12:11 UTC) »

A very simple toy problem for matching pursuits

To help me think about how and why matching pursuits fail, here's a very simple toy problem which defeats matching pursuit (MP) and orthogonal matching pursuit (OMP). [[NOTE: It doesn't defeat OMP actually - see comments.]]

We have a signal which is a sequence of eight numbers. It's very simple, it's four "on" and then four "off". The "on" elements are of value 0.5 and the "off" are of value 0; this means the L2 norm is 1, which is convenient.

```
signal = array([0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0])
```

Now, we have a dictionary of 8 different atoms, each of which is again a sequence of eight numbers, again having unit L2 norm. I'm deliberately constructing this dictionary to "outwit" the algorithms - not to show that there's anything wrong with the algorithms (because we know the problem in general is NP-hard), but just to think about what happens. Our dictionary consists of four up-then-down atoms wrapped round in the first half of the support, and four double-spikes:

```
dict = array([
[0.8, -0.6, 0, 0, 0, 0, 0, 0],
[0, 0.8, -0.6, 0, 0, 0, 0, 0],
[0, 0, 0.8, -0.6, 0, 0, 0, 0],
[-0.6, 0, 0, 0.8, 0, 0, 0, 0],
[sqrt(0.8), 0, 0, 0, sqrt(0.2), 0, 0, 0],
[0, sqrt(0.8), 0, 0, 0, sqrt(0.2), 0, 0],
[0, 0, sqrt(0.8), 0, 0, 0, sqrt(0.2), 0],
[0, 0, 0, sqrt(0.8), 0, 0, 0, sqrt(0.2)],
]).transpose()
```

BTW, I'm writing my examples as very simple Python code with NumPy (assuming you've run "from numpy import *"). We can check that the atoms are unit norm, by confirming we get a list of "1"s when we run:

```
sum(dict ** 2, 0)
```

So, now if you wanted to reconstruct the signal as a weighted sum of these eight atoms, it's a bit obvious that the second lot of atoms are unappealing because the sqrt(0.2) elements are sitting in a space that we want to be zero. The first lot of atoms, on the other hand, look quite handy. In fact an equal portion of each of those first four can be used to reconstruct the signal exactly:

```
sum(dict * [2.5, 2.5, 2.5, 2.5, 0, 0, 0, 0], 1)
```

That's the unique exact solution for the present problem: the dictionary here is a square matrix with linearly independent atoms, so there's no other way to reconstruct the signal exactly.
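A quick way to confirm this (my own sanity check, not part of the original post): since the dictionary is a square matrix with linearly independent columns, we can ask NumPy for the unique coefficient vector directly. I use the name `atoms` here rather than `dict`, to avoid shadowing the Python builtin:

```python
import numpy as np

signal = np.array([0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0])
atoms = np.array([
    [0.8, -0.6, 0, 0, 0, 0, 0, 0],
    [0, 0.8, -0.6, 0, 0, 0, 0, 0],
    [0, 0, 0.8, -0.6, 0, 0, 0, 0],
    [-0.6, 0, 0, 0.8, 0, 0, 0, 0],
    [np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2), 0, 0, 0],
    [0, np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2), 0, 0],
    [0, 0, np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2), 0],
    [0, 0, 0, np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2)],
]).transpose()

# The dictionary matrix is square and invertible, so the exact
# reconstruction is unique, and solve() recovers (up to floating
# point) the coefficients [2.5, 2.5, 2.5, 2.5, 0, 0, 0, 0].
coeffs = np.linalg.solve(atoms, signal)
```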

So now let's look at "greedy" matching pursuits, where a single atom is selected one at a time. The idea is that we select the most promising atom at each step, and the way of doing that is by taking the inner product between the signal (or the residual) and each of the atoms in turn. The one with the highest inner product is the one for which you can reduce the residual energy by the highest amount on this step, and therefore the hope is that it typically helps us toward the best solution.

What's the result on my toy data?

• For the first lot of atoms the inner product is (0.8 * 0.5) + (-0.6 * 0.5) which is of course 0.1.
• For the second lot of atoms the inner product is (sqrt(0.8) * 0.5) which is about 0.4472.

To continue with my Python notation you could run "sum(dict.T * signal, 1)". The result looks like this:

```
array([ 0.1,  0.1,  0.1,  0.1,  0.4472136,  0.4472136,  0.4472136,  0.4472136])
```

So the first atom chosen by MP or OMP is definitely going to be one of the evil atoms - more than four times better in terms of the dot-product. (The algorithm would resolve this tie-break situation by picking one of the winners at random or just using the first one in the list.)

What happens next depends on the algorithm. In MP you subtract (winningatom * winningdotproduct) from the signal, and this residual is what you work with on the next iteration. For my purposes here it's irrelevant: both MP and OMP are unable to throw away this evil atom once they've selected it, which is all I needed to show. There exist variants which are allowed to throw away dodgy candidates even after they've picked them (such as "cyclic OMP").
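To make the failure concrete, here's a minimal sketch of plain MP run on the toy data (my own illustration, not from the original post; again I use `dictionary` rather than `dict` to avoid shadowing the builtin):

```python
import numpy as np

signal = np.array([0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0])
dictionary = np.array([
    [0.8, -0.6, 0, 0, 0, 0, 0, 0],
    [0, 0.8, -0.6, 0, 0, 0, 0, 0],
    [0, 0, 0.8, -0.6, 0, 0, 0, 0],
    [-0.6, 0, 0, 0.8, 0, 0, 0, 0],
    [np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2), 0, 0, 0],
    [0, np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2), 0, 0],
    [0, 0, np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2), 0],
    [0, 0, 0, np.sqrt(0.8), 0, 0, 0, np.sqrt(0.2)],
]).transpose()

def matching_pursuit(signal, dictionary, n_iters):
    residual = signal.copy()
    chosen = []
    for _ in range(n_iters):
        products = dictionary.T.dot(residual)    # inner product with each atom
        best = int(np.argmax(np.abs(products)))  # greedy selection
        chosen.append(best)
        residual = residual - products[best] * dictionary[:, best]
    return chosen, residual

chosen, residual = matching_pursuit(signal, dictionary, 4)
# The first atom chosen is index 4, one of the "evil" double-spike atoms,
# and the residual never reaches zero - whereas atoms 0-3 alone would do it.
```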

NOTE: see the comments for an important proviso re MP.

Syndicated 2012-07-13 10:37:44 (Updated 2012-07-17 07:25:59) from Dan Stowell

13 Jul 2012 (updated 15 Jul 2012 at 11:14 UTC) »

A* orthogonal matching pursuit - or is it?

The delightful thing about the A* routing algorithm is that it is provably the optimal algorithm for the purpose, in the sense that it's the algorithm that visits the fewest possible path nodes given the information made available. See the original paper for proof. Despite its simplicity, it is apparently still used in a lot of industrial routing algorithms today, and it can be adapted to help solve other sorts of problem.
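For reference, here's a minimal sketch of classic A* search itself (my own illustration; the graph, costs and heuristic values are made up). With an admissible heuristic - one that never overestimates the remaining cost - popping nodes in order of f = g + h is guaranteed to return a shortest path:

```python
import heapq

def a_star(graph, h, start, goal):
    """Shortest path by A*: expand nodes in increasing order of f = g + h.

    graph: dict mapping node -> list of (neighbour, edge_cost)
    h: heuristic function; admissible means h(n) <= true cost from n to goal
    """
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was found already
        for nbr, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h(nbr), g2, nbr, path + [nbr]))
    return None, float("inf")

# Hypothetical toy graph: the direct A->C edge costs 4, going via B costs 2.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1)], "C": []}
h = {"A": 2, "B": 1, "C": 0}.get  # admissible: never overestimates
path, cost = a_star(graph, h, "A", "C")
# path == ["A", "B", "C"], cost == 2
```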

A colleague pointed out a paper about "A*OMP" - an algorithm that performs a kind of OMP (Orthogonal Matching Pursuit) with a tree search added to try out different paths towards fitting a good sparse representation. "Aha," I thought, "if they can use A* then they can get some nice recovery properties inherited from the A* search."

However, in reading the paper I find two issues with the A*OMP algorithm which make me reluctant to use the name "A*" for it:

1. The heuristics used are not "consistent" - indeed they're not even guaranteed to be "admissible" (always less-than-or-equal to the true cost remaining). This means the proof of A*'s optimality doesn't apply. (Remember, A*'s "optimality" is about the number of nodes inspected before finding the best path.) (EDIT: a colleague pointed out that it's actually worse than this - if the heuristic can overestimate then it's not just sub-optimal search, it may fail to inspect the best path.)
2. Since A*OMP restricts the number of paths it adds (to the top "B" atoms having largest inner product with the residual) there are no guarantees that it will even inspect the true basis.

These issues are independent of each other. If you leave out the pragmatic restriction on the number of search paths (to get round the second issue), the first issue still applies. OMP itself is greedy rather than exact, so this doesn't make A*OMP worse than OMP, but to my mind it's "not as good as A*".

In practice, the authors' A*OMP algorithm might indeed get good results, and the experiments shown seem to do so. So my quibbles may be mere quibbles. But the name "A*" led me to expect guarantees that just aren't there (e.g. guarantees of being better than OMP). It's quite easy to construct a toy problem for which A*OMP will not get you nearer the true solution than OMP will.

It's not obvious how to come up with a consistent heuristic. For a given problem, if we knew there was an exact solution (i.e. zero residual was possible within the sparsity constraints) then we could use the residual energy; but since we can't know that in general, the residual energy may overestimate the distance to be "travelled" to the goal.

One minor thing: their "equivalent path pruning" in section 4.2.3 is a bit overkill - I know a simpler way to avoid visiting duplicate paths. I'll leave that as an exercise for the reader :)

Syndicated 2012-07-13 06:38:06 (Updated 2012-07-15 06:50:08) from Dan Stowell

24 Jun 2012 (updated 24 Jun 2012 at 14:08 UTC) »

Rhubarb in the hole

I've been experimenting with rhubarb, looking at nice ways to cook it in the oven without stewing it first (so it keeps its shape). Here's a rather nice one: rhubarb in the hole.

These amounts serve 2. Should work fine if you double the amounts - you just need a roasting tin big enough to sit all the rhubarb in with plenty of space.

• 2 medium sticks rhubarb
• 5 or 6 heaped tbsp ginger preserve (ginger jam)
• 3 tsp caster sugar
• Oil or marge for greasing
• For the batter:
• 1 egg
• 1/2 cup (125 ml) milk
• 50 g (1/2 cup) plain flour
• Pinch of salt
• 3 tsp caster sugar

In a mixing jug, beat the egg then add the milk and beat again. Add the salt and 3 tsp sugar and beat again, then gradually whisk in the flour to make a medium-thick batter. (This is normal Yorkshire pudding batter but slightly sweetened.)

If you can, leave the batter to sit for about 30 mins, then whisk again.

Put the oven on at 200 C. Grease the roasting tin / Yorkshire-pudding tin and put it in the oven to preheat.

Rinse the rhubarb and cut it into 8cm (3in) pieces. On a chopping board or in a bowl, mix the rhubarb pieces with the ginger preserve to get them nice and evenly coated. Sprinkle the other 3 tsp sugar onto them.

Assembling the pudding can be a bit tricky - mainly because you want to work fast, so the tin stays hot. Get it out of the oven, then wobble it around to make sure the grease is evenly spread just before you pour in the batter. Then place the rhubarb pieces in (probably best to do it one-by-one) so they're evenly spread but they're all a good couple of centimetres away from the edge.

Immediately put this all back in the oven and bake for 20-25 minutes. The batter should rise and get a nice golden-brown crust.

You can serve it on its own, or with a small amount of ice cream maybe.

(I had expected it to be a bit dry on its own, but actually the fruit keeps it moist so the ice cream isn't compulsory.)

Syndicated 2012-06-24 09:07:36 (Updated 2012-06-24 09:30:22) from Dan Stowell

9 Jun 2012 (updated 22 Jun 2012 at 08:09 UTC) »

My research about time and sound

I've got three papers accepted in conferences this summer, and they're all different angles on the technical side of how we analyse audio with respect to time.

In our field, time is often treated in a fairly basic way, for reasons of tractability. A common assumption is the "Markov assumption" that the current state only depends on the immediate past - this is really handy because it means our calculations don't explode with all the considerations of what happened the day before yesterday. It's not a particularly hobbling assumption - for example, most speech recognition systems in use have the assumption in there, and they do OK.
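As a toy illustration (my own, with made-up numbers): under a first-order Markov assumption, the probability of a whole state sequence factorises into pairwise transition terms, so nothing earlier than the previous state ever needs to be consulted:

```python
import numpy as np

# Two states; trans[i, j] = P(next = j | current = i). Made-up numbers.
trans = np.array([[0.9, 0.1],
                  [0.3, 0.7]])
initial = np.array([0.5, 0.5])

def sequence_probability(states):
    """P(x_0, ..., x_T) under the Markov assumption: the initial
    probability times one transition probability per step."""
    p = initial[states[0]]
    for prev, cur in zip(states[:-1], states[1:]):
        p *= trans[prev, cur]  # depends only on the immediately preceding state
    return p

p = sequence_probability([0, 0, 1, 1])  # 0.5 * 0.9 * 0.1 * 0.7
```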

It's not obvious whether we "need" to build systems with complicated representations of time. There is some good research in the literature already which does, with promising results. And conversely, there are some good arguments that simple representations capture most of what's important.

Anyway, I've been trying to think about how our signal-processing systems can make intelligent use of the different timescales in sound, from the short-term to the long-term. Some of my first work on this is in these three conference talks I'm doing this summer, each on a different aspect:

1. At EUSIPCO I have a paper about a simple chirplet representation that can do better than standard spectral techniques at representing birdsong. Birdsong has lots of fast frequency modulation, yet the standard spectral approaches assume "local stationarity" - i.e. they assume that within a small-enough window, we can treat the signal as unchanging. My argument is that we're throwing away information at this point in the analysis chain, information that for birdsong at least is potentially very useful.

2. At MML I have a paper about tracking multiple pitched vibrato sounds, using a technique called the PHD filter which has already been used quite a bit in radar and video tracking. The main point is that when we're trying to track multiple objects over time (and we don't know how many objects), it's suboptimal to just take a model that deals with one object and apply the model multiple times. You benefit from using a technique that "knows" there may be multiple things. The PHD filter is one such technique, and it lets you model things with a linear-gaussian evolution over time. So I applied it to vibrato sources, which don't have a fixed pitch but oscillate around. It seems (in a synthetic experiment) the PHD filter handles them quite nicely, and is able to pull out the "base" pitches as well as the vibrato extent automatically. The theoretical elegance of the filter is very nice, although there are some practical limitations which I'll mention in my talk.

3. At CMMR I have a paper about estimating the arcs of expression in pianists' tempo modulations. The paper is with Elaine Chew, a new lecturer in our group who works a lot with classical piano performance. She has had students before working on the technical question of automatically identifying the "arcs" that we can see visually in expressive tempo modulation. I wanted to apply a Bayesian formulation to the problem, and I think it gives pretty nice results and a more intuitive way to specify the prior assumptions about scale.

So all of these are about machine learning applied to temporal evolution of sound, at different timescales. Hope to chat to some other researchers in this area over the summer!

Syndicated 2012-06-09 07:18:44 (Updated 2012-06-22 03:52:09) from Dan Stowell

Asparagus with garlic mayonnaise

Classic combination this, but when asparagus is in season it's a lovely light lunch to base around a handful of asparagus. Serves 2, takes 10 minutes, and NB it involves a raw egg so it's not for anyone pregnant.

• 1 bunch fresh asparagus, rinsed
• 2 slices of bread
• 1 egg
• About 1/2 tsp garlic puree
• About 1 tbsp white wine vinegar
• About 1/2 tsp mustard powder
• Olive oil
• Small amount of pecorino or parmesan cheese

Put a griddle pan on a high heat to warm up. Put the bread in the toaster on a light setting. (The plan is to ignore when it pops up and leave it in the toaster while preparing the other stuff, which gets the bread a bit dry and crispy which complements the other stuff well.) Separate the egg, putting the yolk in a small mixing bowl, and put the white in the fridge/freezer (we don't need it). Shave/slice the cheese finely.

Put the asparagus in the hot griddle pan (lying perpendicular to the lines of the pan). Sprinkle a small amount of oil onto the asparagus. Let it cook for 5--7 minutes or so, turning halfway through, and meanwhile make the mayonnaise as follows.

Add the mustard powder, vinegar and garlic to the yolk, and mix it all in with a whisk. Add a tiny drop of olive oil and whisk that in until it's well blended. Add another bit of olive oil and whisk it in again. Keep doing this until you've added about 2 tbsp oil (after the first couple of drops, you can start adding slightly larger amounts at a time). Taste it, and adjust the flavour if wanted. Keep whisking and adding a dab more oil until it reaches a slightly gelatinous mayonnaisey consistency.

When the asparagus is almost ready, push the toaster lever down again and let it warm the bread up for another 20 seconds or so. Then slice the toast in halves, and arrange each plate with toast and asparagus, sprinkling the cheese over the asparagus and then adding a blob-or-drizzle of mayonnaise over.

Syndicated 2012-06-03 05:56:08 (Updated 2012-06-03 05:58:43) from Dan Stowell

Rustic green lentils

A simple lentil supper that happens to have a nice variegation of good strong flavours. Serves two, takes 40 minutes.

• 100g green lentils
• 3 tbsp olive oil
• 1 medium onion (chopped)
• 1 tsp cumin seeds
• 4 garlic cloves (chopped)
• 1 carrot (peeled and chopped in bitesize chunks)
• 1 pinch asafoetida
• 1 handful fresh thyme
• 3 mushrooms
• Small handful beans e.g. aduki beans (optional)
• 1 handful fresh parsley

Put the lentils in a sieve and rinse them under running water.

Heat up 1 tbsp olive oil in a pan and start the onion and the cumin frying. After a couple of minutes, add the garlic. After another minute or two (maybe you're taking this time to chop the carrot) add the carrot and the asafoetida. Stir around and allow to fry a bit more. Overall, the onion etc should have been frying for 5 minutes or so before the next step.

Add enough boiling water to just cover what's in the pan, and let it boil strongly for a couple of minutes. Then turn the heat down to a simmer, bung the thyme in, and put a lid on.

This is now going to simmer for almost 30 minutes but halfway through we'll add the mushrooms. So about halfway through, heat up a frying pan with 1 tbsp olive oil in, peel and slice the mushrooms, and fry them quickly for a minute or so - no need to overdo it. Then bung the mushrooms in with everything else, and also the beans if you have any. Now is a good time to wash and chop up the parsley.

When the food is almost ready to serve (not much water left in the pan, and the lentils are almost soft, but still with a bit of a bite), add the parsley and 1 tbsp olive oil and stir through. Then serve.

Syndicated 2012-05-29 15:18:24 (Updated 2012-05-29 15:35:51) from Dan Stowell

28 May 2012 (updated 28 May 2012 at 20:09 UTC) »

I'm concerned about the LRAD which is apparently being deployed during the Olympics. We citizens don't seem to have a way of knowing if it will be used for backup, general crowd control, or long-term hearing damage. So I wrote to my MP; if you live in London and care about your hearing, maybe write to yours. Here's what I wrote:

Dear [__________],

I'm an audio researcher living in [__________]. Because of my profession, my hearing is particularly important to me. So I'm a little concerned about the Long-Range Acoustic Device (LRAD) that the MoD has confirmed will be in use during the Olympics, and I'd like to ask you to seek some specific assurance please.

I'm aware that the device has two uses: one is to broadcast messages loudly, and one is as a non-lethal weapon which can damage hearing. The MoD has said the device is "primarily to be used in the loud hailer mode". I'm sure you can imagine that the use of the term "primarily" is a concern to me.

In a given situation, there is no way for anyone except the operator to discern which mode is being used or intended. If I am commuting through London during the Olympics and get caught up in something (a crowd situation of some sort), and an officer points the LRAD at me, I have no way of being reassured that there is no chance of permanent damage. This is why I'd like to ask you to seek very specific assurances.

Could you please press the Ministry of Defence to state specifically:

• Is the LRAD intended to be deployed frequently, or in exceptional circumstances?
• Under what conditions would the LRAD be used with settings that might endanger hearing?
• What rank of officer would be required to authorise the use of LRAD with settings that might endanger hearing?
• What safeguards are in place to ensure that the LRAD will not endanger hearing, whether during correct use or operator error?

I hope you can help put these specific questions forward.

Best wishes, [__________]

Syndicated 2012-05-28 14:59:52 (Updated 2012-05-28 15:12:45) from Dan Stowell

Notes on how we ran the SuperCollider Symposium 2012

I've just uploaded my notes on how we ran the SuperCollider Symposium 2012 (10-page PDF). sc2012 was a great event and it was a privilege to work with so many great people in putting it together. I hope these notes are useful to future organisers, providing some detailed facts and figures to give you some idea of how we did it.

The document includes details of the timing of things, the budgeting, and promotional aspects. I also include some notes about outreach, which I think is important to keep in mind. It's important for community-driven projects to bring existing people together, and to attract new people - and for something like SuperCollider which doesn't have any institution funding it and pushing it forwards, these international gatherings of users are 100% vital, both for existing users and for attracting new ones. Happily, both of these aims can be achieved by putting on diverse shows featuring some of the best SuperCollider artists in the world :)

Shout outs to all the other organisers, who put loads of their own time and enthusiasm in (see the "credits" page), and hi to everyone else I met at the symposium.

Syndicated 2012-05-02 15:43:54 (Updated 2012-05-02 15:44:09) from Dan Stowell

Why power to the people

What should you strive for?

• Equal spread of power among all people.

Why? Three reasons, of which the third is the most important:

1. Morality: Equal power per person is fair.
2. Efficiency: Equal power is the most efficient way to make use of our combined human capacities.
3. Instability: Power begets power, which means that it tends to "clump" - equal spread of power is not a stable state. Thus we have to continually work towards it, rather than achieve it and then relax.

Syndicated 2012-05-01 09:05:50 from Dan Stowell

How I made a nice map handout from OpenStreetMap

OpenStreetMap is a nice community-edited map of everything - and you can grab their data at any time. So in theory it should be the ideal thing to choose when you want to make a little map for an open-source conference or something like that.

For our event this year I made these nice map handouts. It took a while! Quite tricky for a first-timer. But they're nice pretty vector PDF maps, with my own custom fonts, colour choices etc.

For anyone who fancies having a go, here's what I did:

1. I followed the TileMill "30 minute tutorial" to install and set up TileMill on my Ubuntu laptop. It takes longer than 30 minutes - it's still a little bit tricky and there's a bit of a wait while it downloads a lump of data too.
2. I started a new map project based on the example. I wanted to tweak it a bit - they use a CSS-like stylesheet language ("MSS") to specify what maps are supposed to look like, and it's nice that you can edit the stylesheets and see the changes immediately. However, I found it tricky to work out what to edit to have the effect I wanted. Here's what I managed to do:

• I changed the font choice to match the visual style of our website. That bit is easy - find where there are some fonts specified, and put your preferred font at the FRONT of all the lists.
• I wanted to direct people to specific buildings, but the default style doesn't show building names. However, I noticed that it does show names for cemeteries... in labels.mss on line 306 there was

```
    #area_label[type='cemetery'][zoom>=10] {
```

and I can add buildings to that:

```
    #area_label[type='building'][zoom>=10],
    #area_label[type='cemetery'][zoom>=10] {
```
• The underground train line was being painted on top of the buildings, which looks confusing and silly. To fix this I had to rearrange the layers listed in the Layers panel - drag the "buildings" layer higher up the list, above the "roads" ones.

3. When I'd got the map looking pretty good, I exported it as an SVG image.
4. Then I quit TileMill and started up Inkscape, a really nice vector graphics program, and loaded the SVG that I'd saved in the previous step.
5. I edited the image to highlight specific items:
• The neatest way to do this is to select all and put it all into a layer, then select the items you want to highlight and move them to a new layer above. Once they're in a separate layer, it's easier to use Inkscape's selection tools to select all these items and perform tweaks like thickening the line-style or darkening the fill colour.
• Selecting a "word" on the map is not so easy because each letter is a separate "object", and so is the shadow underneath. If there's a single word or street-name you're working on, it's handy to select all the letters and group them into a group (Ctrl+G), so you can treat them as a single unit.
• You can also add extra annotations of your own, of course. I had to add tube-station icons manually, cos I couldn't find any way of getting TileMill to show those "point-of-interest"-type icons. I think there's supposed to be a way to do it, but I couldn't work it out.
6. The next job is to clip the map image - the map includes various objects trailing off to different distances, it's not a neat rectangle. In Inkscape you can do a neat clipping like this:
• Select all the map objects. If you've been doing as I described you'll need to use "Select all in all layers" (Ctrl+Shift+A).
• Group them together (Ctrl+G).
• Now use the rectangle tool to draw a rectangle which matches the clipping area you want to use.
• Select the two items - the rectangle and the map-item-group - then right-click and choose "Set clip". Inkscape unites the two objects, using the rectangle to create a clipped version of the other.
7. Now with your neatly-cropped rectangle map, you can draw things round the outside (e.g. put a title on).
8. If you ever need to edit inside the map, Inkscape has an option for that - right-click and choose "Enter group" and you go inside the group, where you can edit things without disturbing the neat clipping etc.
9. Once you're finished, you can export the final image as a PDF or suchlike.

Syndicated 2012-04-22 10:11:21 (Updated 2012-04-22 10:25:33) from Dan Stowell

18 older entries...