Recent blog entries

1 Aug 2014 rlougher   » (Master)

JamVM 2.0.0 Released

I'm pleased to announce a new release of JamVM.  JamVM 2.0.0 is the first release of JamVM with support for OpenJDK (in addition to GNU Classpath). Although IcedTea already includes JamVM with OpenJDK support, this has been based on periodic snapshots of the development tree.

JamVM 2.0.0 supports OpenJDK 6, 7 and 8 (the latest). With OpenJDK 7 and 8 this includes full support for JSR 292 (invokedynamic). JamVM 2.0.0 with OpenJDK 8 also includes full support for Lambda expressions (JSR 335), type annotations (JSR 308) and method parameter reflection.

In addition to OpenJDK support, JamVM 2.0.0 also includes many bug-fixes, performance improvements and improved compatibility (from running the OpenJDK jtreg tests).

The full release notes can be found here (changes are categorised into those affecting OpenJDK, GNU Classpath and both), and the release package can be downloaded from the file area.

Syndicated 2014-08-01 00:46:00 (Updated 2014-08-01 00:46:05) from Robert Lougher

1 Aug 2014 apenwarr   » (Master)

Wifi: "beamforming" only begins to describe it

[Note to the impatient: to try out my beamforming simulation, which produced the above image, visit my beamlab test page - ideally in a browser with very fast javascript, like Chrome. You can also view the source.]

I promised you some cheating of Shannon's Law. Of course, as with most things in physics, even the cheating isn't really cheating; you just adjust your model until the cheating falls within the rules.

The types of "cheating" that occur in wifi can be briefly categorized as antenna directionality, beamforming, and MIMO. (People usually think about MIMO before beamforming, but I find the latter to be easier to understand, mathematically, so I switched the order.)

Antenna Directionality and the Signal to Noise Ratio

Previously, we discussed the signal-to-noise ratio (SNR) in some detail, and how Shannon's Law can tell you how fast you can transfer data through a channel with a given SNR.

The thing to notice about SNR is that you can increase it by increasing amplification at the sender (where the background noise is fixed but you have a clear copy of the signal) but not at the receiver. Once a receiver has a copy of the signal, it already has noise in it, so when you amplify the received signal, you just amplify the noise by the same amount, and the SNR stays constant.
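To put numbers on that, here is a minimal sketch. The power levels and the 10x amplifier are made-up illustrative values, and `snr_db` is just a helper name chosen for this example.

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels, given linear power values."""
    return 10 * math.log10(signal_power / noise_power)

# Illustrative linear power levels as seen at the receiver (made-up values).
signal, noise = 4.0, 1.0
gain = 10.0  # a 10x power amplifier

baseline = snr_db(signal, noise)
# Amplifying at the receiver scales the signal and the noise together.
rx_amplified = snr_db(gain * signal, gain * noise)
# Amplifying at the sender scales only the signal; the noise floor is unchanged.
tx_amplified = snr_db(gain * signal, noise)

print(f"baseline:     {baseline:.2f} dB")
print(f"rx amplifier: {rx_amplified:.2f} dB")
print(f"tx amplifier: {tx_amplified:.2f} dB")
```

The receiver-side amplifier leaves the ratio where it started, while the sender-side one raises it by the full 10 dB.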

(By the way, that's why those "amplified" routers you can buy don't do much good. They amplify the transmitted signal, but amplifying the received signal doesn't help in the other direction. The maximum range is still limited by the transmit power on your puny unamplified phone or laptop.)

On the other hand, one thing that *does* help is making your antenna more "directional." The technical term for this is "antenna gain," but I don't like that name, because it makes it sound like your antenna amplifies the signal somehow for free. That's not the case. Antenna gain doesn't so much amplify the signal as ignore some of the noise. Which has the same net effect on the SNR, but the mechanics of it are kind of important.

You can think of an antenna as a "scoop" that picks up both the signal and the noise from a defined volume of space surrounding the antenna. The shape of that volume is important. An ideal "isotropic" antenna (my favourite kind, although unfortunately it doesn't exist) picks up the signal equally in all directions, which means the region it "scoops" is spherical.

In general we assume that background noise is distributed evenly through space, which is not exactly true, but is close enough for most purposes. Thus, the bigger the volume of your scoop, the more noise you scoop up along with it. To stretch our non-mathematical metaphor well beyond its breaking point, a "bigger sphere" will contain more signal as well as more noise, so just expanding the size of your region doesn't affect the SNR. That's why, and I'm very sorry about this, a bigger antenna actually doesn't improve your reception at all.

(There's another concept called "antenna efficiency" which basically says you can adjust your scoop to resonate at a particular frequency, rejecting noise outside that frequency. That definitely works - but all antennas are already designed for this. That's why you get different antennas for different frequency ranges. Nowadays, the only thing you can do by changing your antenna size is to screw up the efficiency. You won't be improving it any further. So let's ignore antenna efficiency. You need a good quality antenna, but there is not really such a thing as a "better" quality antenna these days, at least for wifi.)

So ok, a bigger scoop doesn't help. But what can help is changing the shape of the scoop. Imagine if, instead of a sphere, we scoop up only the signal from a half-sphere.

If that half-sphere is in the direction of the transmitter - which is important! - then you'll still receive all the same signal you did before. But, intuitively, you'll only get half the noise, because you're ignoring the noise coming in from the other direction. On the other hand, if the half-sphere is pointed away from the incoming signal, you won't hear any of the signal at all, and you're out of luck. Such a half-sphere would have 2x the signal to noise ratio of a sphere, and in decibels, 2x is about 3dB. So this kind of (also not really existing) antenna is called 3dBi, where dBi is "decibels better than isotropic" so an isotropic (spherical) receiver is defined as 0dBi.

Taking it a step further, you could take a quarter of a sphere, ie. a region 90 degrees wide in any direction, and point it at your transmitter. That would double the SNR again, thus making a 6dBi antenna.
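The dBi arithmetic can be sketched in a few lines, assuming the idealised (and not really existing) scoop that hears everything inside its fraction of the sphere and nothing outside it. The function names are chosen for this example only.

```python
import math

def ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(ratio)

def ideal_gain_dbi(sphere_fraction):
    """Gain over isotropic for an idealised scoop covering a fraction of a sphere.

    Same signal, but only `sphere_fraction` of the noise, so the SNR
    improves by a factor of 1 / sphere_fraction.
    """
    return ratio_to_db(1 / sphere_fraction)

for fraction in (1.0, 0.5, 0.25):
    print(f"{fraction:>4} of a sphere -> {ideal_gain_dbi(fraction):.2f} dBi")
```

Each halving of the scoop adds about 3 dB, which is where the 3dBi and 6dBi figures come from.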

Real antennas don't pick up signals in a perfectly spherical shape; math makes that hard. So real ones tend to produce a kind of weirdly-shaped scoop with little roundy bits sticking out all over, and one roundy bit larger than the others, in the direction of interest. Essentially, the size of the biggest roundy bit defines the antenna gain (dBi). For regulatory purposes, the FCC mostly assumes you will use a 6dBi antenna, although of course the 6dBi will not be in the shape of a perfect quarter sphere.

Now is that the most hand-wavy explanation of antenna gain you have ever seen? Good.

Anyway, the lesson from all this is if you use a directional antenna, you can get improved SNR. A typical improvement is around 6dB (from a 6dBi antenna), which is pretty good. But the down side of a directional antenna is you have to aim it. With a wifi router, that can be bad news. It's great for outdoors if you're going to set up a long-distance link and aim very carefully; you can get really long range with a very highly directional antenna. But indoors, where distances are short and people move around a lot, it can be trouble.

One simple thing that does work indoors is hanging your wifi router from the ceiling. Then, if you picture eg. a quarter-sphere pointing downwards, you can imagine covering the whole room without really sacrificing anything (other than upstairs coverage, which you don't care about if you put a router up there too). Basically, that's as good as you can do, which is why most "enterprise" wifi deployments hang their routers from the ceiling. If you did that at home too - and had the right kind of antennas with the right gain in the right direction - you could get up to 6dB of improvement on your wifi signal, which is pretty great.

(Another trick some routers do is to have multiple antennas, each one pointing in a different direction, and then switch them on and off to pick the one(s) with the highest SNR for each client. This works okay but it interferes with MIMO - where you want to actively use as many antennas as possible - so it's less common nowadays. It was a big deal in the days of 802.11g, where that was the main reason to have multiple antennas at all. Let's talk about MIMO later, since MIMO is its own brand of fun.)

Beamforming

So okay, that was antenna directionality (gain). To summarize all that blathering: you point your antenna in a particular direction, and you get a better SNR in that particular direction.

But the problem is that wifi clients move around, so antennas permanently pointed in a particular direction are going to make things worse about as often as they help (except for simple cases like hanging from the ceiling, and even that leaves out the people upstairs).

But wouldn't it be cool if, using software plus magic, you could use multiple antennas to create "virtual directionality" and re-aim a "beam" automatically as the client device moves around?

Yes, that would be cool.

Unfortunately, that's not what beamforming actually is.

Calling it "beamforming" is not a *terrible* analogy, but the reality of the signal shape is vastly more complex and calling it a "beam" is pretty misleading.

This is where we finally talk about what I mentioned last time, where two destructively-interfering signals result in zero signal. Where does the power go?

As an overview, let's say you have two unrelated transmitters sending out a signal at the same frequency from different locations. It takes a fixed amount of power for each transmitter to send its signal. At some points in space, the signals interfere destructively, so there's no signal at that point. Where does it go? There's a conservation of energy problem after all; the transmitted power has to equal the power delivered, via electromagnetic waves, out there in the wild. Does it mean the transmitter is suddenly unable to deliver that much power in the first place? Is it like friction, where the energy gets converted into heat?

Well, it's not heat, because heat is vibration, ie, the motion of physical particles with mass. The electromagnetic waves we're talking about don't necessarily have any relationship with mass; they might be traveling through a vacuum where there is no mass, but destructive interference can still happen.

Okay, maybe the energy is re-emitted as radiation? Well, no. The waves in the first place were radiation. If we re-emitted them as radiation, then by definition, they weren't cancelled out. But we know they were cancelled out; you can measure it and see.

The short and not-very-satisfying answer is that in terms of conservation of energy, things work out okay. There are always areas where the waves interfere constructively that exactly cancel out the areas where they interfere destructively.

The reason I find that answer unsatisfying is that the different regions don't really interact. It's not like energy is being pushed, somehow, between the destructive areas and the constructive areas. It adds up in the end, because it has to, but that doesn't explain *how* it happens.

The best explanation I've found relates to quantum mechanics, in a lecture I read by Richard Feynman at some point. The idea is that light (and all electromagnetic waves, which is what we're talking about) actually does not really travel in straight lines. The idea that light travels in a straight line is just an illusion caused by large-scale constructive and destructive interference. Basically, you can think of light as travelling along all the possible paths - even silly paths that involve backtracking and spirals - from point A to point B. The thing is, however, that for almost every path, there is an equal and opposite path that cancels it out. The only exception is the shortest path - a straight line - of which there is only one. Since there's only one, there can't be an equal but opposite version. So as far as we're concerned, light travels in a straight line.

(I offer my apologies to every physicist everywhere for the poor quality of that explanation.)

But there are a few weird experiments you can do (look up the "double slit experiment" for example) to prove that in fact, the "straight line" model is the wrong one, and the "it takes all the possible paths" model is actually more like what's really going on.

So that's what happens here too. When we create patterns of constructive and destructive radio interference, we are simply disrupting the rule of thumb that light travels in a straight line.

Oh, is that all? Okay. Let's call it... beam-un-forming.

There's one last detail we have to note in order to make it all work out. The FCC says that if we transmit from two antennas, we have to cut the power from each antenna in half, so the total output is unchanged. If we do that, naively it might seem like the constructive interference effect is useless. When the waves destructively interfere, you still get zero, but when they constructively interfere, you get 2*½*cos(ωt), which is just the original signal. Might as well just use one antenna with the original transmit power, right?

Not exactly. Until now, I have skipped over talking about signal power vs amplitude, since it hasn't been that important so far. The FCC regulates *power*, not amplitude. The power of A*cos(ωt) turns out to be ½A². I won't go over all the math, but the energy of f(x) during a given period is defined as ∫ f²(x) dx over that period. Power is energy divided by time. It turns out (via trig identities again) the power of cos(x) is 0.5, and the rest flows from there.
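For anyone who would rather check the power formula numerically than trust the trig identities, here is a small sketch. It approximates the energy integral over one full period with a midpoint Riemann sum; the function name and sample count are arbitrary choices for this example.

```python
import math

def average_power(amplitude, samples=100_000):
    """Mean of (A*cos(x))**2 over one full period, via a midpoint Riemann sum."""
    step = 2 * math.pi / samples
    energy = sum((amplitude * math.cos((k + 0.5) * step)) ** 2 * step
                 for k in range(samples))
    return energy / (2 * math.pi)  # power = energy / duration

for a in (1.0, 0.5, 3.0):
    print(f"A = {a}: power ~ {average_power(a):.4f}, half A squared = {0.5 * a * a:.4f}")
```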

Anyway, the FCC limit requires a *power* reduction of ½. So if the original wave was cos(ωt), then the original power was 0.5. We need the new transmit power (for each antenna) to be 0.25, which is ½A² = ½(0.5). Thus A = sqrt(0.5) = 1/sqrt(2) = 0.7 or so.

So the new transmit wave is 0.7 cos(ωt). Two of those, interfering constructively, gives about 1.4 cos(ωt). The resulting power is thus around ½(1.4)² ≈ 1, or double the original (non-reduced, with only one antenna) transmit power.

Ta da! Some areas have twice the power - a 3dB "antenna array gain" or "tx beamforming gain" - while others have zero power. It all adds up. No additional transmit power is required, but a receiver, if it's in one of the areas of constructive interference, now sees 3dB more signal power and thus 3dB more SNR.
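The bookkeeping can be sketched as follows. One simplifying assumption is added here that is not in the text: to check that "it all adds up", the sketch treats every phase difference between the two waves as equally likely and averages over them.

```python
import math

def power_of_amplitude(a):
    """Power of a*cos(wt), i.e. half the amplitude squared."""
    return 0.5 * a * a

original_power = power_of_amplitude(1.0)   # one antenna sending cos(wt): 0.5
per_antenna_amplitude = math.sqrt(0.5)     # so each antenna emits power 0.25
per_antenna_power = power_of_amplitude(per_antenna_amplitude)

def combined_power(phase_difference):
    """Power where two equal-amplitude waves meet with the given phase offset."""
    a = per_antenna_amplitude
    # The phasor sum of a and a*e^(j*phi) has squared magnitude a^2 * (2 + 2*cos(phi)).
    amplitude_squared = a * a * (2 + 2 * math.cos(phase_difference))
    return 0.5 * amplitude_squared

peak = combined_power(0.0)        # fully constructive
null = combined_power(math.pi)    # fully destructive
gain_db = 10 * math.log10(peak / original_power)

# Assumption: average over all phase differences equally, so that bright
# and dark spots are weighted the same.
n = 3600
mean_power = sum(combined_power(2 * math.pi * k / n) for k in range(n)) / n

print(f"per-antenna power: {per_antenna_power:.3f}")
print(f"peak: {peak:.3f}, null: {null:.3f}, gain at peak: {gain_db:.2f} dB")
print(f"mean over all phase differences: {mean_power:.3f}")
```

Under that assumption the mean comes out equal to the original single-antenna power: the doubled spots and the dead spots balance.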

We're left with the simple (ha ha) matter of making sure that the receiver is in an area of maximum constructive interference at all times. To make a long story short, we do this by adjusting the phase between the otherwise-identical signals coming from the different antennas.

I don't really know exactly how wifi arranges for the phase adjustment to happen; it's complicated. But we can imagine a very simple version: just send from each antenna, one at a time, and have the receiver tell you the phase difference right now between each variant. Then, on the transmitter, adjust the transmit phase on each antenna by an opposite amount. I'm sure what actually happens is more complicated than that, but that's the underlying concept, and it's called "explicit beamforming feedback." Apparently the 802.11ac standard made progress toward getting everyone to agree on a good way of providing beamforming feedback, which is important for making this work well.
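The very simple version described above can be sketched with phasors. This is not how 802.11 actually does it; it assumes straight-line paths, no reflections, and no loss of strength with distance, and the wavelength, antenna positions and function names are all made up for illustration.

```python
import cmath
import math

WAVELENGTH = 0.125  # metres; roughly 2.4 GHz wifi (assumed for illustration)

def path_phase(tx, rx, wavelength=WAVELENGTH):
    """Phase (radians) accumulated travelling in a straight line from tx to rx."""
    return 2 * math.pi * math.dist(tx, rx) / wavelength

def received_phasor(antennas, tx_phases, rx, amplitude=1.0):
    """Sum of every antenna's contribution at rx (distance attenuation ignored)."""
    return sum(amplitude * cmath.exp(1j * (phase - path_phase(tx, rx)))
               for tx, phase in zip(antennas, tx_phases))

def measure_phases(antennas, rx):
    """Send from each antenna alone and record the phase the receiver sees."""
    return [cmath.phase(received_phasor([tx], [0.0], rx)) for tx in antennas]

antennas = [(0.0, 0.0), (0.06, 0.0), (0.12, 0.0)]
receiver = (3.0, 4.0)
amplitude = 1 / math.sqrt(len(antennas))  # keep total transmit power constant

reported = measure_phases(antennas, receiver)   # the "feedback"
corrected = [-p for p in reported]              # adjust by the opposite amount

before = abs(received_phasor(antennas, [0.0] * len(antennas), receiver, amplitude))
after = abs(received_phasor(antennas, corrected, receiver, amplitude))
print(f"amplitude at receiver: {before:.3f} uncorrected, {after:.3f} corrected")
```

With the correction applied, all three waves arrive in step, so the amplitude at the receiver is the per-antenna amplitude times the number of antennas.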

Even more weirdly, the same idea works in reverse. If you know the phase difference between the client's antenna (we're assuming for now that he has only one, so we don't go insane) and each of your router's antennas, then when the client sends a signal *back* to you, you can extract the signal from the different antennas in a particular way that gets you the same amount of gain as in the transmit direction, and we call that rx beamforming. At least, I think you can. I haven't done the math for that yet, so I don't know for sure how well it can work.
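A rough sketch of the receive side, under assumptions that go beyond the text: each antenna sees the same signal strength, the noise on each antenna is independent and of equal power, and the per-antenna phase offsets are already known. The offsets and noise level below are hypothetical numbers.

```python
import cmath
import math

def combine(received, channel_phases):
    """Undo each antenna's known phase offset, then add the results."""
    return sum(r * cmath.exp(-1j * p) for r, p in zip(received, channel_phases))

channel_phases = [0.4, 2.9]  # hypothetical per-antenna phase offsets (radians)
signal = [cmath.exp(1j * p) for p in channel_phases]  # unit-amplitude signal per antenna

naive = abs(sum(signal))                        # just adding the antennas together
aligned = abs(combine(signal, channel_phases))  # phase-corrected sum

noise_power_each = 0.1  # assumed equal and independent on each antenna
snr_single = 1.0 ** 2 / noise_power_each
# The aligned signal adds in amplitude; independent noise only adds in power.
snr_combined = aligned ** 2 / (len(signal) * noise_power_each)
gain_db = 10 * math.log10(snr_combined / snr_single)

print(f"naive sum amplitude:   {naive:.3f}")
print(f"aligned sum amplitude: {aligned:.3f}")
print(f"SNR gain from combining: {gain_db:.2f} dB")
```

Under those assumptions, two antennas give about 3 dB, matching the transmit-side figure. Correlated noise or unequal signal strengths would change the result.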

Relatedly, even if there is no *explicit* beamforming feedback, in theory you can calculate the phase differences by listening to the signals from the remote end on each of your router's antennas. Because the signals should be following exactly the same path in both directions, you can guess what phase difference your signal arrived with by seeing which difference *his* signal came back with, and compensate accordingly. This is called "implicit beamforming feedback." Of course, if both ends try this trick at once, hilarity ensues.

And finally, I just want to point out how little the result of "beamforming" is like a beam. Although conceptually we'd like to think of it that way - we have a couple of transmitters tuning their signal to point directly at the receiver - mathematically it's not really like that. "Beamforming" creates a kind of on-off "warped checkerboard" sort of pattern that extends in all directions. To the extent that your antenna array is symmetrical, the checkerboard pattern is also symmetrical.

Beamforming Simulation

Of course, a checkerboard is also a flawed analogy. Once you start looking for a checkerboard, you start to see that in fact, the warping is kind of directional, and sort of looks like a beam, and you can imagine that with a hundred antennas, maybe it really would be "beam" shaped.

After doing all the math, I really wanted to know what beamforming looked like, so I wrote a little simulation of it, and the image at the top of this article is the result. (That particular one came from a 9-antenna beamforming array.)

You can also try out the simulation yourself, moving around up to 9 antennas to create different interference patterns. I find it kind of fun and mesmerizing, especially to think that these signals are all around us and if you could see them, they'd look like *that*. On my computer with Chrome, I get about 20 frames per second; with Safari, I get about 0.5 frames per second, which is not as fun. So use a browser with a good javascript engine.

Note that while the image looks like it has contours and "shadows," the shadows are entirely the effect of the constructive/destructive interference patterns causing bright and dark areas. Nevertheless, you can kind of visually see how the algorithm builds one level of constructive interference on top of another, with the peak of the humpiest bump being at the exact location of the receiver. It really works!
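This is not the source of the simulation, just a minimal text-mode sketch of the same idea. It assumes three antennas in a row, waves that do not weaken with distance, and made-up positions and wavelength.

```python
import cmath
import math

def field(point, antennas, phases, wavelength):
    """Complex sum of all antennas' waves at a point (no distance attenuation)."""
    return sum(cmath.exp(1j * (ph - 2 * math.pi * math.dist(a, point) / wavelength))
               for a, ph in zip(antennas, phases))

def aim_at(receiver, antennas, wavelength):
    """Transmit phases that make every wave arrive at the receiver in step."""
    return [2 * math.pi * math.dist(a, receiver) / wavelength for a in antennas]

wavelength = 1.2
antennas = [(-0.6, 0.0), (0.0, 0.0), (0.6, 0.0)]
receiver = (2.5, 7.0)
phases = aim_at(receiver, antennas, wavelength)

def strength(point):
    """Signal amplitude at a point, normalised so 1.0 is perfect alignment."""
    return abs(field(point, antennas, phases, wavelength)) / len(antennas)

# Coarse rendering: '#' strong, '+' medium, '.' weak, ' ' nearly cancelled.
rows = []
for j in range(20, -1, -1):
    row = ""
    for i in range(-20, 21):
        row += " .+#"[min(3, int(strength((i * 0.5, j * 0.5)) * 4))]
    rows.append(row)
print("\n".join(rows))
```

The receiver sits at a point of perfect alignment, and no other point can exceed it, but plenty of other points come close. That is the warped checkerboard.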

Some notes about the simulation:

  • It's 2-dimensional. Real life has at least 3 dimensions. It works pretty much the same though.
  • The intensity (brightness) of the colour indicates the signal strength at that point. Black means almost no signal.
  • "Blue" means cos(ωt) is positive at that point, and red means it's negative.
  • Because of the way phasors work, "blue plus red" is not the only kind of destructive interference, so it's a bit confusing.
  • Click on the visualization to move around the currently-selected transmitter or receiver.
  • When you move the receiver around, it auto-runs the beamforming optimization so you can see the "beam" move around.
  • The anti-optimize button is not very smart; a smarter algorithm could achieve an even less optimal result. But most of the time it does an okay job, and it does show how you can also use beamforming to make a receiver *not* hear your signal. That's the basis of MU-MIMO.

MIMO

The last and perhaps most exciting way to cheat Shannon's Law is MIMO. I'll try to explain that later, but I'm still working out the math :)

Syndicated 2014-07-29 06:41:59 from apenwarr

31 Jul 2014 danstowell   » (Journeyer)

Background reading on Israel and Palestine

I'm going to try and avoid ranting about Israel and Palestine because there's much more heat than light right now. But I want to recommend some background reading that seems useful, and it's historical/background stuff rather than partisan:

I also want to point to a more "one-sided" piece (in the sense that it criticises one "side" specifically - I've no idea about the author's actual motivations): Five Israeli Talking Points on Gaza - Debunked. I recommend it because it raises some interesting points about international law and the like, and we in the UK don't seem to hear these issues filled out on the radio.

As usual, please don't assume anyone is purely pro-Palestine or pro-Israel, and don't confuse criticism of Israel/Hamas with criticism of Judaism/Islam. The topic is hard to talk about (especially on the internet) without the conversation spiralling into extremes.

Syndicated 2014-07-31 18:09:27 from Dan Stowell

31 Jul 2014 crhodes   » (Master)

london employment visualization part 2

Previously, I did all the hard work to obtain and transform some data related to London, including borough and MSOA shapes, population counts, and employment figures, and used them to generate some subjectively pretty pictures. I promised a followup on the gridSVG approach to generating visualizations with more potential for interactivity than a simple picture; this is the beginning of that.

Having done all the heavy lifting in the last post, including being able to generate ggplot objects (whose printing results in the pictures), it is relatively simple to wrap output to SVG instead of output to PNG around it all. In fact it is extremely simple to output to SVG; simply use an SVG output device

  svg("/tmp/london.svg", width=16, height=10)

rather than a PNG one

  png("/tmp/london.png", width=1536, height=960)

(which brings back for me memories of McCLIM, and my implementation of an SVG backend, about a decade ago). So what does that look like? Well, if you’ve entered those forms at the R repl, close the png device

  dev.off()

and then (the currently active device being the SVG one)

  print(ggplot.london(fulltime/(allages-younger-older)))
  dev.off()

default (cairo) SVG device

That produces an SVG file, and if SVG in and of itself is the goal, that’s great. But I would expect that the main reason for producing SVG isn’t so much for the format itself (though it is nice that it is a vector image format rather than rasterized, so that zooming in principle doesn’t cause artifacts) but for the ability to add scripting to it: and since the output SVG doesn’t retain any information about the underlying data that was used to generate it, it is very difficult to do anything meaningful with it.

I write “very difficult” rather than “impossible”, because in fact the SVGAnnotation package aimed to do just that: specifically, read the SVG output produced by the R SVG output device, and (with a bit of user assistance and a liberal sprinkling of heuristics) attempt to identify the regions of the plot corresponding to particular slices of datasets. Then, using a standard XML library, the user could decorate the SVG with extra information, add links or scripts, and essentially do whatever they needed to do; this was all wrapped up in an svgPlot function. The problem with this approach is that it is fragile: for example, one heuristic used to identify a lattice plot area was that there should be no text in it, which fails for custom panel functions with labelled guidelines. It is possible to override the default heuristic, but it’s difficult to build a robust system this way (and in fact when I tried to run some two-year-old analysis routines recently, the custom SVG annotation that I wrote broke into multiple pieces given new data).

gridSVG’s approach is a little bit different. Instead of writing SVG out and reading it back in, it relies on the grid graphics engine (so does not work with so-called base graphics, the default graphics system in R), and on manipulating the grid object which represents the current scene. The gridsvg pseudo-graphics-device does the behind-the-scenes rendering for us, with some cost related to yet more wacky interactions with R’s argument evaluation semantics which we will pay later.

  gridsvg("/tmp/gridsvg-london.svg", width=16, height=10)
  print(ggplot.london(fulltime/(allages-younger-older)))
  dev.off()

Because ggplot uses grid graphics, this just works, and generates a much more structured svg file, which should render identically to the previous one:

SVG from gridSVG device

If it renders identically, why bother? Well, because now we have something that writes out the current grid scene, we can alter that scene before writing out the document (at dev.off() time). For example, we might want to add tooltips to the MSOAs so that their name and the quantity value can be read off by a human. Wrapping it all up into a function, we get

  gridsvg.london <- function(expr, subsetexpr=TRUE, filename="/tmp/london.svg") {

We need to compute the subset in this function, even though we’re going to be using the full dataset in ggplot.london when we call it, in order to get the values and zone labels.

      london.data <- droplevels(do.call(subset, list(london$msoa.fortified, substitute(subsetexpr))))

Then we need to map (pun mostly intended) the values in the fortified data frame to the polygons drawn; without delving into the format, my intuition is that the fortified data frame contains vertex information, whereas the grid (and hence SVG) data is organized by polygons, and there may be more than one polygon for a region (for example if there are islands in the Thames). Here we simply generate an index from a group identifier to the first row in the dataframe in that group, and use it to pull out the appropriate value and label.

      is <- match(levels(london.data$group), london.data$group)
      vals <- eval(substitute(expr), london.data)[is]
      labels <- levels(london.data$zonelabel)[london.data$zonelabel[is]]

Then we pay the cost of the argument evaluation semantics. My first try at this line was gridsvg(filename, width=16, height=10), which I would have (perhaps naïvely) expected to work, but which in fact gave me an odd error suggesting that the environment filename was being evaluated in was the wrong one. Calling gridsvg like this forces evaluation of filename before the call, so there should be less that can go wrong.

      do.call(gridsvg, list(filename, width=16, height=10))

And, as before, we have to do substitutions rather than evaluations to get the argument expressions evaluated in the right place:

      print(do.call(ggplot.london, list(substitute(expr), substitute(subsetexpr))))

Now comes the payoff. At this point, we have a grid scene, which we can investigate using grid.ls(). Doing so suggests that the map data is in a grid object named like GRID.polygon followed by an integer, presumably in an attempt to make names unique. We can “garnish” that object with attributes that we want: some javascript callbacks, and the values and labels that we previously calculated.

      grid.garnish("GRID.polygon.*",
                   onmouseover=rep("showTooltip(evt)", length(is)),
                   onmouseout=rep("hideTooltip()", length(is)),
                   zonelabel=labels, value=vals,
                   group=FALSE, grep=TRUE)

We need also to provide implementations of those callbacks. It is possible to do that inline, but for simplicity here we simply link to an external resource.

      grid.script(filename="tooltip.js")

Then close the gridsvg device, and we’re done!

      dev.off()
  }

Then gridsvg.london(fulltime/(allages-younger-older)) produces:

proportion employed full-time

which is some kind of improvement over a static image for data of this complexity.

And yet... the perfectionist in me is not quite satisfied. At issue is a minor graphical glitch, but it’s enough to make me not quite content; the border of each MSOA is stroked in a slightly lighter colour than the fill colour, but that stroke extends beyond the border of the MSOA region (the stroke’s centre is along the polygon edge). This means that the strokes from adjacent MSOAs overlie each other, so that the most recently drawn obliterates any drawn previously. This also causes some odd artifacts around the edges of London (and into the Thames, and pretty much obscures the river Lea).

This can be fixed by clipping; I think the trick to clip a path to itself counts as well-known. But clipping in SVG is slightly hard, and the gridSVG facilities for doing it work on a grob-by-grob basis, while the map is all one big polygon grid object. So to get the output I want, I am going to have to perform surgery on the SVG document itself after all; we are still in a better position than before, because we will start with a sensible hierarchical arrangement of graphical objects in the SVG XML structure, and gridSVG furthermore provides some introspective capabilities to give XML ids or XPath query strings for particular grobs.

grid.export exports the current grid scene to SVG, returning a list with the SVG XML itself along with this mapping information. We have in the SVG output an arbitrary number of polygon objects; our task is to arrange such that each of those polygons has a clip mask which is itself. In order to do that, we need for each polygon a clipPath entry with a unique id in a defs section somewhere, where each clipPath contains a use pointing to the original polygon’s ID; then each polygon needs to have a clip-path style property pointing to the corresponding clipPath object. Clear?

  addClipPaths <- function(gridsvg, id) {

Given the return value of grid.export and the identifier of the map grob, we want to get the set of XML nodes corresponding to the polygons within that grob.

      ns <- getNodeSet(gridsvg$svg, sprintf("%s/*", gridsvg$mappings$grobs[[id]]$xpath))

Then for each of those nodes, we want to set a clip path.

      for (i in seq_along(ns)) {
          addAttributes(ns[[i]], style=sprintf("clip-path: url(#clipPath%s)", i))
      }

For each of those nodes, we also need to define a clip path

      clippaths <- list()
      for (i in seq_along(ns)) {
          clippaths[[i]] <- newXMLNode("clipPath", attrs=c(id=sprintf("clipPath%s", i)))
          use <- newXMLNode("use", attrs = c("xlink:href"=sprintf("#%s", xmlAttrs(ns[[i]])[["id"]])))
          addChildren(clippaths[[i]], kids=list(use))
      }

And hook it into the existing XML

      defs <- newXMLNode("defs")
      addChildren(defs, kids=clippaths)
      top <- getNodeSet(gridsvg$svg, "//*[@id='gridSVG']")[[1]]
      addChildren(top, kids=list(defs))
  }

Then our driver function needs some slight modifications:

  gridsvg.london2 <- function(expr, subsetexpr=TRUE, filename="/tmp/london.svg") {
      london.data <- droplevels(do.call(subset, list(london$msoa.fortified, substitute(subsetexpr))))
      is <- match(levels(london.data$group), london.data$group)
      vals <- eval(substitute(expr), london.data)[is]
      labels <- levels(london.data$zonelabel)[london.data$zonelabel[is]]

Until here, everything is the same, but we can’t use the gridsvg pseudo-graphics device any more, so we need to do graphics device handling ourselves:

      pdf(width=16, height=10)
      print(do.call(ggplot.london, list(substitute(expr), substitute(subsetexpr))))
      grid.garnish("GRID.polygon.*",
                   onmouseover=rep("showTooltip(evt)", length(is)),
                   onmouseout=rep("hideTooltip()", length(is)),
                   zonelabel=labels, value=vals,
                   group=FALSE, grep=TRUE)
      grid.script(filename="tooltip.js")

Now we export the scene to SVG,

      gridsvg <- grid.export()

find the grob containing all the map polygons,

      grobnames <- grid.ls(flatten=TRUE, print=FALSE)$name
      grobid <- grobnames[[grep("GRID.polygon", grobnames)[1]]]

add the clip paths,

      addClipPaths(gridsvg, grobid)
      saveXML(gridsvg$svg, file=filename)

and we’re done!

      dev.off()
  }

Then gridsvg.london2(fulltime/(allages-younger-older)) produces:

proportion employed full-time (with polygon clipping)

and I leave whether the graphical output is worth the effort to the beholder’s judgment.

As before, these images contain National Statistics and Ordnance Survey data © Crown copyright and database right 2012.

Syndicated 2014-07-31 17:07:34 (Updated 2014-07-31 17:14:34) from notes

31 Jul 2014 etbe   » (Master)

Links July 2014

Dave Johnson wrote an interesting article for Salon about companies ripping off the tax system by claiming that all their income is produced in low tax countries [1].

Seb Lee-Delisle wrote an insightful article about how to ask to get paid to speak [2]. I should do that.

Daniel Pocock wrote an informative article about the reConServer simple SIP conferencing server [3]. I should try it out; currently most people I want to conference with are using Google Hangouts, but getting away from Google is a good thing.

François Marier wrote an informative post about hardening ssh servers [4].

S. E. Smith wrote an interesting article titled “I Am Tired of Hearing Programmers Defend Gender Essentialism” [5].

Bert Archer wrote an insightful article about lazy tourism [6]. His initial example of “love locks” breaking bridges was a bit silly (it’s not difficult to cut locks off a bridge) but his general point about lazy/stupid tourism is good.

Daniel Pocock wrote an insightful post about new developments in taxis, the London Taxi protest against Uber, and related changes [7]. His post convinced me that Uber is a good thing and should be supported. I checked the prices and unfortunately Uber is more expensive than normal taxis for my most common journey.

Cory Doctorow wrote an insightful article for The Guardian about the moral issues related to government spying [8].

The Verge has an interesting review of the latest Lytro Lightbox camera [9]. Not nearly ready for me to use, but interesting technology.

Prospect has an informative article by Kathryn Joyce about the Protestant child sex abuse scandal in the US [10]. Billy Graham’s grandson is leading the work to reform churches so that they protect children instead of pedophiles. Prospect also has an article by Kathryn Joyce about Christians home-schooling kids to try and program them to be zealots and how that hurts kids [11].

The Daily Beast has an interesting article about the way that the extreme right wing in the US are trying to kill people; it’s the right-wing death panel [12].

Jay Michaelson wrote an informative article for The Daily Beast about right-wing hate groups in the US who promote the extreme homophobic legislation in Russia and other countries [13]. It also connects to the Koch brothers who seem to be associated with most evil. Elias Isquith wrote an insightful article for Salon about how the current right-wing obsession with making homophobic discrimination an issue of “religious liberty” will hurt religious people [14]. He also describes how stupid the right-wing extremists are in relation to other issues too.

EconomixComix.com has a really great comic explaining the economics of Social Security in the US [15]. They also have a comic explaining the TPP which is really good [16]. They sell a comic book about economics which I’m sure is worth buying. We need to have comics explaining all technical topics; it’s a good way of conveying concepts. When I was in primary school my parents gave me comic books covering nuclear physics and other science topics which were really good.

Mia McKenzie wrote an insightful article for BlackGirlDangerous.com about dealing with racist white teachers [17]. I think that it would be ideal to have a school dedicated to each minority group with teachers from that group.

Related posts:

  1. Links July 2013 Wayne Mcgregor gave an interesting TED talk about the creative...
  2. Links May 2014 Charmian Gooch gave an interesting TED talk about her efforts...
  3. Links June 2014 Russ Albery wrote an insightful blog post about trust, computer...

Syndicated 2014-07-31 13:38:53 from etbe - Russell Coker

31 Jul 2014 Stevey   » (Master)

Draft post - 31 July 2014

Yesterday I spent a while looking at the Debian code search site, an enormously useful service allowing you to search the code contained in the Debian archives.

The end result was three trivial bug reports:

#756565 - lives

Insecure usage of temporary files.

A CVE-identifier should be requested.

#756566 - libxml-dt-perl

Insecure usage of temporary files.

A CVE-identifier has been requested by Salvatore Bonaccorso, and will be added to my security log once allocated.

#756600 - xcfa

Insecure usage of temporary files.

A CVE-identifier should be requested.

Finding these bugs was a simple matter of using the code-search to look for patterns like "system.*>.*/tmp".
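
As a rough illustration of what such a search matches, here is a minimal Python sketch. It is not the code-search service itself; the helper name find_tmp_redirects and the sample lines are made up for this example, and the pattern is written with a literal /tmp.

```python
import re

# Same shape as the code-search pattern: a system() call whose command
# redirects output (">") to something under /tmp.
PATTERN = re.compile(r"system.*>.*/tmp")


def find_tmp_redirects(lines):
    """Return (line_number, text) pairs for lines matching PATTERN."""
    return [(n, text) for n, text in enumerate(lines, start=1)
            if PATTERN.search(text)]


# Hypothetical sample input, not taken from any of the packages above.
sample = [
    'system("ls > /tmp/listing.txt");',
    'printf("hello\\n");',
    'system("sort data.txt");',
]

hits = find_tmp_redirects(sample)
for n, text in hits:
    print(n, text)
```

Only the first sample line is reported. A fixed, predictable file name under /tmp is what makes this kind of code risky; in Python, for instance, tempfile.mkstemp creates the file under an unpredictable name instead.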

Perhaps tomorrow somebody else would like to have a go at looking for backtick-related operations ("`"), or the usage of popen.

Tomorrow I will personally be swimming in a loch, which is more fun than wading in code.

Syndicated 2014-07-31 12:54:16 from Steve Kemp's Blog

31 Jul 2014 lucasr   » (Master)

The new TwoWayView

What if writing custom view recycling layouts was a lot simpler? This question has been stuck in my mind since I started writing Android apps a few years ago.

The lack of proper extension hooks in the AbsListView API has been one of my biggest pain points on Android. The community has come up with different layout implementations that were largely based on AbsListView’s code but none of them really solved the framework problem.

So a few months ago, I finally set to work on a new API for TwoWayView that would provide a framework for custom view recycling layouts. I had made some good progress but then Google announced RecyclerView at I/O and everything changed.

At first sight, RecyclerView seemed to be an exact overlap with the new TwoWayView API. After some digging though, it became clear that RecyclerView was a superset of what I was working on. So I decided to embrace RecyclerView and rebuild TwoWayView on top of it.

The new TwoWayView is functional enough now. Time to get some early feedback. This post covers the upcoming API and the general-purpose layout managers that will ship with it.

Creating your own layouts

RecyclerView itself doesn’t actually do much. It implements the fundamental state handling around child views, touch events and adapter changes, then delegates the actual behaviour to separate components—LayoutManager, ItemDecoration, ItemAnimator, etc. This means that you still have to write some non-trivial code to create your own layouts.

LayoutManager is a low-level API. It simply gives you extension points to handle scrolling and layout. For most layouts, the general structure of a LayoutManager implementation is going to be very similar—recycle views out of parent bounds, add new views as the user scrolls, layout scrap list items, etc.

Wouldn’t it be nice if you could implement LayoutManagers with a higher-level API that was more focused on the layout itself? Enter the new TwoWayView API.

TWAbsLayoutManager is a simple API on top of LayoutManager that does all the laborious work for you so that you can focus on how the child views are measured, placed, and detached from the RecyclerView.

To get a better idea of what the API looks like, have a look at these sample layouts: SimpleListLayout is a list layout and GridAndListLayout is a more complex example where the first N items are laid out as a grid and the remaining ones behave like a list. As you can see you only need to override a couple of simple methods to create your own layouts.

Built-in layouts

The new API is pretty nice but I also wanted to create a space for collaboration around general-purpose layout managers. So far, Google has only provided LinearLayoutManager. They might end up releasing a few more layouts later this year but, for now, that is all we got.

The new TwoWayView ships with a collection of four built-in layouts: List, Grid, Staggered Grid, and Spannable Grid.

These layouts support all RecyclerView features: item animations, decorations, scroll to position, smooth scroll to position, view state saving, etc. They can all be scrolled vertically and horizontally—this is the TwoWayView project after all ;-)

You probably know how the List and Grid layouts work. Staggered Grid arranges items with variable heights or widths into different columns or rows according to its orientation.
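
To make the staggered arrangement concrete, here is a small Python sketch of one common placement strategy: each item goes into whichever column is currently shortest. This is an assumption made for illustration, not necessarily how TwoWayView implements it, and the function name staggered_layout is hypothetical.

```python
def staggered_layout(item_heights, column_count):
    """Place each item in the currently shortest column.

    Returns a list of (column, top_offset) pairs, one per item.
    """
    column_bottoms = [0] * column_count
    placements = []
    for height in item_heights:
        # Pick the column whose content ends highest up; ties go left.
        column = column_bottoms.index(min(column_bottoms))
        placements.append((column, column_bottoms[column]))
        column_bottoms[column] += height
    return placements


# Four items of different heights in a two-column vertical layout.
placements = staggered_layout([100, 60, 80, 40], column_count=2)
print(placements)
```

For a horizontal orientation the same idea applies with rows and item widths in place of columns and item heights.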

Spannable Grid is a grid layout with fixed-size cells that allows items to span multiple columns and rows. You can define the column and row spans as attributes in the child views as shown below.

<FrameLayout
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    app:colSpan="2"
    app:rowSpan="3">
    ...

Utilities

The new TwoWayView API will ship with a convenience view (TWView) that can take a layoutManager XML attribute that points to a layout manager class.

<org.lucasr.twowayview.TWView
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    app:layoutManager="TWListLayoutManager"/>

This way you can leverage the resource system to set the layout manager depending on device features and configuration via styles.

You can also use TWItemClickListener to use ListView-style item (long) click listeners. You can easily plug in support for those in any RecyclerView (see sample).

I’m also planning to create pluggable item decorations for dividers, item spacing, list selectors, and more.


That’s all for now! The API is still in flux and will probably go through a few more iterations. The built-in layouts definitely need more testing.

You can help by filing (and fixing) bugs and giving feedback on the API. Maybe try using the built-in layouts in your apps and see what happens?

I hope TwoWayView becomes a productive collaboration space for RecyclerView extensions and layouts. Contributions are very welcome!

Syndicated 2014-07-31 11:33:17 from Lucas Rocha

31 Jul 2014 etbe   » (Master)

BTRFS Status July 2014

My last BTRFS status report was in April [1]; it wasn’t the most positive report, with data corruption and system hangs. Hacker News has a brief discussion of BTRFS which includes the statement “Russell Coker’s reports of his experiences with BTRFS give me the screaming heebie-jeebies, no matter how up-beat and positive he stays about it” [2] (that’s one of my favorite comments about my blog).

Since April things have worked better. Linux kernel 3.14 solves the worst problems I had with 3.13 and it’s generally doing everything I want it to do. I now have cron jobs making snapshots as often as I wish (as frequently as every 15 minutes on some systems), automatically removing snapshots (removing 500+ snapshots at once doesn’t hang the system), balancing, and scrubbing. The fact that I can now run a filesystem balance (which is a type of defragment operation for BTRFS that frees some “chunks”) from a cron job and expect the system not to hang means that I haven’t run out of metadata chunk space. I expect that running out of metadata space can still cause filesystem deadlocks given a lack of reports on the BTRFS mailing list of fixes in that regard, but as long as balance works well we can work around that.
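
The snapshot part of such a cron job could be sketched roughly as follows. This is a hypothetical illustration, not the scripts actually in use: the time-based naming scheme, the /snapshots directory and the two-day retention window are all assumptions, and the sketch only builds the btrfs command lines rather than running them.

```python
from datetime import datetime, timedelta

# Assumed naming scheme: snapshots are named after their creation time,
# e.g. "2014-07-31-10:45". This is an illustration only.
NAME_FORMAT = "%Y-%m-%d-%H:%M"


def snapshot_command(subvolume, snapshot_dir, now):
    """Build the command that takes a read-only snapshot of subvolume."""
    dest = f"{snapshot_dir}/{now.strftime(NAME_FORMAT)}"
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, dest]


def prune_commands(snapshot_dir, names, now, keep=timedelta(days=2)):
    """Build delete commands for snapshots older than the keep window."""
    cutoff = now - keep
    return [["btrfs", "subvolume", "delete", f"{snapshot_dir}/{name}"]
            for name in sorted(names)
            if datetime.strptime(name, NAME_FORMAT) < cutoff]


now = datetime(2014, 7, 31, 10, 45)
names = ["2014-07-28-10:45", "2014-07-30-10:45", "2014-07-31-10:30"]

create = snapshot_command("/home", "/snapshots", now)
prune = prune_commands("/snapshots", names, now)
print(" ".join(create))
for cmd in prune:
    print(" ".join(cmd))
```

With the sample names above, only the snapshot from 28 July falls outside the two-day window and gets a delete command. A real job would pass each list to subprocess.run and handle errors.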

My main workstation now has 35 days of uptime and my home server has 90 days of uptime. Also the server that stores my email now has 93 days uptime even though it’s running Linux kernel 3.13.10. I am rather nervous about the server running 3.13.10 because in my experience every kernel before 3.14.1 had BTRFS problems that would cause system hangs. I don’t want a server that’s an hour’s drive away to hang…

The server that runs my email is using kernel 3.13.10 because when I briefly tried a 3.14 kernel it didn’t work reliably with the Xen kernel 4.1 from Debian/Wheezy and I had a choice of using the Xen kernel 4.3 from Debian/Unstable to match the Linux kernel or use an earlier Linux kernel. I have a couple of Xen servers running Debian/Unstable for test purposes which are working well so I may upgrade my mail server to the latest Xen and Linux kernels from Unstable in the near future. But for the moment I’m just not doing many snapshots and never running a filesystem scrub on that server.

Scrubbing

In kernel 3.14 scrub is working reliably for me and I have cron jobs to scrub filesystems on every system running that kernel. So far I’ve never seen it report an error on a system that matters to me but I expect that it will happen eventually.

The paper “An Analysis of Data Corruption in the Storage Stack” from the University of Wisconsin (based on NetApp data) [3] shows that “nearline” disks (i.e. any disks I can afford) have an incidence of checksum errors (occasions when the disk returns bad data but claims it to be good) of about 0.42%. There are 18 disks running in systems I personally care about (as opposed to systems where I am paid to care) so with a 0.42% probability of a disk experiencing data corruption per year that would give a 7.3% probability of having such corruption on one disk in any year and a greater than 50% chance that it’s already happened over the last 10 years. Of the 18 disks in question 15 are currently running BTRFS. Of the 15 running BTRFS 10 are scrubbed regularly (the other 5 are systems that don’t run 24*7 and the system running kernel 3.13.10).
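
The arithmetic behind those two figures can be checked with a few lines of Python. The sketch assumes that disks fail independently and that the 0.42% rate applies per disk per year and stays constant over the decade.

```python
# Figures taken from the paragraph above.
p_disk_year = 0.0042   # chance of a checksum error on one disk in one year
disks = 18
years = 10

# Assumption: disks fail independently and the rate is constant over time.
# The chance of at least one error is 1 minus the chance of none at all.
p_any_one_year = 1 - (1 - p_disk_year) ** disks
p_any_ten_years = 1 - (1 - p_disk_year) ** (disks * years)

print(f"one year:  {p_any_one_year:.1%}")
print(f"ten years: {p_any_ten_years:.1%}")
```

This gives about 7.3% for a single year and about 53% over ten years, matching the numbers quoted above.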

Newer Kernels

The discussion on the BTRFS mailing list about kernel 3.15 is mostly about hangs. This is correlated with some changes to improve performance so I presume that it has exposed race conditions. Based on those discussions I haven’t felt inclined to run a 3.15 kernel. As the developers already have some good bug reports I don’t think that I could provide any benefit by doing more testing at this time. I think that there would be no benefit to me personally or the Linux community in testing 3.15.

I don’t have a personal interest in RAID-5 or RAID-6. The only systems I run that have more data than will fit on a RAID-1 array of cheap SATA disks are ones that I am paid to run – and they are running ZFS. So the ongoing development of RAID-5 and RAID-6 code isn’t an incentive for me to run newer kernels. Eventually I’ll test out RAID-6 code, but at the moment I don’t think they need more bug reports in this area.

I don’t have a great personal interest in filesystem performance at this time. There are some serious BTRFS performance issues. One problem is that a filesystem balance and subtree removal seem to take excessive amounts of CPU time. Another is that there isn’t much support for balancing IO to multiple devices (in RAID-1 every process has all its read requests sent to one device). For large-scale use of a filesystem these are significant problems. But when you have basic requirements (such as a mail server for dozens of users or a personal workstation with a quad-core CPU and fast SSD storage) it doesn’t make much difference. Currently all of my systems which use BTRFS have storage hardware that exceeds the system performance requirements by such a large margin that nothing other than installing Debian packages can slow the system down. So while there are performance improvements in newer versions of the BTRFS kernel code that isn’t an incentive for me to upgrade.

It’s just been announced that Debian/Jessie will use Linux 3.16, so I guess I’ll have to test that a bit for the benefit of Debian users. I am concerned that 3.16 won’t be stable enough for typical users at the time that Jessie is released.

Related posts:

  1. BTRFS Status March 2014 I’m currently using BTRFS on most systems that I can...
  2. BTRFS Status April 2014 Since my blog post about BTRFS in March [1] not...
  3. Starting with BTRFS Based on my investigation of RAID reliability [1] I have...

Syndicated 2014-07-31 10:45:10 from etbe - Russell Coker

31 Jul 2014 bagder   » (Master)

Me in numbers, today

Number of followers on twitter: 1,302

Number of commits during the last 365 days at github: 686

Number of publicly visible open source commits counted by openhub: 36,769

Number of questions I’ve answered on stackoverflow: 403

Number of connections on LinkedIn: 608

Number of days I’ve committed something in the curl project: 2,869

Number of commits by me, merged into Mozilla Firefox: 9

Number of blog posts on daniel.haxx.se, including this: 734

Number of friends on Facebook: 150

Number of open source projects I’ve contributed to, openhub again: 35

Number of followers on Google+: 557

Number of tweets: 5,491

Number of mails sent to curl mailing lists: 21,989

TOTAL life achievement: 71,602

Syndicated 2014-07-31 09:39:54 from daniel.haxx.se

31 Jul 2014 benad   » (Apprentice)

The Case for Complexity

Like clockwork, there is a point in a programmer's career where one realizes that most programming tools suck, that not only do they hinder the programmer's productivity, but worse, they may have an impact on the quality of the product for end users. And so, there are cries of the absurdity of it all: some posit that complex software development tools must exist because some programmers like complexity above productivity, while others long for the days when programming was easier.

I find these reactions amusing. It's a kind of mid-life crisis for programmers. Trying to rationalize their careers, most just end up admitting defeat for a professional life of mediocrity, by using dumber tools and hoping to avoid the main reason why programming can be challenging. I went into that "programmer's existential crisis" in my third year as a programmer, just before deciding on making it a career, but I came out of it with what seems to be a conclusion seldom shared by my fellow programmers. To some extent this is why I don't really consider myself a programmer but rather a software designer.

The fundamental issue isn't the fact that software is (seemingly) unnecessarily complex, but rather trying to understand the source of that complexity. Too many programmers assume that programming is based on applied mathematics. Well, it ought to be, but programming as practiced in the industry is quite far from its computer science roots. That deviation isn't due only to programming mistakes, but also to the more irrational external constraints and requirements. Even existing bugs become part of the external constraints if they are in things you cannot fix but must "work around".

Those absurdities can come from two directions: Top-down, based on human need and mental models, or Bottom-up, based on faulty mathematical or software design models. Productive and efficient software development tools, by themselves, bring complexity above the programming language. Absurd business requirements, including cost-saving measures and dealing with buggy legacy systems, not only bring complexity, but the workarounds they require bring even more absurd code.

Now, you may argue that abstractions make things simpler, and to some extent, they do. But abstractions only tend to mask complexity, and when things break or don't work as expected, that complexity re-surfaces. From the point of view of a typical user, if it's broken, you ask somebody else to fix it or replace it. But being a programmer is being that "somebody else" who takes responsibility for understanding, to some extent, that complexity.

You could argue that software should always be more usable first. And yet, usable software can be far more difficult to implement than software that is more "native" to its computing environment. All those manual pages, the flexible command-line parameters, those adaptive GUIs, pseudo-AIs, Clippy, and so on, bring enormous challenges to the implementation of any software because humans don't think like machines, and vice-versa. As long as users are involved, software cannot be fully "intuitive" for both users and computers at the same time. Computers are not "computing machines", but more sophisticated state machines made to run useful software for users. Gone are the days when room-sized computers just did "math stuff" for banks, when user interaction was limited to numbers and programmers. The moment there were personal computers, people didn't write "math-based software", but rather text-based games with code of dubious quality.

Complexity of software will always increase, because it can. Higher-level programming languages become more and more removed from the hardware execution model. Users keep asking for more features that don't necessarily "fit well", so either you add more buttons to that toolbar, or you create a brand new piece of software with its own interfaces. Even if for some reason computers stopped getting so much faster over time, it wouldn't stop users from asking for "more", and programmers from asking for "productivity".

My realization was that there has to be a balance between always increasing complexity and our ability to understand it. Sure, fifty years ago it would be reasonable to have a single person spend a few years to fully understand a complete computer system, but nowadays we just have to become specialized. Still, specialization is possible because we can understand a higher-level conceptual design of the other components rather than just an inconsistent mash-up of absurdity. Design is the solution. Yes, things in software will always get bigger, but we can make it more reasonable to attempt to understand it all if, from afar, it was designed soundly rather than just accidentally "became". With design, complexity becomes a bit smaller and more manageable, and even though only the programmers will have to deal with most of that complexity, good design produces qualities that become visible up to the end users. Good design makes tighter "vertical integration" easier since making sense of the whole system is easier.

Ultimately, making a better software product for the end users requires the programmer to take responsibility for the complexity of not only the software's code, but also of its environment. That means using sound design for any new code introduced, and accepting the potential absurdity of the rest. If you can't do that, then you'll never be more than a "code monkey".

Notes

  1. Many programmers tend to assume that their code is logically sound, and that their errors are mostly due to menial mistakes. In my experience, it's the other way around: The buggiest code is produced when code isn't logically sound, and this is what happens most of the time, especially in scripting languages that have weak or implicit typing.
  2. I use the term "complexity" more as the number of module connections than the average of module coupling. I find "complexity as a sum" more intuitive from the point of view of somebody that has to be aware of the complete system: Adding an abstraction layer still adds a new integration point between the old and new code, adding more things that could break. This is why I normally consider programming tools added complexity, even though their code completion and generation can make the programmers more productive.

Syndicated 2014-07-31 02:14:53 from Benad's Blog

30 Jul 2014 oubiwann   » (Journeyer)

OSCON 2014 Theme Song - Andrew Sorensen's Live Coding Keynote

Andrew Sorensen live-coding at OSCON 2014

Keynote

Shortly after Andrew Sorensen began the performance segment of his keynote at OSCON 2014, the #oscon Twitter topic began erupting with posts about the live coding session. Comments, retweets, and additional links persisted for that day and the next. In short, Andrew was a hit :-)

My first encounter with Andrew's work was a few years ago when I was getting back into Lisp. I was playing with generative music with Overtone (and then, a bit later, experimenting with SuperCollider, Hy, and Twisted) and came across his piece A Study in Keith. You might want to take a break from reading this post and watch that now ...

When Andrew started up his presentation, I didn't immediately recognize him. In fact, when the code was displayed on the big screens, I assumed it was Clojure until I looked closely and saw he was using (define ...) and not (defn ...).  This seemed very familiar, and then I remembered Impromptu, which ultimately led to my discovery of Extempore (see more links below) and the realization that this is what Andrew was using to live code.

At the end of the performance a bunch of us jumped up and gave a standing ovation. (In fact, you can hear me yell out "YEAH" at the end of his presentation when he says "And there we go."). It was quite a show. It seemed that OSCON 2014 had been given a theme song. The next step was getting the source code ...


Andrew's gist (Dark Github Theme)

Sharing the Code

Andrew gave a presentation on Extempore in the ballroom right after the keynote. This too was fantastic and resulted in much tweeting.

Afterwards a bunch of us went up front and chatted with him, enthusing about his work, the recent presentation, the keynote, and his previously published pieces.

I had Andrew's ear for a moment, and asked him if he was interested in sharing his keynote source -- there had been several requests for it on Twitter (that also got retweeted and/or favourited). Without hesitation, he gave an enthusiastic "yes" and we were off and running for the lounge where we could sit down to create a gist (and grab a cappuccino!). The availability of the source was announced immediately, to the delight of many.


Setting Up Extempore

Sublime Text 3 connected to Extempore

Later that night in my hotel room, I had time to download and run Extempore ... and discovered that I couldn't actually play the keynote code, since there was some implicit setup I was missing. However, after some digging around on the docs site and the mail list, music was pouring forth from my laptop -- to my great joy :-D

To ensure anyone else who is not familiar with Extempore can also have this pleasure, I've put together all the prerequisites and setup necessary in a forked gist, in multiple parts. I will go through those in this blog post. Also: all of my testing and live coding was done using Ben Swift's Extempore Sublime Text plugin.

The first step is getting all the dependencies. You'll want to start the downloads right away, since they are large (the sample files are compressed .wavs). While that's going on, you can install Extempore using Homebrew (this worked for me on Mac OS X with no additional tweaking/configuration necessary):
