10 Jan 2011 JoeNotCharles   » (Journeyer)

The Best Albums of 2010, part 5: The Stupid Methods

Part 1

Part 2

Part 3

Part 4

We have a set of ballots containing the year end Best Albums of 2010 lists from 9 publications. Yesterday I listed 12 possible ways to count these ballots to come up with a unified list.

Today I’ll go through the ways that don’t work for this data.

First Past the Post

I’ve said that First Past the Post is a terrible voting system. Now, let me demonstrate:

First we count up the top vote from every ballot. (Easy to do by hand, since there are only 9 ballots.) The winner is the #1 album of 2010. Then we drop that winner and count up the top votes that remain. The winner is the #2 album of 2010. Repeat until we’ve ranked all the albums.

That gives:

4 Kanye West – My Beautiful Dark Twisted Fantasy
2 Arcade Fire – The Suburbs
1 Caribou – Swim
1 John Grant – Queen of Denmark
1 These New Puritans – Hidden
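
A minimal sketch of that counting loop in Python, assuming each ballot is simply a list of album names in ranked order. The ballots here are made up for illustration (they are not the real lists), and breaking ties alphabetically is an assumption of this sketch:

```python
from collections import Counter

def fptp_ranking(ballots):
    """Rank albums by repeatedly counting top votes and dropping the winner."""
    remaining = {album for ballot in ballots for album in ballot}
    ranking = []
    while remaining:
        tally = Counter()
        for ballot in ballots:
            # A ballot's vote goes to its highest-ranked album still in the running.
            top = next((a for a in ballot if a in remaining), None)
            if top is not None:
                tally[top] += 1
        # Most votes wins; ties are broken alphabetically (an assumption).
        winner = min(tally, key=lambda a: (-tally[a], a))
        ranking.append((winner, tally[winner]))
        remaining.remove(winner)
    return ranking

# Hypothetical ballots, best album first.
toy_ballots = [
    ["A", "B", "C"],
    ["A", "C"],
    ["B", "A"],
    ["C", "B", "A"],
]

for place, (album, votes) in enumerate(fptp_ranking(toy_ballots), start=1):
    print(place, album, votes)
```

Note that the second round of this toy example is a 2-2 tie between B and C, which only the alphabetical rule resolves.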

So Kanye West has the #1 album of 2010, by this count. That doesn’t make any sense – 4 publications ranked him #1, but 3 others didn’t like his album enough to rank it at all! Compare Arcade Fire, whose album was liked enough to put in the top 25 by everyone: on the 4 lists where Kanye is #1, Arcade Fire is #2, #3, #4 and #11, but on the 5 other lists, Arcade Fire beats Kanye West hands down. So over half of our voters greatly prefer Arcade Fire to Kanye West, and most of the remaining voters only prefer Kanye by a tiny amount. Only Pitchfork (Kanye West at #1, Arcade Fire at #11) would be greatly dissatisfied by putting Arcade Fire ahead of Kanye West, while Mojo, Q, and NME would be extremely dissatisfied to put Kanye ahead of Arcade Fire. (And NPR, with Arcade Fire at #1 and Kanye West at #10, is a mirror of Pitchfork – call that greatly dissatisfied. And while Rough Trade wouldn’t be happy with either result, having ranked Arcade Fire way down at #21, they’d surely be even more pissed off if the win went to Kanye West, who they didn’t rank at all.)

More concisely: Arcade Fire is the Condorcet Winner. Kanye West should not win. So we can discard this result already.
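
The Condorcet check can be sketched the same way: compare every pair of albums head to head, and look for one that wins all of its matchups. This sketch assumes an album missing from a list counts as ranked below everything on that list. The ballots are hypothetical, shaped like our situation: "K" tops 4 of 9 lists but is missing from 4 others, while "A" appears on every list.

```python
def prefers(ballot, x, y):
    """True if this ballot ranks x above y (unranked counts as below ranked)."""
    if x in ballot and y in ballot:
        return ballot.index(x) < ballot.index(y)
    # Ranked beats unranked; if neither is ranked there is no preference.
    return x in ballot

def condorcet_winner(ballots):
    """Return the album that beats every other album head to head, or None."""
    albums = sorted({a for b in ballots for a in b})
    for x in albums:
        beats_all = all(
            sum(prefers(b, x, y) for b in ballots)
            > sum(prefers(b, y, x) for b in ballots)
            for y in albums
            if y != x
        )
        if beats_all:
            return x
    return None

# Hypothetical ballots, best album first.
toy_ballots = [
    ["K", "A", "X"],
    ["K", "A", "Y"],
    ["K", "X", "A"],
    ["K", "Y", "A"],
    ["A", "K", "X"],
    ["X", "A", "Y"],
    ["Y", "A", "X"],
    ["A", "Y"],
    ["X", "Y", "A"],
]

print(condorcet_winner(toy_ballots))
```

Here "K" has the most first-place votes, but "A" beats it 5 to 4 head to head and also beats every other album, so "A" is the Condorcet Winner.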

Just for fun, let’s remove Kanye and see how the top 3 ends up. Votes for #2 are:

3 Arcade Fire – The Suburbs
1 Caribou – Swim
1 Deerhunter – Halcyon Digest
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 The Black Keys – Brothers
1 These New Puritans – Hidden

So Arcade Fire is #2. That’s not terrible. After removing Arcade Fire, votes for #3 are:

2 The Black Keys – Brothers
1 Beach House – Teen Dream
1 Caribou – Swim
1 Deerhunter – Halcyon Digest
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 Robert Plant – Band of Joy
1 These New Puritans – Hidden

#3 is The Black Keys, by 1 vote. But we were perilously close to a 9-way tie for 3rd. This illustrates the weirdness of our data set: in most elections, there are a handful of candidates and hundreds of voters. We have hundreds of candidates and only a handful of voters. Some voting methods will produce a lot of ties, just because there are so few votes to go around that everyone will get one. These might be perfectly good voting methods for most elections; they just fall down on this edge case.

It looks like First Past the Post will be vulnerable to ties – if one list had ranked The Black Keys a little lower, we’d have one here. But it doesn’t matter, since we’ve already rejected it for failing to elect the Condorcet Winner. FAIL.

Approval Voting

Here’s one of those ties now. In approval voting, every album which appears on a list at all gets 1 point, and the album with the most points is #1. (Second most points is #2, etc.)

So we start off with a 2-way tie for first, followed by a 3-way tie for third:

9 Arcade Fire – The Suburbs
9 LCD Soundsystem – This Is Happening
8 Beach House – Teen Dream
8 The National – High Violet
8 Vampire Weekend – Contra

This is because we defined “approval” as “anywhere in the year end list”. In an actual approval vote, the voters would know beforehand how the votes would be counted and probably be more selective in who they vote for. These lists are not really saying “any of these 25 or 50 albums would be ok by us as Album of the Year”. Really, they would probably vote for their top 3 or so (or some would vote for their top 3, some for their top 5, some would vote only for their favourite, etc.) If everyone approved only their top 3, we’d get:

6 Arcade Fire – The Suburbs
4 Kanye West – My Beautiful Dark Twisted Fantasy
2 Beach House – Teen Dream
2 Deerhunter – Halcyon Digest
2 The Black Keys – Brothers
2 These New Puritans – Hidden
1 Caribou – Swim
1 Elton John and Leon Russell – The Union
1 Gil Scott-Heron – I’m New Here
1 John Grant – Queen of Denmark
1 LCD Soundsystem – This Is Happening
1 MGMT – Congratulations
1 Plan B – The Defamation of Strickland Banks
1 Robert Plant – Band of Joy
1 The National – High Violet

So now we have Arcade Fire at #1, Kanye West at #2 (both seem reasonable) and a 4-way tie for #3. And a 9-way tie for #7. And no way to rank anything after that.
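
Both counts above are the same tally with a different cutoff. A minimal sketch, assuming each ballot is a ranked list of album names; the `top_n` parameter and the toy ballots are hypothetical:

```python
from collections import Counter

def approval_counts(ballots, top_n=None):
    """One point per approved album per ballot.

    top_n=None approves the whole list; otherwise only the first top_n entries.
    """
    tally = Counter()
    for ballot in ballots:
        approved = ballot if top_n is None else ballot[:top_n]
        tally.update(set(approved))  # set() guards against duplicated entries
    return tally

def tied_groups(tally):
    """Group albums by score, highest first, so the ties are easy to see."""
    groups = {}
    for album, score in tally.items():
        groups.setdefault(score, []).append(album)
    return [(score, sorted(groups[score])) for score in sorted(groups, reverse=True)]

# Hypothetical ballots, best album first.
toy_ballots = [
    ["A", "B", "C", "D"],
    ["B", "A", "E"],
    ["C", "A", "B"],
]

print(tied_groups(approval_counts(toy_ballots)))           # whole list approved
print(tied_groups(approval_counts(toy_ballots, top_n=2)))  # top 2 only
```

With the whole list approved, A and B tie for first in this toy data; cutting off at the top 2 separates them, but a tighter cutoff also leaves more albums with no votes at all.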

In a real election, Approval Voting is better than First Past the Post because it has less need for tactical voting – for instance, take Rough Trade. The top of its Best Of list looks nothing like anyone else’s. If Rough Trade were a voter trying to actually influence an election, they would know (based on polls and publicity) that voting for Caribou, or Gil Scott-Heron, or These New Puritans, was useless – they have no hope of winning. They might even have picked up enough from the media to know that it’s shaping into a contest between Arcade Fire (who they rank 21) and Kanye West (who they hate – they didn’t even rank him). So they might be tempted to hold their nose and vote for Arcade Fire just to make sure Kanye West doesn’t win. It would be a tough choice, though, because what if their preferred candidates have a lot of underground support that isn’t getting media attention? With Approval Voting, Rough Trade could vote for their top 3 or 5 (or however many they wish) to show their support, plus throw in a vote for Arcade Fire just to make sure they have at least one vote that isn’t wasted. (You could say that this is still voting tactically, but Approval Voting at least gives more and better options for tactical voting.)

However, Approval Voting isn’t guaranteed to elect the Condorcet Winner – it depends entirely on how the voters choose to define “approval”. The various preferential ballot methods are clearly better at selecting the correct winner, because they let each voter give more information about their preferences. To balance this, Approval Voting is much easier to explain and count – you don’t even need a computer to count the ballots! So for a general election it may be a fair choice.

Regardless, it doesn’t work for our purposes, due to the number of ties we get when there are so many more candidates than there are ballots (which wouldn’t be a problem in a real election). FAIL.

Smith/Minmax

This one does need a computer to count. Using the ballots.txt file we generated in Part 3, we generate results with:

voteengine.py -m s//minmax < ballots.txt > sminmax-data.txt

This will think for a minute or two and then spit out “sminmax-data.txt”, a file containing a bunch of data about how it counted the votes, ending with the final results, in a line in the same format as the ballot:

71 > 41 > 84 > 167 > 196 > 230 > 256 > 258 > 119 > 1 > 20 > 171 > 26 > 73 > 188 > 214 > 3 > 208 > 37 > 247 > 23 > 15 > 162 > 114 > 217 > 240 > 201 > 154 > 142 > 102 > 161 > 143 > 116 > 13 > 185 > 78 > 147 > 62 > 183 > 220 > 223 > 137 > 210 > 88 > 18 > 244 > 67 > 118 > 211 > 81 > 259 > 6 > 86 > 229 > 122 > 197 > 47 > 180 > 257 > 108 > 145 > 35 > 233 > 176 > 141 > 101 > 231 > 56 > 58 > 212 > 103 > 129 > 40 > 204 > 24 > 42 > 252 > 195 > 39 > 187 > 253 > 239 > 218 > 98 > 105 > 21 > 155 > 138 > 33 > 205 > 79 > 243 > 100 > 16 > 19 > 226 > 199 > 224 > 38 > 131 > 169 > 173 > 207 > 68 > 163 > 134 > 96 > 245 > 120 > 72 > 193 > 8 > 249 > 9 > 66 > 123 > 209 > 90 > 153 > 255 > 236 > 10 > 202 > 178 > 121 > 127 > 242 > 53 > 82 > 159 > 237 > 182 > 2 > 30 > 189 > 250 > 148 > 44 > 126 > 170 > 221 > 177 > 29 > 12 > 248 > 174 > 112 > 92 > 50 > 109 > 139 > 151 > 34 > 94 > 146 > 52 > 99 > 117 > 89 > 110 > 28 > 140 > 150 > 45 > 190 > 251 > 106 > 65 > 104 > 175 > 203 > 46 > 61 > 95 > 149 > 70 > 115 > 191 > 235 > 135 > 234 > 85 > 22 > 17 > 184 > 130 > 132 > 260 > 43 > 136 > 213 > 200 > 80 > 111 > 157 > 216 > 181 > 14 > 198 > 164 > 238 > 49 > 97 > 246 > 25 > 75 > 63 > 36 > 133 > 107 > 32 > 124 > 165 > 91 > 179 > 11 > 254 > 158 > 77 > 54 > 222 > 74 > 232 > 125 > 152 > 113 > 215 > 144 > 83 > 5 > 31 > 227 > 186 > 64 > 57 > 168 > 76 > 69 > 261 > 27 > 228 > 93 > 225 > 206 > 241 > 166 > 59 > 192 > 160 > 4 > 60 > 128 > 219 > 51 > 87 > 48 > 194 > 156 > 172 > 7 > 55

Those are the candidate numbers of each album. To get a human-readable list out of that, we need to look up the name of each candidate. Remember that when we generated ballots.txt, we also saved the candidate names to candidates.txt – the name of candidate 1 is on line 1, candidate 2 is on line 2, etc. So we can write another simple Python script that reads candidates.txt and stores a map of candidate number to candidate name, and then reads the last line of sminmax-data.txt and looks up each candidate name.

The Python script.

Save this script as “interpret-results.py”, and feed the last line of sminmax-data.txt into it with:

tail -n 1 sminmax-data.txt | ./interpret-results.py > sminmax-results.txt
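
A minimal sketch of what such a lookup script might look like. It assumes candidates.txt holds one name per line and that the result line contains only numbers separated by ">", as in the output above. The function names are hypothetical.

```python
#!/usr/bin/env python3
import sys

def load_candidates(path):
    """Map candidate number (the 1-based line number) to candidate name."""
    with open(path, encoding="utf-8") as f:
        return {n: line.rstrip("\n") for n, line in enumerate(f, start=1)}

def interpret(result_line, names):
    """Turn a line like '71 > 41 > 84' into the matching names, in order."""
    numbers = [int(tok) for tok in result_line.split(">") if tok.strip()]
    return [names[n] for n in numbers]

def main(candidates_path="candidates.txt"):
    """Read one result line from stdin and print one album name per line."""
    names = load_candidates(candidates_path)
    for name in interpret(sys.stdin.readline(), names):
        print(name)
```

Ending the file with a call to `main()` would let it act as the filter in the pipeline above.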

Now open up sminmax-results.txt and look at the list:

Broken Bells – ‘Broken Bells’
John Grant – ‘Queen of Denmark’
Abe Vigoda – ‘Crush’
Against Me! – ‘White Crosses’
Ali Farka Toure & Toumani Diabate – ‘Ali & Toumani’
Allo Darlin’ – ‘Allo Darlin’
Aloe Blacc – ‘Good Things’
Am – ‘Future Sons And Daughters’
Antony and the Johnsons – ‘Swanlights’
Arcade Fire – ‘The Suburbs’
Ariel Pink’s Haunted Graffiti – ‘Before Today’
Avey Tare – ‘Down There’
Avi Buffalo – ‘Avi Buffalo’
Band of Horses – ‘Infinite Arms’
Baths – ‘Cerulean’
Beach Fossils – ‘Beach Fossils’
Beach House – ‘Teen Dream’
Bear In Heaven – ‘Beast Rest Forth Mouth’
Belle and Sebastian – ‘Write About Love’
Ben Folds & Nick Hornby – ‘Lonely Avenue’
Best Coast – ‘Crazy For You’
Big Boi – ‘Sir Lucious Left Foot: The Son Of Chico Dusty’
Big K.R.I.T. – ‘K.R.I.T. Wuz Here’
Black Angels – ‘Phosphene Dream’
Black Rebel Motorcycle Club – ‘Beat The Devil’s Tattoo’

Whoa. That ain’t right.

Broken Bells was ranked 5th by NPR and 11th by Rough Trade. And that’s it. There’s no way they should be anywhere near the top 3.

John Grant was ranked 1st by Mojo – so he’s got that going for him – and 6th by Q. And that’s it. Again, no way he should be ahead of Arcade Fire and Kanye West.

After that it starts spitting out albums in alphabetical order. Remember that in ballots.txt we specified that we’d break ties alphabetically. So this indicates that all the remaining albums are tied for 3rd – or tied for last, depending how you look at it. That’s not useful at all.

This looks to me like VoteEngine’s s//minmax algorithm is buggy, because these results are just too weird to explain any other way. But life’s too short to debug it when there are 9 other algorithms to test out. FAIL.

Tomorrow, I’ll start going through algorithms that work fairly well, and start looking for the best.


Syndicated 2011-01-10 02:58:05 from I Am Not Charles
