Advogato: Blog for Bram

InstallShield

InstallShield is still a 16-bit application. This means that on long-running 32-bit Windows machines (like mine) by the time you try to install anything which uses InstallShield the 16-bit VM will inevitably have long since crashed and the installer will simply not work.

Those of you using InstallShield should be aware that a very large fraction of all your end user installs are failing due to a problem which can't be fixed without InstallShield completely rewriting their software or you switching to a different installer. I recommend Nullsoft Installer.

Negative Certs

In any trust metric in which you have negative certs there's an attack in which an attacker creates many fake identities and certifies them. The fake identities may get negative certs attached, but it will still be possible to make more. Therefore, any trust metric which includes negative certs must back propogate negativity in its evaluation function to be effective. I just realized this yesterday and haven't put any real thought into how back propogation should be done. Clearly more thought is necessary.

Simple Trust Metric Evaluation

Here's a trust metric evaluation algorithm which fixes the problems of very low confidence levels advogato has right now.

First we must select a paranoia level. This is the assumed probability that any given given peer is compromised. A paranoia level of zero is extremely trusting, while a paranoia level of one never believes anything.

To determine Alice's rating of someone's diary, do multiple runs of the following:

Decide whether to remove each node from the graph with probability equal to the paranoia level.
Trim all dead ends (that is, nodes which have no path to a node which gives a rating) from the graph.
Starting at Alice, do a random walk along trusted edges through non-removed nodes until you hit a node which gives a rating, then stop. It's possible that there is no such path, in which case this run is counted as having failed.

The confidence value is now the fraction of all runs which succeeded. The diary rating can be computed from the distribution of run results, either by taking the median or the mean.

Computation can be greatly sped up by figuring out exact probabilities for each run of removing nodes using linear equations. Disturbingly, it's still randomized, but approximates the exact value reasonably quickly except in the case where confidence is near zero, in which case rating doesn't really matter.

Raph pointed out that this approach involves combining three already known techniques -

random node removal
trimming dead ends
random walk

The above can be used in just about any combination, but using all three has the best properties.

Looking back on Raph's diary I see that I meant to post about this on the 19th of December, after discussing it with him after dinner. I'm a bit behind on my blogging.

Slashdot Trust Metric

Slashdot's comment rating system appears reasonable at first blush, but is deeply flawed in implementation. Happily, there are straightforward ways it could be vastly improved.

First the good part. Moderator points are a good approach, and the strength of them as a technique is largely responsible slashdot's comment system achieving its current level of usability, which to its credit is considerably better than usenet. (I do not mean than as an insult.) In particular, the policy of never giving a particular person more than a few moderator points at a time strongly limits the amount of damage an attacker could do and also allows the size of the community to be very precisely tweaked.

The problems with slashdot aren't so much with what should be there as what shouldn't. The current meta-moderation interface is completely unnecessary. Whenever anyone rates a post up or down, other people have the opportunity to use their moderation points to change that rating, which implicitly meta-moderates the first change. Analysis of that data could be used as the exclusive source of meta-moderation with good results.

Categories for reason of moderation are also pointless. They complicate the UI without adding any new functionality.

Without any data on how it's currently done, I can't comment on specific formulas for deciding when to give people moderation points. Clearly making posts which get modded up should increase likelihood/frequency of getting moderation points, and the same with positive meta-moderation. Coming up with specific formulas seems to be an interesting and soluble problem, but real-world data is necessary.

Perfect Graphs

There's now an algorithm for testing if a graph is Berge in polynomial time. (Read the links for an explanation of the Berge property, it's straightforward but a bit long to repeat here.) Although I haven't read through the papers yet (which I plan to do in my copious free time) I'm under the impression that the algorithm, while distinctly non-trivial, only employs elementary techniques. It's always nice when new results can be understood without extensive background reading.

1 Mar 2003 Bram » (Master)