My trust metric has gone down a storm on LJ - about 18,000 people have used it, and that would probably be more if the tiny machine it's running on was beefier. I've made thesource code available now.

Name: Paul Crowley

Member since: 2002-07-02 13:54:52

Last Login: N/A

Homepage: http://www.ciphergoth.org/

**Notes:**

I'm a cryptologist,and a longstanding user and advocate of free software but my contribution in lines of code is negligible. You'll probably find my pages about crypto the most interesting.

My main diary is on LiveJournal; this diary is for taking part in Advogato discussions.

**PGP/GPG key fingerprint:** 3D8298B6 C0721685 53AF021D 4D2917FA

Syndication: RSS 2.0

My trust metric has gone down a storm on LJ - about 18,000 people have used it, and that would probably be more if the tiny machine it's running on was beefier. I've made thesource code available now.

I proposed a simple trust metric on LiveJournal, designed to work even when you have only partial information about the trust graph.

Post comments there - I don't read Advogato diaries any more because you can't comment on them.

Bram writes about partial busy-work functions. I need to know more about the requirements for generation and verification before I can see how to attack it. The only requirements I have are these:

- There is a graph with n nodes
- Each node is associated with a puzzle, and there's a puzzle for the whole graph
- Each node puzzle should take k work to solve
- The whole graph puzzle is easy to solve if you have the solution for any two nodes on the same arc.
- The solution is the same no matter which arc you use.

I can see several trivial solutions, but they have these disavantages: it takes as much work to verify a solution ans it does to generate it, and generating all the puzzles takes nk time. Usually you want a busy-work puzzle to be trivial to generate and verify, eg "find a 160-bit value that hashes to something with the prefix 72f3ab81". Bram, what are your reqirements with this?

**Bayes and scoring**

I've got a better grasp on what these probabilities really mean, and why (as I wrote last time) (a o_s b) = (a o b o (1-s)). If anyone's reading, mail me, I'd be happy to write it up...

I wonder if anyone replied to what I last wrote about Advogato diaries and LiveJournal? I didn't come back in time to check everyone's diaries and find out, so I don't know. If you replied, can you mail me and let me know?

Needless to say, I don't have this problem with LiveJournal, because replies are attached to the comment they reply to and I receive email alerts about them.

**Bayes and scoring**

raph discusses Paul Graham's plan to use Bayesian statistics to defeat spam and points out that his combiner is isomorphic to a linear one. It turns out that Graham's combiner assumes that the a priori probability that a mail is spam is 0.5, which isn't right, but you can easily fix the problem without sacrificing linearity.

For those who like me were confused by what "a probability of 0.99 spam" means, it means Pr(S|A) where S is the event "the mail is spam" and A is the event "the mail contains the word viagra". The extent to which A and B together indicate that the mail is spam is given by

Pr(S|A and B) = (a o b) = ab/(ab + (1-a)(1-b)) (where a = Pr(S|A))

but this formula is only correct if you assume that

Pr(S) = Pr(not S) = 0.5

and that the events are independent apart from their dependence on whether or not it's spam:

Pr(A and B|S) = Pr(A|S) Pr(B|S)

Pr(A and B|not S) = Pr(A|not S) Pr(B|not S)

Assuming that the events are independent is fundamental to the whole approach, but assuming Pr(S) = 0.5 seems to me like a mistake. Interestingly, though, if my math is right then the correct combiner for arbitrary a priori Pr(S) = s is (a o_s b) = (a o b o (1-s)) (do the math, or mail me and ask me to post it here). Thus if you define f(x) = log(x) - log(1-x) as before so that f(a o b) = f(a) + f(b), then you can define

f_s(x) = f(x) - f(s)

and you find

f_s(a o_s b) = f(a o b o (1-s)) - f(s)

= f(a) + f(b) + f(1 - s) - f(s)

= f(a) + f(b) - f(s) - f(s)

= f_s(a) + f_s(b)

This makes sense if you think about it: if a = s, then Pr(S|A) = Pr(S) so A tells you nothing about whether or not it's spam, and f_s(a) = f(s) - f(s) = 0 so it contributes nothing to the score either way.

So you don't need to assume that the a priori probability of spam is 0.5 to make raph's linear approach work.

Who here (apart from skx) keeps a diary on LiveJournal? I often find myself wishing that the diaries kept here were there. Despite diary rating and the trust metric, I find it in practice much easier to find diaries I'm interested in there than here. I miss the "friends list" facility that allows me to directly choose which diaries to read. And I miss over and over again the facility to make comments on diary entries; where I could make a useful and informative comment, the fact that it will have to appear in my own diary rather than as a comment on another entry puts me off.

Is that feature missing simply because it ain't written yet, or is it a deliberate piece of social engineering? I'm sure there must be some who prefer it that way, but I'd be interested to know why.

I quite often think about what would be needed to really unite all the blogs in the world, so that I can have one "friends page" that follows everything I'm trying to follow. At the moment there's a severe network effect that makes a real LiveJournal account much more useful than accounts with any other provider, even those (eg uJournal) which use the LJ code base. RSS is a component, but it would need a blog-specific module to deliver the same level of usefulness.

**Graph Isomorphism**

I think there's widespread doubt about the hardness of this problem; ISTR caveats along these lines in descriptions of the associated zero-knowledge proof.

ciphergoth certified others as follows:

- ciphergoth certified mhamrick as Journeyer
- ciphergoth certified Stevey as Journeyer
- ciphergoth certified daw as Master

Others have certified ciphergoth as follows:

- Denny certified ciphergoth as Journeyer
- Stevey certified ciphergoth as Journeyer
- madhatter certified ciphergoth as Journeyer
- leviramsey certified ciphergoth as Apprentice
- Omnifarious certified ciphergoth as Journeyer

[ Certification disabled because you're not logged in. ]

**New HTML Parser**: The long-awaited libxml2 based HTML parser
code is live. It needs further work but already handles most
markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!