Recent blog entries for danf

14 Oct 2002 (updated 14 Oct 2002 at 05:21 UTC) »

'Multi-Dimensional' Trust Metrics
I think what some people here are feeling around for are 'multi-dimensional trust metrics', or even better - trust metrics built on a quantale which is a little more interesting than the positive real numbers.

An immediate advantage of multi-dimensional trust metrics is that you can rate several different qualities simultaneously: someone's theoretical skills, say, or their skills in Perl or Python. Then, if you like, you can project these ratings onto a single scale in any way you please. For example, if you are interested in people who are good at both Perl and Python, you can take PERL+PYTHON, PERL*PYTHON, or any other increasing function of the individual variables as your measure of how good a person is on that combined scale.
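As a small sketch of the projection idea (all names and numbers invented for illustration):

```python
# Hypothetical multi-dimensional ratings: each person is rated on several axes.
ratings = {
    "alice": {"perl": 0.9, "python": 0.4, "theory": 0.7},
    "bob":   {"perl": 0.5, "python": 0.8, "theory": 0.2},
}

# Project onto a single scale with any increasing function of the components:
additive = {p: s["perl"] + s["python"] for p, s in ratings.items()}        # PERL+PYTHON
multiplicative = {p: s["perl"] * s["python"] for p, s in ratings.items()}  # PERL*PYTHON
```

The product penalises someone who is strong in one language but weak in the other more harshly than the sum does, but any monotone combination is a legitimate projection.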

Why do we need to think about more than just multiple dimensions (which could just as easily be handled with several separate metrics)? Here's an example:

In a rating system like Advogato, the fact that I trust someone's programming skills doesn't necessarily mean that I trust them to be a good judge of the qualities or trustworthiness of others. This is relevant because the degree to which I trust the people that they trust is actually determined by how much I trust their ability to rate people. Thus the transitivity of trust in this system is not simply the 'product' of the transitivities of two non-interacting metrics.
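A tiny sketch of the interaction (all names and numbers invented): each edge carries both a skill rating and a judging rating, and it is the judging component, not the skill component, that governs how a rating propagates.

```python
# (rater, ratee) -> {"skill": how good I think they are,
#                    "judge": how much I trust their ratings of others}
trust = {
    ("me", "alice"):    {"skill": 0.9, "judge": 0.3},
    ("alice", "carol"): {"skill": 0.8, "judge": 0.7},
}

def transitive_skill(a, b, c):
    """a's inferred skill rating of c via b, weighted by a's trust in b as a judge."""
    return trust[(a, b)]["judge"] * trust[(b, c)]["skill"]
```

Even though I rate alice's skill highly (0.9), her rating of carol only counts for 0.3 * 0.8 = 0.24 here, because I barely trust her judgement; the two dimensions interact rather than propagating independently.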

For people who are really interested - you can of course do logic on quantales!

Advogato's trust metric
I don't know if anyone else has noticed, but the definition of Advogato's trust metric doesn't determine a unique answer. In most situations there will be many different maximal flows, each of which certifies a different set of people. Ford-Fulkerson makes no claims about which maximal flow you end up with, and neither does the description of Advogato's metric calculation. I can't be bothered reading through the code to see what actually happens, so I'll just ask you learned people to do it for me ;)
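To make the ambiguity concrete, here's a toy sketch (the graph is invented and far simpler than Advogato's actual construction): two flows on the same network, both maximal, certifying different people.

```python
# Capacities: the seed "s" can push 1 unit to "a", who can pass it on to
# either "b" or "c". Call a node certified if it receives positive flow.
cap = {("s", "a"): 1, ("a", "b"): 1, ("a", "c"): 1}

def is_feasible(flow):
    """Check capacities are respected and flow is conserved at node 'a'."""
    within_cap = all(0 <= flow.get(e, 0) <= c for e, c in cap.items())
    conserved = flow.get(("s", "a"), 0) == (
        flow.get(("a", "b"), 0) + flow.get(("a", "c"), 0))
    return within_cap and conserved

def certified(flow):
    return {v for (u, v), f in flow.items() if f > 0}

flow1 = {("s", "a"): 1, ("a", "b"): 1}  # maximal: saturates the s->a bottleneck
flow2 = {("s", "a"): 1, ("a", "c"): 1}  # also maximal, same value of 1
```

Both flows have value 1, the most the bottleneck allows, yet the first certifies {a, b} and the second {a, c}; nothing in the definition picks between them.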

14 Oct 2002 (updated 14 Oct 2002 at 06:34 UTC) »

I hate being sick. Happily, I'm mostly better now.

Despite ignoring my email to him a few months back about making a math encyclopedia, Raph seems to be more interested in one now. Raph, for the ultimate reference on categories try Categories for the Working Mathematician by Saunders Mac Lane. This book tells you the really important stuff and uses lots of examples from all over mathematics. What you might also be interested in is toposes, which are basically a category-theoretic response to foundationalism in mathematics, an attempt to express set theory categorically. Here's a five minute introduction: Topos Theory in a Nutshell.

Went to the BADMath day yesterday. Lots of fun with really good talks. One particular talk, by Herbert Wilf, about finding closed-form solutions to summations, is worth mentioning here. The system is nice, fast, and works well. Positive results come with one-line proofs (checkable by eye, or by your favourite symbolic algebra package), but for negative answers you have to trust the programme. He also mentioned the Encyclopedia of Integer Sequences. Even though many people use it, they don't seem to realise that it is a conjecture generator: you give it a sequence and it conjectures that yours is a particular known sequence (actually it usually offers several conjectures).

The Math Encylopedia
Which takes us back to the Dictionary/Encyclopedia of Math. QED is probably the pin-up for this kind of project, and everyone interested in such a system should visit the site. It seems to have been abandoned a number of years ago, but obviously some people put serious thought into what such a system would entail. As far as I know it was still in the planning stages when it died or went into hibernation.

Paul pointed out to me that the system is so big, and will require so much computation, that it will be difficult or impossible (or at least not optimal) to host it at a single location (or set of mirrors). Robustness is also an issue, so having the data and computation spread out seems like an excellent idea. Back to this again soon.

Which logic to choose?
What about the basic logic? There is some discussion on the QED site about which logic should be chosen as the basis on which to build the proofs. Which logic and axioms should we choose? Do you like the Axiom of Choice, or its negation? Do you like the law of the excluded middle? What do you base your proofs on anyway: ZFC, Peano arithmetic, toposes? Rather than fight about which is best, it seems much easier to simply let people choose whichever one they like best at the time.

This immediately raises the question of what we then mean by a "True Statement" and a "Correct Proof". Some (very easy to understand) statements are true in ZFC (the Zermelo-Fraenkel axioms plus the Axiom of Choice) but not in 'ZFnoC' (ZF with the negation of Choice).

Many people here seem to like thinking about trust metrics, so to put it in those terms: what we want is multiple (and customisable) metrics which the user can select. For example, if I want to know whether a particular theorem is true under the assumptions of ZFC, I'm simply searching for a proof of the theorem which is trusted by the axiom-system object 'ZFC'. You might also want to search for all theorems which use certain ideas or definitions (directly or indirectly) and are true in both ZFC and ZFnoC.
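A minimal sketch of such a query, assuming each theorem object simply records the axiom systems under which a trusted proof exists (the index entries below are illustrative; the basis theorem does genuinely require Choice):

```python
# Hypothetical index: theorem -> axiom systems with a trusted proof of it.
theorems = {
    "every vector space has a basis": {"ZFC"},    # equivalent to Choice
    "2 + 2 = 4": {"ZFC", "ZFnoC", "PA"},          # true in all of these
}

def provable_in(required_systems):
    """Theorems carrying a trusted proof in all of the required systems."""
    return {t for t, systems in theorems.items() if required_systems <= systems}
```

So `provable_in({"ZFC", "ZFnoC"})` returns only statements whose truth doesn't depend on Choice, while `provable_in({"ZFC"})` returns everything proved under Choice.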

Blind and not-so blind faith
Now to the distributed nature of an encyclopedia. We're interested in verifying proofs of statements which we have found. How can we do this? For a fully machine-checked proof, you need to trust the machine and the programme which are telling you that a statement is true.

However much dancing you do up high, at some point down at the bottom you have to place some blind faith in the person who wrote and checked the original algorithm, in the compiler being correct, in the executable not being corrupted en route to the executing machine, in the machine executing it correctly, and in the answer being communicated to you correctly. Of course, some of these can be trusted more than others.

To minimise these uncertainties, you can encrypt and error-check when transporting information, and you might use a 'proven' virtual machine running on a proven operating system running on a respected piece of hardware. You might also check the programme yourself, compile it yourself (did you check the compiler yourself?), and run it on your own 'trusted' machine.

If you're willing to accept a tiny bit more uncertainty, however, you might as well trust that a crypto-signed message coming from the AMS server, saying that a statement is true or a proof is correct, is sufficiently trustworthy. If you're not quite that easily convinced, you might ask several of your most trusted sites to verify the statement, and then verify it yourself with a programme you trust for some reason; but where did you get that programme from? Who or what checked it, and how much do you trust them?

The point of all of this paranoia is not to be paranoid, but to recognise that you will at some level have to have some blind faith. How blind it is is up to you. No matter how strict or lax you want to be at any time, it seems sensible to quantify how much you trust the statement/proof in front of you (or the sources which are certifying the statement).

So we come again to trust metrics. This time we're interested in how much we trust a proof of a mathematical statement, or a proof/certificate saying that a particular piece of code has been proven safe. These issues are not new (for paranoid computer users), but we now have the possibility of being much more sure than ever before (and sure of how sure we are).

For example, your reasoning about your trust in a proof might go as follows. There are ten major steps in this proof. I have personally checked that if each of these is correct then the proof is correct (how tired am I today?). My friend checked them too, as did various other mathematicians. I have a programme X which has been certified correct by source Y, and programme X says that these ten statements together give me the theorem. Source Y consists of a certificate signed by the AMS server on a particular date, purporting to run a well-accepted and hand-checked proof-checker. For each of the ten major sub-results used in the ten steps, I look them up in the encyclopedia (and download certificates for them). If I can't find a certificate I trust enough, then I run my own proof-checker over the proof; hopefully this gives me a higher degree of trust.

Combining all these degrees of trust is not as simple as it might seem at first. If you only trust a particular proof-checker W to be 99% reliable, then however many different certificates you collect that all use this programme, you are still only 99% sure. If on the other hand you also trust a proof-checker Z to be 99% reliable (and sufficiently independent of W), then you might end up being 100(1-(1-99/100)^2) = 99.99 percent sure that the statement is correct.
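The arithmetic in that paragraph, as a sketch (the reliability figures are just the ones used above):

```python
def combined_confidence(checker_reliabilities):
    """Probability that at least one independent checker would catch an error:
    1 minus the probability that every checker is simultaneously wrong."""
    p_all_wrong = 1.0
    for p in checker_reliabilities:
        p_all_wrong *= (1.0 - p)
    return 1.0 - p_all_wrong

same_checker_many_times = combined_confidence([0.99])    # re-runs add nothing: 0.99
two_independent_checkers = combined_confidence([0.99, 0.99])  # 1-(0.01)^2 = 0.9999
```

The independence assumption is doing all the work here: certificates that share a checker (or a compiler, or a machine) are correlated, so their failures don't multiply.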

4 Oct 2002 (updated 4 Oct 2002 at 04:33 UTC) »
An Encyclopedia of Mathematics

So I've been thinking more and more about an 'encyclopedia of mathematics' recently. It's been bouncing around in my head ever since I heard about Wikis, but now I'm actually going to start doing something about it.

The aim of such an encyclopedia would be twofold: first, to be a quick and easy reference for any mathematical concepts that come up (saving lots of time for anyone learning or using mathematics that is new to them); and second, to serve as a base upon which to build complete computer-verifiable (and also human-readable) proofs, and to run automated proof-checkers, proof generators and conjecture generators.

Sound useful?

By 'encyclopedia of maths' I mean a large system of objects, each representing a mathematical idea such as a definition, theorem, proof, or even an explanation or article. Each of these is linked to the objects representing the ideas upon which it relies (either explicitly or implicitly). An object might be an axiom, a definition, a theory, a theorem, a conjecture or a proof.

For example, a theorem object would probably contain

  • a natural-language statement, with a reference to another object in the encyclopedia for each technical term used.
  • references to any proofs of it which are in the encyclopedia.
  • a machine-understandable statement of the result, again with references to every mathematical object involved.
  • perhaps a natural-language comment, including information such as the source of the theorem, papers it appears in, and the subject(s) it forms a part of (with references, as always).
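One possible shape for such a theorem object, sketched in Python (all field names are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Theorem:
    statement: str                                 # natural-language statement
    term_refs: list = field(default_factory=list)  # a link for each technical term
    proofs: list = field(default_factory=list)     # references to proof objects
    formal_statement: str = ""                     # machine-understandable form
    comment: str = ""                              # source, papers, subject area

t = Theorem(
    statement="Every group of prime order is cyclic.",
    term_refs=["group", "prime", "cyclic"],
)
```

The term references are what turn a pile of statements into a web: following them recursively gives you everything a result (explicitly or implicitly) relies on.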

Another useful feature of such a system would be a thesaurus. Mathematical notation varies wildly across the sciences (try comparing the notation for groups used by physicists and mathematicians), so a quick and painless way to translate from one language/notation to another and back would allow people to actually read material written by people from a different discipline.
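The thesaurus could start as nothing fancier than a lookup table per pair of disciplines; a toy sketch (the particular mappings are invented for illustration):

```python
# Hypothetical physics-to-maths vocabulary table.
physics_to_maths = {
    "dagger": "adjoint",
    "c-number": "scalar",
}

def translate(text, table):
    """Naive word-for-word translation between notational dialects."""
    for src, dst in table.items():
        text = text.replace(src, dst)
    return text
```

A real thesaurus would of course need to work on parsed formal statements rather than raw strings, but the object model is the same: a dictionary between dialects.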

There's actually been quite a lot of work done on most of the components which would be required for such a system. To try and encourage more people to take an interest in this sort of project I'm putting up a little website about it (with lots of juicy links): An Encyclopedia of Mathematics
