5 Aug 2002 graydon   » (Master)

proofs and programs

it is good to think of programs as logical terms, because you can then consider the "specification" or "interface" of a program as some logical term which is implied by another term, say the implementation of the program or the assembly language of it. these are all terms within the same logic, some using a restricted vocabularly but all logically related. the act of "writing a program" is then viewed as "locating a more syntactically restricted term which logically implies the more abstract specification".

this is a useful view since it eliminates the pointless distinction between specification and programming language. you view the programming language as the subset of the specification language that you happen to have hardware which can follow.

for example, say I have a specification / program of the form:

	for i from 1 to 10: x[i] := x[i] + 1;

there is no "direct" encoding of that sequence of symbols as something my hardware will try to follow. but there is another program which logically implies the first one:

	load r1,1
top:	load r2,r1
	subtract r2,r2,10
	branch-if-zero r2,end
	load r3,x
	load r2,r1(r3)
	increment r2
	store r1(r3),r2
	increment r1
	jump top
end:	

and this, given a small change-of-encoding into machine words, is a specification my hardware will follow. but note the relationship: the pseudo-assembly implies the pseudo-C, not the other way around. it implies it, not in the informal sense of "it's a reasonable translation", but rather in the formal sense of a computer with memory that can assume various states, and control that passes from instruction to instruction, with the post-state of one instruction being the pre-state of the next. what if your logic doesn't have the symbols "jump" or "for" or ":=" in it? well, it's time to change to a logic that does.

document formats

three things worth pointing out wrt. document formats:

  1. editability is a factor worth paying close attention to. many users want to be able to "freeze" a document and make it non-editable, either in the name of preserving "pristine content" or (benevolently) by composing content using a higher-level abstraction than the delivery format.

    this has similar moral character to compiling source code, insofar as derived documents are very difficult to produce. many people think that derived documents never happen in the real world; I would advise such people to consider how much worse the web would be without "view source".

  2. machine-readability is also an important factor. if you assume many document retrieval tasks in the future will be partially machine-assisted (beyond merely presentation) then you've got to keep things like citeseer, google, and glimpse in mind when working on document formats.

  3. for non-editable, mostly presentation-oriented documents, the djvu format is very good.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!