Advogato: Blog for adubey

Random thoughts... does anyone else find advogato's interface a bit lacking?

I often have to click through 2-3 pages to do common things like posting a diary entry. In general, the things I want to do, or do often, feel harder than they should be. This feeling might go away once I get used to the site, but it probably doesn't make the site very welcoming to new users.

Other random thoughts... sometimes people post diary entries with interesting ideas that are worth discussing. While email is a suitable route for that, wouldn't it be cool if you could reply to those entries directly?

Even more random... first there were linear news readers, then there were "threaded" news readers. Threads are essentially trees, where each node has one parent. Now, the web goes one better than threaded discussions: we have directed graphs instead of trees. But short of a WikiWikiWeb-type CGI program, the web is geared to "read this" rather than "comment on this". Mightn't it be cool if you had some kind of "threaded" system where posts may have more than one parent? Someplace where you could bring separate discussions that are mutually relevant together instead of forever splitting them. For example, you could join a discussion hanging off an article with one hanging off a diary entry. Ah... if only I had the time... anyways, I'm getting far too random for my own good.
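To make the "more than one parent" idea concrete, here is a minimal sketch in Python. Everything in it (the `Post` class, the `reply` and `ancestors` helpers, the sample posts) is hypothetical and only meant to illustrate the data structure, not any existing system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Post:
    """One post in a discussion. Unlike a threaded news reader, a post
    may list several parents, so the discussion is a directed graph."""
    post_id: str
    text: str
    parents: List["Post"] = field(default_factory=list)
    children: List["Post"] = field(default_factory=list, repr=False)


def reply(post_id, text, *parents):
    """Create a post and hang it off every parent it responds to."""
    post = Post(post_id, text, list(parents))
    for parent in parents:
        parent.children.append(post)
    return post


def ancestors(post):
    """Return the ids of every post reachable through parent links,
    each listed once, in sorted order."""
    seen = set()
    stack = list(post.parents)
    while stack:
        current = stack.pop()
        if current.post_id not in seen:
            seen.add(current.post_id)
            stack.extend(current.parents)
    return sorted(seen)


# Two separate discussions: one under an article, one under a diary entry.
article = reply("article", "An article")
diary = reply("diary", "A diary entry")
a1 = reply("a1", "Comment on the article", article)
d1 = reply("d1", "Comment on the diary entry", diary)

# A single post that joins both discussions by naming two parents.
merged = reply("merged", "These two threads are about the same thing", a1, d1)

print(ancestors(merged))
```

With a tree, `merged` would have to pick one of the two discussions; here it sits under both, and anyone reading either thread can reach it.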

Anyways, my probabilistic parser is stuck right now; the training data is in an ass-backwards format in which words aren't necessarily given a part-of-speech. In other words, rules are in the form Nonterminal->(Terminal|Nonterminal)* rather than Nonterminal->Terminal* | Nonterminal->Nonterminal*. Of course, I could split things up myself (by putting the grammar in CNF), but then I will lose some generalizations (i.e. in AP-> NP and NP | AP-> VP and VP NP, each "and" will get a different non-terminal, but I want only one. Sometimes the POS will be different, so I can't say "always make 'and' an adjunct".) Grr... I could use a part-of-speech tagger, but then I have to 1) link the 'C' POS tagger to ML or 2) play around with the training data's nonterminals to be compatible with the tagger's tags. Alternatively, I could shell out $2500 for a good training set...
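Here is a rough sketch of the "and" problem, under one reading of it. It is in Python rather than ML, it only does the terminal-lifting step of a CNF conversion (no binarization), and the names (`RULES`, `lift_terminals`, the `T1`/`T2` preterminals) are made up for illustration. It assumes that lowercase symbols are words and uppercase symbols are nonterminals.

```python
from itertools import count

# Toy grammar in the mixed format: right-hand sides mix nonterminals
# (uppercase) with bare words (lowercase).
RULES = [
    ("AP", ["NP", "and", "NP"]),
    ("AP", ["VP", "and", "VP", "NP"]),
]


def is_terminal(symbol):
    # Assumption for this sketch: words are lowercase, nonterminals are not.
    return symbol.islower()


def lift_terminals(rules, share_by_word):
    """Replace each bare word with a preterminal.

    Returns (rewritten_rules, lexical_rules). With share_by_word=False
    every occurrence of a word gets a fresh preterminal; with
    share_by_word=True all occurrences of a word share one.
    """
    fresh = count(1)
    by_word = {}
    new_rules, lexical = [], []
    for lhs, rhs in rules:
        new_rhs = []
        for symbol in rhs:
            if not is_terminal(symbol):
                new_rhs.append(symbol)
            elif share_by_word and symbol in by_word:
                new_rhs.append(by_word[symbol])
            else:
                preterminal = f"T{next(fresh)}"
                by_word[symbol] = preterminal
                lexical.append((preterminal, symbol))
                new_rhs.append(preterminal)
        new_rules.append((lhs, new_rhs))
    return new_rules, lexical


per_occurrence = lift_terminals(RULES, share_by_word=False)
shared = lift_terminals(RULES, share_by_word=True)

print(per_occurrence[1])  # lexical rules when each "and" is lifted separately
print(shared[1])          # lexical rules when "and" is keyed by the word
```

Lifting per occurrence gives two unrelated preterminals for "and", which is the lost generalization. Keying on the word gives one, but that is the same as saying "and" always has a single part-of-speech, which isn't true of the data.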
