Name: Matt Brubeck
Member since: 2001-12-24 06:03:41
Last Login: 2010-02-06 00:24:37
Homepage: http://limpet.net/mbrubeck/
Notes:
I am a contributor to Audacity, a free sound editor and recorder. I work as a programmer in Seattle, WA.
You can see my home page for more information, or contact me at mbrubeck@limpet.net.
Finding SI unit domain names with Node.js
I'm working on some ideas for finance or news software that deliberately updates infrequently, so it doesn't reward me for checking or reloading it constantly. I came up with the name "microhertz" to describe the idea. (1 microhertz ≈ once every eleven and a half days.)
As usual when I think of a project name, I did some DNS searches. Unfortunately "microhertz.com" is not available (but "microhertz.org" is). Then I went off on a tangent and got curious about which other SI units are available as domain names.
This was the perfect opportunity to try node.js so I could use its asynchronous DNS library to run dozens of lookups in parallel. I grabbed a list of units and prefixes from NIST and wrote the following script:
var dns = require("dns"), sys = require('sys');
var prefixes = ["yotta", "zetta", "exa", "peta", "tera", "giga", "mega",
"kilo", "hecto", "deka", "deci", "centi", "milli", "micro", "nano",
"pico", "femto", "atto", "zepto", "yocto"];
var units = ["meter", "gram", "second", "ampere", "kelvin", "mole",
"candela", "radian", "steradian", "hertz", "newton", "pascal", "joule",
"watt", "colomb", "volt", "farad", "ohm", "siemens", "weber", "henry",
"lumen", "lux", "becquerel", "gray", "sievert", "katal"];
for (var i=0; i<prefixes.length; i++) {
for (var j=0; j<units.length; j++) {
checkAvailable(prefixes[i] + units[j] + ".com", sys.puts);
}
}
function checkAvailable(name, callback) {
var resolution = dns.resolve4(name);
resolution.addErrback(function(e) {
if (e.errno == dns.NXDOMAIN) callback(name);
})
}
Out of 540 possible .com names, I found 376 that are available (and 10 more that produced temporary DNS errors, which I haven't investigated). Here are a few interesting ones, with some commentary:
To get the complete list, just copy the script above to a file, and run it
like this: node listnames.js
Along the way I discovered that the API documentation for Node's dns module
was out-of-date. This is fixed in my GitHub fork, and I've sent a pull
request to the author Ryan Dahl.
Weekend hack: outline grep
I keep almost all of my notes and to-do lists in plain text files, so I can
edit and search them with Vim, grep, and other standard Unix tools. I often
indent lines in these files to create a simple outline structure, and use the
autoindent and foldmethod=indent options to make Vim into a simple
outliner.
To get useful output when searching through these outline-structured files, I
wrote a simple grep replacement. Given a text file with a Python-style
indentation structure, ogrep searches the file for a regular expression. It
prints matching lines, with their "parent" lines as context. For example, if
input.txt looks like this:
2009-01-01
New Year's Day!
No work today.
Visit with family.
2009-01-02
Grocery store and library.
2009-01-03
Stay home.
2009-01-04
Back to work.
Remember to set an alarm.
then ogrep work input.txt will produce the following output:
2009-01-01
New Year's Day!
No work today.
2009-01-04
Back to work...
You can download ogrep from the outline-grep repository on GitHub, or just read the literate Haskell file. The code is almost trivial (40 lines of code, plus imports and comments); I'm publishing it just in case anyone else has a use for it, and because some of my friends were curious about how I'm using Haskell. I've now written a few "real-world" Haskell programs (compleat was the first). I'm finding Haskell very well suited to such programs, though this particular one would be equally easy in a language like Perl, Python, or Ruby.
This is a one-off tool to fill a gap in my workflow; there are no configuration options or useful error messages. It would be fairly easy to extend it, though. For example, it might be handy to have an option to include children (as well as parents) of matching lines. I recently realized that ogrep often works for searching through source code too, which might generate some more unexpected use cases.
Android 2.0 ships with V8 JavaScript engine
Google has not yet released most of the Android 2.0 source code, but they did publish source for a very small number of components, including a WebKit snapshot. I was very excited to see that the snapshot includes Google's V8 virtual machine! (Previous Android releases used Safari's JavaScriptCore/"SquirrelFish Extreme" VM.) But without the rest of the source tree, there was no way to build and run this on a real Android phone. The SDK includes a binary image that runs only in the qemu-based emulator.
Today I got to try out a Motorola Droid. Here is how its browser compares to Android 1.6 on my HTC Dream (Android Dev Phone / T-Mobile G1) in the V8 Benchmark Suite (version 5):
| Test | Dream | Droid | Change |
|---|---|---|---|
| Richards | 13.5 | 15.6 | 16% |
| DeltaBlue | 5.23 | 12.9 | 147% |
| Crypto | 13.2 | 10.9 | -17% |
| RayTrace | 10.9 | 80.1 | 635% |
| EarleyBoyer | 23.5 | 74.7 | 218% |
| RegExp | did not complete | 16.5 | – |
| Splay | did not complete | did not complete | – |
Some tests (Richards, Crypto) see little or no improvement, while others (DeltaBlue, RayTrace, EarleyBoyer) are dramatically faster. Just for comparison, let's run the same benchmark on Safari 4 (JavaScriptCore) and a Chromium 4 nightly build (V8) on a Mac Pro:
| Test | Safari | Chromium | Change |
|---|---|---|---|
| Richards | 4103 | 4640 | 13% |
| DeltaBlue | 3171 | 4418 | 39% |
| Crypto | 3331 | 3643 | 9% |
| RayTrace | 3509 | 6662 | 90% |
| EarleyBoyer | 4737 | 7643 | 61% |
| RegExp | 1268 | 1187 | -6% |
| Splay | 1198 | 7290 | 509% |
The precise ratios are different, but the same tests that showed the most improvement from Android 1.6 to 2.0 also show the most improvement from Safari to Chrome. Based on this plus the source code snapshot, I'm pretty sure that Android 2.0 is indeed using V8.
This is exciting news. It makes Droid the first shipping product I know that uses V8 on an ARM processor, although V8 has included an ARM JIT compiler for some time now. For mobile web developers (like me), it means we're one step closer to having desktop-quality rich web applications on low-power handheld devices.
Final thought: Although the Motorola Droid is still 100 times slower than Chromium on a Mac Pro, it's actually faster at some benchmarks than IE8 on a low-end Windows machine, or Firefox 2 on hardware from just a few years ago.
Compleat: Programmable Completion for Everyone
Compleat is an easy, declarative way to add smart tab-completion for any command-line program. For a quick description, see the README. For more explanation and a brief tutorial, keep reading... <h2>Background</h2>
I'm one of those programmers who loves to carefully tailor my development environment. I do nearly all of my work at the shell or in a text editor, and I've spent a dozen years learning and customizing them to work more quickly and easily.
Most experienced shell users know about programmable completion, which provides smart tab-completion for for supported programs like ssh and git. (If you are not familiar it, you really should install and enable bash-completion, or the equivalent package for your chosen shell.) You can also add your own completions for programs that aren't supported—but in my experience, most users never bother.
When I worked at Amazon, everyone used Zsh (which has a very powerful but especially baroque completion system) and shared the completion scripts they wrote for our myriad internal tools. Now that I'm in a startup with few other command line die-hards, I'm on my own when it comes to extending my shell.
So I read the fine manual and started writing my own completions. Over on GitHub you can see the script I made for three commands from the Google Android SDK. It's 200 lines of shell code, fairly straightforward if you happen to be familiar with the Bash completion API. But as I cranked out more and more case statements, I felt there must be a better way... <h2>The Idea</h2>
It's not hard to describe the usage of a typical command-line program.
There's even a semi-standard format for it, used in man pages and generated by
libraries like GNU AutoOpt. Here's one for android, one of the SDK
commands supported by my script:
android [--silent | --verbose]
( list [avd|target]
| create avd ( --target <target> | --name <name> | --skin <name>
| --path <file> | --sdcard <file> | --force ) ...
| move avd (--name <avd> | --rename <new> | --path <file>) ...
| (delete|update) avd --name <avd>
| create project ( (--package|--name|--activity|--path) <val>
| --target <target> ) ...
| update project ((--name|--path) <val> | --target <target>) ...
| update adb )
My idea: What if you could teach the shell to complete a program's arguments just by writing a usage description like this one? <h2>The Solution</h2>
With Compleat, you can add completion for any command just by writing a
usage description and saving it in a configuration folder. The ten-line
description of the android command above generates the same results as my
76-line bash function, and it's so much easier to write and understand!
The syntax should be familiar to long-time Unix users. Optional arguments are enclosed in square brackets; alternate choices are separated by vertical pipes. An ellipsis following an item means it may be repeated, and parentheses group several items into one. Words in angle brackets are parameters for the user to fill in.
Let's look at some more features of the usage format. For programs with complicated arguments, it can be useful to break them down further. You can place alternate usages on their own lines separated by semicolons, like this:
android <opts> list [avd|target];
android <opts> move avd (--name <avd>|--rename <new>|--path <file>)...;
android <opts> (delete|update) avd --name <avd>;
...and so on. Rather than repeat the common options on every line, I used a
parameter <opts>. I can define that parameter using the same usage syntax.
opts = [ --silent | --verbose ];
For parameters whose values are not fixed but can be computed by another
program, we use a ! symbol followed by a shell command to generate
completions, like this:
avd = ! android list avd | grep 'Name:' | cut -f2 -d: ;
target = ! android list target | grep '^id:'| cut -f2 -d' ' ;
And any parameter without a definition will just use the shell's built-in completion rules, which suggest matching filenames by default.
The README file has more details of the usage syntax, and instructions for installing the software. Give it a try, and please send in any usage files that you want to share! (Questions, bug reports, or patches are also welcome.) <h2>Future Work</h2>
For the next release of Compleat, I would like to make installation easier by
providing better packaging and pre-compiled binaries; support zsh and other
non-bash shells; and write better documentation.
In the long term, I'm thinking about replacing the usage file interpreter with a compiler. The compiler would translate the usage file into shell code, or perhaps another language like C or Haskell. This would potentially improve performance (although speed isn't an issue right now on my development box), and would also make it easy for usage files to include logic written in the target language. <h2>Final Thoughts</h2>
Recently I realized that parts of my work are so specialized that my parents and non-programmer friends will probably never really get them. For example, Compleat is a program to generate programs to help you... run programs? Sigh. Well, maybe someone out there will appreciate it.
Compleat was my weekends/evenings/bus-rides project for the last few weeks (as you can see in the GitHub punch card), and my most fun side-project in quite a while. It's the first "real" program I've written in Haskell, though I've been experimenting with the language for a while. Now that I'm comfortable with it, I find that Haskell's particular combination of features works just right to enable quick exploratory programming, while giving a high level of confidence in the behavior of the resulting program. Compleat 1.0 is only 160 lines of Haskell, excluding comments and imports. Every module was completely rewritten at least once as I tried and compared different approaches. This is much less daunting when the code in question is only a couple dozen lines. I don't think this particular program would have been quite as easy to write—at least for me—in any of the other platforms I know (including Ruby, Python, Scheme, and C).
I had the idea for Compleat more than a year ago, but at the time I did not know how to implement it easily. I quickly realized that what I wanted to write was a specialized parser generator, and a domain-specific language to go with it. Unfortunately I never took a compiler-design class in school, and had forgotten most of what I learned in my programming languages course. So I began studying parsing algorithms and language implementation, with Compleat as my ultimate goal.
My good friend Josh and his Gazelle parser generator helped inspire me and point me toward other existing work. Compleat actually contains three parsers. The usage file parser and the input line tokenizer are built on the excellent Parsec library. The usage file is then translated into a parser that's built with my own simple set of parser combinators, which were inspired both by Parsec and by the original Monadic Parser Combinators paper by Graham Hutton and Erik Meijer. The simple evaluator for the usage DSL applies what I learned from Jonathan Tang's Write Yourself a Scheme in 48 Hours. And of course Real World Haskell was an essential resource for both the nuts and bolts and the design philosophy of Haskell.
So besides producing a tool that will be useful to me and hopefully others, I also filled in a gap in my CS education, learned some great new languages and tools, and kindled an interest in several new (to me) research areas. It has also renewed my belief in the importance of "academic" knowledge to real engineering problems. (I've already come across at least one problem in my day job that I was able to solve faster by implementing a simple parser than I would have a year ago by fumbling with regexes.) And I'll be even happier if this inspires some friends or strangers to take a closer look at Haskell, Parsec, or any problem they've thought about and didn't know enough to solve. Yet.
Colophon
<h1>Colophon</h1>This site is powered by Jekyll and based on styles and markup by Tom Preston Werner. Comments are run by Disqus and hosting is by Dreamhost.
Before starting this site, I had an Advogato diary for writing about software. I also have a personal journal (mostly interesting to my friends and family).
mbrubeck certified others as follows:
Others have certified mbrubeck as follows:
[ Certification disabled because you're not logged in. ]
FOAF updates: Trust rankings are now exported, making the data available to other users and websites. An external FOAF URI has been added, allowing users to link to an additional FOAF file.
Keep up with the latest Advogato features by reading the Advogato status blog.
If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!