Older blog entries for connolly (starting at number 108)

8 Jul 2012 (updated 12 Jul 2012 at 05:10 UTC) »

Nokia Lumia 710 with Windows Phone 7 is an eye-opener

At first, the conversation was about a new iPod touch vs. a mobile phone. The battery in my youngest son's aging iPod touch lasts about half an hour now, and his birthday is coming up. He was leaning toward a new iPod touch, but the reason we get mobile phones for our boys is so that mom's taxi service can reach them, and the lack of a phone for him has caused some issues in that area recently.

So we took him to the T-Mobile store to shop for a phone. We're OK to pay a few more dollars a month to add him to our family plan (especially since we can drop the land line) but we're not getting him a data plan.

We had just about decided on some Samsung talk-and-text model when the clerk said "I can show you a touchscreen phone that doesn't require a data plan." It was the Nokia Lumia 710 with Windows Phone 7. The price? Just $50* (with the usual 2 year contract).

I have spent a couple decades avoiding the influence of Microsoft in my life, and especially in the Web, but Microsoft is motivated to be more open and interoperable in the mobile space, since they don't dominate it.  Plus, a good friend of mine gushes about his new Windows Mobile phone, a complete turn-around compared to his endless gripes and frustrations with his original Windows Mobile phone. So I was open to it. But even $50 is $40 more than the other phone, so I asked my son if he was sufficiently interested to contribute a certain chunk of the price. Yes, he said, without hesitation, and we went for it.

This thing has all the "wow! it can do that too?!" of my Samsung Vibrant with Android 2.2 and none of the "oops... hey! what? grrr..." surprise and frustration and waiting. The back button is as quick as it used to be on the sidekick/hiptop. I don't know why Android can't cache web pages worth a lick; didn't Andy Rubin design both platforms?

The one bit of frustration is by design: until his birthday actually arrives, windowsmobile.com won't let him install any apps.

So migrating his contacts was a bit of an adventure. We ended up using python-idevicesync on my linux box to get them in a vCard file for uploading into his google account (I set up google apps for domains for our family a few years back). Then the phone knew how to get the contacts from there.

I didn't discover the shortest path to loading music right away. It has a micro-USB connector, but doesn't act like a flash drive. Evidently it speaks Media Transfer Protocol (MTP). The up-side is that it doesn't need to re-scan the entire flash filesystem every time you connect it to a computer (or turn it on). MTP is supported by rhythmbox and lots of other open source music managers, but evidently not quite the dialect of MTP that Windows Phone 7 uses. When I tried to drag a bunch of tracks over, Rhythmbox would copy one track and then stop. And it wouldn't set the artist/album/track metadata right. Evidently it was silently discarding an error (grrr!). gMTP did better: it would report an error after each track, but when I acknowledged the error dialog, it would continue to the next track. It still didn't get the metadata right.

This exercise prompted me to resume the quest of cleaning up my music archive, including convincing Ubuntu to share files with Mac OS X again (netatalk seems to be dying; ugh... samba config! caramba!).

dupeguru Music Edition, where have you been all my life?!

It cleaned up thousands of duplicate tracks in my filesystem and even cleaned up dead tracks in my iTunes database. (Of course, I expected iTunes Home Sharing's ability to detect tracks that I already have to extend to the case of dragging and dropping the contents of a playlist, and I was wrong, so I have another batch of dups to clean up...)

I expected to run into the same age restriction with Windows Phone 7 Connector for Mac as my son ran into with Zune on his netbook, but not so. I was able to use it to install a Windows Phone update, though it gave me a scare when it quit during the "do not disconnect" part of the update; I was mentally preparing to take the bricked phone back to the T-Mobile store when the phone rebooted and announced that the update was complete. Whew!

Syncing music worked with Windows Phone 7 Connector. It got the metadata right, but I think it excluded some songs due to DRM that were actually not DRM-encumbered.

I have had my eye on the Galaxy Nexus with Jelly Bean. $350 unlocked seemed like such a good deal, but now I wonder... do I really want to choose a phone based on my ability to tinker with it? With my Samsung Vibrant running Android 2.2, I'm constantly dreaming of ways to improve it. But that's because I'm constantly interrupted from what I was actually trying to do with the phone by some bug or performance issue.

Wasn't it Edd Dumbill who said "I don't want to sysadmin my phone"? Maybe I'd be happier with the no-user-serviceable-parts-inside product that Nokia, Microsoft, and T-Mobile are offering for hundreds less.

*EDIT: It looks like the $50 price we got at a local T-Mobile store is not widely available. Amazon wants $300 and gives a list price of $500.

Syndicated 2012-07-08 22:56:00 (Updated 2012-07-12 04:40:42) from Dan Connolly

Imagine there's no ICANN... with namecoin and cjdns

The Web and the Internet are, by design, decentralized. Notable exceptions are the allocation of DNS and IP addresses, both administered by ICANN. By and large we ignore this wart in the architecture, but this week ICANN showed up in the headlines of the newspaper outside my hotel room:

Companies anted up $185,000 per domain to apply for naming rights. ... ICANN, which has received 1,930 applicants, will have to sort out whose claims are strongest.

— Apple, Google, Microsoft, Amazon seek domains from ICANN By Scott Martin, USA TODAY

That's a cool $350M. Is that a healthy direction for the cyber-real-estate market? It doesn't smell like it, to me.

Many of us have long accepted this wart in the architecture because we didn't see any alternatives.

Recently, I've seen some alternatives:

In place of centralized administration of domain names, there is namecoin:

  • distributed/decentralized: each user has its own copy of the full database
  • secure: security (with public/private keys) is deeply integrated in the software to allow only the owner of a name to modify it in the distributed database.
  • pseudonymous: all transfers of data are public and linked to random generated addresses
  • open: anybody can use namecoin to register a name or to create its own Namespace

And, with the tip of the hat to zooko 6 Jun, in place of centralized administration of IP addresses:

Imagine an Internet where every packet is cryptographically protected from source to destination against espionage and forgery, getting an IP address is as simple as generating a cryptographic key, core routers move data without a single memory look up, and denial of service is a term read about in history books. Finally, becoming an ISP is no longer confined to the mighty telecoms, anyone can do it by running some wires or turning on a wireless device.
This is the vision of cjdns.

You may say I'm a dreamer, but I'm not the only one...

There are over 55,000 namecoin domains already.

Syndicated 2012-06-15 17:16:00 (Updated 2012-06-15 17:32:02) from Dan Connolly

Diplomacy, not technology

A quote from Gruber, via Norvig in 2009, on the sociology of developing ontologies:
In some domains, competing factions each want to promote their own ontology. In other domains, the entrenched leaders of the field oppose any ontology because it would level the playing field for their competitors. This is a problem in diplomacy, not technology. As Tom Gruber says, “Every ontology is a treaty—a social agreement—among people with some common motive in sharing.”13 When a motive for sharing is lacking, so are common ontologies.
13. "Interview of Tom Gruber," AIS SIGSEMIS Bull., vol. 1, no. 3, 2004.
Reminds me of the WWW2006 panel on tagging vs the Semantic Web, where I acknowledged that symbolic methods are no match for statistical methods when it comes to natural language processing and a lot of other tasks, but I don't want computation of my bank balance crowd-sourced.

Medicine has a long tradition of controlled vocabularies, but they originate from billing systems. I suppose they have a role in evidence-based medicine, but that role isn't entirely clear to me yet. I'm pretty new to the field.

Syndicated 2012-02-28 20:14:00 (Updated 2012-02-28 20:29:34) from Dan Connolly

18 Feb 2012 (updated 22 Feb 2012 at 18:09 UTC) »

Saying Goodbye to Moore Method math notes and Robert Miner

I'm purging another box of files today: college math & C.S. notes, including a few dozen transparencies I prepared for classes on Topology and Fractals that Dr. Starbird and Dr. Cline taught using the Moore method:

Instead of using a textbook, the students are given a list of definitions and theorems which they are to prove and present in class ...
After Moore became an associate-professor at University of Texas at Austin in 1920, the Moore method began to gain popularity. Today, the University of Texas at Austin remains a strong advocate of the method and uses it in various courses within their mathematics department ...
Metric spaces, Hausdorff spaces, Cauchy sequences, attractors... I'm sure glad for Wikipedia, because I can hardly follow my own notes; most of it has leaked out.

I thought about capturing one or two formulas for posterity, which reminded me to try Web Equation, which, amazingly, does handwriting recognition of LaTeX using JavaScript. (hat tip: @therealmaxf). I couldn't quite get it to completely recognize a 1-to-infinity sub/superscript notation, but I noticed the nicely typeset output was rendered with something I didn't recognize: MathJax. Cool! "an open source JavaScript display engine for mathematics that works in all modern browsers." So I started looking into it...

... which is when I got the sad news about Robert Miner, who was co-chair of the Math Working Group, along with Patrick Ion, for much of the time I was at W3C. I didn't work with him extensively, but the Math Working Group was always a class act.

Syndicated 2012-02-18 17:10:00 (Updated 2012-02-22 17:38:36) from Dan Connolly

13 Feb 2012 (updated 14 Feb 2012 at 03:39 UTC) »

Got my data back from Mint, thanks to GnuCash/mysql

My ideal personal accounting system would

  • support double-entry accounting, with budgeting, reports, and charts
  • have an open architecture with
    • an SQL back-end
    • a flat-file serialization of the data suitable for use with version control
  • integrate with the Web, both
    • allowing access from any machine with a web browser
    • syncing with banking web sites

After trying Mint for a year and a half, I realized that while Web integration is nice, it's no good without double-entry integrity. While GnuCash's UI isn't as nice as modern web apps, it lets me keep my data in SQL, which keeps my options open.

Before Mint, I used Quicken for decades. I stopped paying for updates after Quicken 2001 and hence lost bank syncing. But I did find a flat-file serialization suitable for use with version control (and no, QIF doesn't cut it. See my March 2006 item). And while Wine continues to support Quicken 2001 after all this time, I don't have any API to update Quicken's store. So there's no going back to Quicken after Mint.

Mint has no concept whatsoever of double-entry accounting. It will give you a balance for your bank account at the beginning of each month and a list of income and expenses in between. You might think that the old balance + income - expenses = new balance. You would be wrong.
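The check that Mint's numbers fail is easy to state in code. Here is a minimal sketch in Python; the function name and the sample figures are invented for illustration and come from neither Mint nor GnuCash:

```python
from decimal import Decimal


def reconciles(old_balance, income, expenses, new_balance):
    """Return True if old balance + income - expenses equals the new balance."""
    total_in = sum(income, Decimal("0"))
    total_out = sum(expenses, Decimal("0"))
    return old_balance + total_in - total_out == new_balance


# Hypothetical month: all figures are made up for this example.
opening = Decimal("1000.00")
income = [Decimal("2500.00")]
expenses = [Decimal("1200.50"), Decimal("89.99")]

print(reconciles(opening, income, expenses, Decimal("2209.51")))  # True
print(reconciles(opening, income, expenses, Decimal("2300.00")))  # False
```

Decimal is used rather than float so that cents compare exactly. A False result means a transaction is missing or duplicated somewhere in the month.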

Mint fails to download a few credit card transactions on occasion, so it's unreliable for auditing. It relies on the user to notice duplicate transactions, so it's unreliable for budgeting. As to the idea that Mint's categorization would save me work, Marc Hedlund put it this way in Why Wesabe Lost to Mint:
I was focused on trying to make the usability of editing data as easy and functional as it could be; Mint was focused on making it so you never had to do that at all. Their approach completely kicked our approach's ass. (To be defensive for just a moment, their data accuracy -- how well they automatically edited -- was really low, and anyone who looked deeply into their data at Mint, especially in the beginning, was shocked at how inaccurate it was. The point, though, is hardly anyone seems to have looked.)
So I had to double-check the categorization. And since they lack support for any sort of transaction reconciliation, I had to cobble together something out of their tags to keep track of which transactions I had already reviewed. And sometimes, Mint just spontaneously threw away my work and changed the categories anyway. I know this because I carefully exported all my transactions in CSV format after each significant session and reviewed the diffs before checking them in to a version control repository.

The UI for splitting transactions is incredibly tedious. And once you have split a transaction, you can no longer search for the transaction by the total.

I had to resort to a Google docs spreadsheet to make up for limitations of Mint's budgeting. You can only budget for the current month. No longer-term planning, and no retrospective changes to the budget. On November 30, you don't have all your spending info for November, since transaction data takes a few days to flow through banks and credit card systems. But on Dec 1st, Mint will no longer let you re-allocate budget funds between November and later months. As if plans were the important thing. "Plans are worthless, but planning is everything." -- Eisenhower

Mint has a notes field, but won't let you search them and trains you not to use them by deleting your work if you change any other field in the transaction.

I was willing to risk giving them my bank passwords, since I audit everything pretty carefully, but their security story is a bald-faced lie:

Mint is a "read-only" service. You can organize and analyze your finances, but you can't move funds between–or out of–any account using Mint. And neither can anyone else.
They had my bank passwords to download transaction data. They could do anything I could do at my bank web site. They promise not to, but to say they (or anyone who hacks their system) cannot move funds is just a lie.

So enough is enough.

I went into mad mode over the holiday break, first exploring a greasemonkey userscript:

@description Mint: I want my data back

I was pleased to find that SQL support in GnuCash had matured as of the Dec 2010 release of version 2.4, and the SQL structure that gnucash uses is quite straightforward: accounts, transactions, splits, etc. Using GUIDs instead of integers for primary keys is somewhat novel but works OK. Note that the GnuCash string form of a uuid has no '-' characters, so in mysql, I use replace(uuid(), '-', '').
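The same dash-free form is easy to produce outside the database. This is a sketch in Python, assuming a random (version 4) UUID is acceptable as a key, whereas mysql's uuid() produces a time-based one; gnucash_guid is a name made up here, not part of GnuCash:

```python
import uuid


def gnucash_guid():
    """Return a random UUID as 32 lowercase hex characters, with no '-'."""
    return uuid.uuid4().hex


guid = gnucash_guid()
print(guid, len(guid))
```

The .hex attribute gives the same result as stripping the dashes from str(uuid.uuid4()).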

I went back to the last comprehensive financial snapshot that I trusted, i.e. my last quarterly balance sheet from Quicken before the Mint experiment. I didn't load the decade+ of flat-file transaction data that led up to that point, but I'm confident I could if I wanted to. For now, I just created an equity account for "Quicken transition" and used it to reproduce the balance sheet.

Since I didn't trust Mint to correctly enumerate transactions, I used OFX from my financial institutions to fill in the transaction information for the past year, reconciling statements as I went. (After getting the initial balance sync'd, reconciling statements was trivial, aside from glitches in my understanding of how GnuCash's OFX import UI worked.)

Then I sync'd the categorization info from Mint with GnuCash. While much of it was a one-time bulk import, running both systems in parallel for a short time was an important goal. This would require stable transaction identifiers from Mint, something they don't provide in their CSV export. While Mint doesn't advertise an API, fortunately, it was straightforward to reverse-engineer the way their Ajax client gets transaction data: mcc.py, my Mint cloud client, is only 100 lines of python.

I put some effort into trying to reproduce Mint's .csv export using my GnuCash database, but reached a point of diminishing returns. I do maintain the mint_re_export SQL view for version control purposes. I also discovered a version-control-friendly way to back up the whole mysql database:

$ mysqldump -u $LOGNAME --skip-dump-date --tab=$BAK_DIR -p $DB_NAME

Beware: mysqldump --tab defaults timezones to UTC but mysqlimport uses local time, with no TZ choice. The workaround: set the global time zone to UTC (in mysql, SET GLOBAL time_zone = '+00:00') before running mysqlimport.

Also, if you use Ubuntu, like I do, and you don't specifically authorize mysql to write there, AppArmor will stop it and you'll get a mysterious "(Errcode: 13) when executing 'SELECT INTO OUTFILE'". You need to edit /etc/apparmor.d/local/usr.sbin.mysqld and add a line /bak/dir/** rw, (the trailing comma is part of the AppArmor syntax).

One of the real tests of the results is doing my 2011 tax return. So far, I haven't had to log back in to Mint, though I have worked around shortcomings in the GnuCash UI using hand-crafted SQL or grep on the .csv export from Mint a few times.

Highlights from the changelog include:

152:955de3fe6de7 2012-01-16 budget loads into gnucash DB
151:3945105a6728 2012-01-16 budget_sync.py groks my budget spreadsheet
145:d1854d4ef26c 2011-12-31 handle split transactions using mint parent/child info rather than guessing
144:bbd55121161f 2011-12-31 more straightforward account sync between mint and gnucash
141:48669cd9ec01 2011-12-30 oops; don't exclude the id column; that's the _whole point_!
140:76830ffc5cb2 2011-12-30 trx_explore supports mysql as well as sqlite; parses amount straightforwardly
139:42363cb4e1e0 2011-12-30 trx_explore with date handling loads thousands of mint transactions
137:dc26b4e483c6 2011-12-30 mint client fetches all transactions
135:503f9b7ef4af 2011-12-29 more matching work for credit cards
134:f9f7358da70c 2011-12-29 - incremental matching
133:2b463e2e73ff 2011-12-29 mint_re_export view is mostly working
130:023bbfc79025 2011-12-29 merge split transactions from mint into gnucash/OFX
129:f7301444308f 2011-12-29 updated OFX checking account data w.r.t. mint categorization work
127:2c0312b96f58 2011-12-28 figured out how to import mint accounts into gnucash DB
117:26fe6a26a345 2011-12-25 matching worked for 100 transactions (warnings/logging tamed)
110:1af3fef89487 2011-12-25 created SqlAlchemy object from JSON data
109:4fa21822a6c9 2011-12-24 explore gnucash sqlite file
106:80134e7a8730 2011-12-24 JSON dump of Mint transaction data
105:6d5bb42201c7 2011-12-24 got access to the transaction data
103:a4389fd1fcf6 2011-12-22 mint greasemonkey exploration (bookmarklet looks easier)

Syndicated 2012-02-13 19:08:00 (Updated 2012-02-14 03:26:41) from Dan Connolly

22 Jan 2012 (updated 13 Feb 2012 at 16:40 UTC) »

Remembering OS-9 on the CoCo

During an annual purge of old file boxes, I came across my 5 1/4 CoCo disks. Much of what I know about unix and linux actually dates back to OS-9 on the CoCo:
Even on the CoCo, a quite minimalist hardware platform, it was possible under OS-9/6809 Level One to have more than one interactive user running concurrently (for example, one on the console keyboard, another in the background, and perhaps a third interactively via a serial connection) as well as several other non-interactive processes. -- OS-9 - Wikipedia 
I wrote a shell in assembler; I ran across a hardcopy of the source a week or so ago. I wonder if the source is on these floppies. I made a copy on CD a few years back before I de-commissioned my last 5 1/4 disk drive.

Syndicated 2012-01-22 21:19:00 (Updated 2012-02-13 15:59:39) from Dan Connolly

There’s a Better Way to Build a Smart TV | The Official Roku Blog

This Roku Streaming Stick looks like a pretty good balance between the simplicity of integration and the upgradeability of componentization.
It makes me question my recent strategy of getting a really inexpensive TV (Haier L32D1120 32-Inch 720p LCD HDTV, Black on sale for $200) and streaming Blu-ray player (Panasonic DMP-BD75 Ultra-Fast Booting Blu-ray Disc Player $60). The Blu-ray player does Netflix pretty well, but the TV doesn't have the new MHL HDMI interface.

Syndicated 2012-01-21 08:34:00 (Updated 2012-01-21 08:34:59) from Dan Connolly

A big thanks for Web-iPhoto!

My wife does a photo shoot with the boys for the Christmas card each year. I wanted to share a digital copy of the photo, but our family photo archive is a mess, with N iPhoto albums on M macs and K backups on X linux boxes.

I know iPhoto is just JPG's and sqlite underneath, so it kills me that I can't just get at the photos with a web browser. I could code something up myself, but surely somebody has done it before, no? I've looked without luck before, but I guess I was using the wrong search terms. Today when I wished for "iphoto sqlite web server", lo! Merry Christmas to me!


Thank you, Dmytro Kovalov!

It works great on a huge iPhoto library backed up on this linux box.

Here's hoping I can install it on the macs in the house running various versions of OS X. I have lots of experience with python on macs, but not so much ruby. I sure hope I don't have to install Xcode.

Syndicated 2012-01-03 00:23:00 (Updated 2012-01-03 00:27:03) from Dan Connolly

Capability Security in E, coffeescript, python, dart, and scala

A couple months ago, I inherited some Java code and took on the task of fixing a bug in it. The bug turned out to be a consequence of a silent failure; eek! And there were precious few tests and no way to test the parts without being connected to LDAP servers and SQL databases and such. This started me on an exploration of current best practices in testing. And since the job of this code was policy enforcement around patient data, I could finally justify getting my hands dirty with capability-based security. I discovered, as many others have, that both testability and security are well served by some of the same basic object-oriented techniques.

Dependency injection frameworks always smelled like overkill to me, but after watching Miško Hevery on testability, I was convinced. If you're in the mood for text rather than video, see his Guide: Writing Testable Code. Basically, instead of having some policy enforcement object constructor call an LDAP connection constructor, the policy enforcement object takes the LDAP connection as a constructor argument. "Don't call us; we'll call you" is a handy mnemonic. This lets you substitute a mock LDAP connection for testing.
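To make the constructor-argument idea concrete, here is a minimal sketch in Python. Every name in it (PolicyEnforcer, MockDirectory, groups_for, the "clinician" group) is hypothetical and chosen for illustration; it is not the inherited Java code or any real LDAP API:

```python
class PolicyEnforcer:
    """Decides access using whatever directory connection it is handed."""

    def __init__(self, directory):
        # The connection is injected; this class never constructs one itself.
        self._directory = directory

    def may_view_records(self, user_id):
        return "clinician" in self._directory.groups_for(user_id)


class MockDirectory:
    """Stand-in for a real LDAP connection, for tests only."""

    def __init__(self, memberships):
        self._memberships = memberships

    def groups_for(self, user_id):
        return self._memberships.get(user_id, [])


enforcer = PolicyEnforcer(MockDirectory({"alice": ["clinician"], "bob": ["billing"]}))
print(enforcer.may_view_records("alice"))  # True
print(enforcer.may_view_records("bob"))    # False
```

Because PolicyEnforcer only ever sees the object it was given, the test needs no network, and in production the same class would receive a real connection that offers the same groups_for method.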

It also forms patterns of cooperation without vulnerability.

For example, take a look at the simple money example in E and the underlying sealer/unsealer pattern.

I have been using these as an exercise to explore some of the recent programming language developments:

The coffeescript translation seems completely natural, to me. Given the right static scope (i.e. without most of the JavaScript standard library), I think it has the same security properties as the E version. And the E idioms seemed to translate quite directly.

Python has not only the API authority issues, but also untold introspection loopholes. Plus, I had to kludge around read-only closures and no-assignment-in-lambdas; and while simulating E's method suite idiom is not too ugly, tools like pyflakes don't recognize the results.
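As an illustration only (this is not the translation discussed above), here is roughly how the sealer/unsealer idea can look in Python 3, where nonlocal removes the read-only-closure kludge. The names make_brand_pair, seal, and unseal are made up for this sketch, and because Python closures can be inspected, it shows the shape of the pattern rather than real protection:

```python
def make_brand_pair():
    """Return a (seal, unseal) pair that share one private slot."""
    empty = object()  # sentinel meaning "nothing has been shared"
    shared = empty

    def seal(contents):
        def box():
            # Calling the box drops its contents into the slot that only
            # this pair's unseal function reads.
            nonlocal shared
            shared = contents
        return box

    def unseal(box):
        nonlocal shared
        shared = empty
        box()
        if shared is empty:
            raise ValueError("box was not sealed by the matching sealer")
        contents, shared = shared, empty
        return contents

    return seal, unseal


seal, unseal = make_brand_pair()
other_seal, other_unseal = make_brand_pair()

box = seal("10 quatloos")
print(unseal(box))  # 10 quatloos
try:
    other_unseal(box)
except ValueError as err:
    print("rejected:", err)
```

A box sealed by one pair writes only to that pair's slot, so an unsealer from a different pair finds its own slot still empty and refuses.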

Dart is a big disappointment. Everywhere else I look, Google is pushing capability security. But Dart lacks nested classes, so translating E method suites results in something that is only vaguely recognizable, let alone comprehensible.

Scala works reasonably well. The Java implementation of sealing relies more on strong typing than the object graph for rights amplification; I might want to think that over some more. Also, it's a little boring to spell out the types. I might have to try it in Haskell. But on the other hand, as Brendan Eich observes:
Dynamic languages are popular in large part because programmers can keep types latent in the code, with type checking done imperfectly (yet often more quickly and expressively) in the programmers’ heads and unit tests, and therefore programmers can do more with less code writing in a dynamic language than they could using a static language.
The balance between static and dynamic languages also shows up in development tools. I had Eclipse with the Joe-E verifier, maven, and mercurial working all together at home one evening. The code really does just about write itself at that point. But when I tried to reproduce it at work, I got so frustrated that I retreated to emacs and python and looking up function arguments manually. The python version of the project has gotten complex enough that I'm starting to miss some of the whole-program consistency that Java tools give, but I'm getting by with a bottom-up approach: flymake, doctest, and the like.

Syndicated 2011-11-23 22:44:00 (Updated 2011-11-23 23:57:11) from Dan Connolly

Medical Informatics, Peer Review, and Open Access

Three issues of JAMIA just arrived, weighing not just on my desk but also on my mind: success is defined by my peers in my new field, medical informatics, as publication in a journal where the readers have to pay for access. After fifteen years as an Open Web advocate, this grates on me.

But I see that change is already underway. While JAMIA is the top journal that I hear about in the office so far, a quick trip to Wikipedia shows that it's second in impact to an open-access journal: Journal of Medical Internet Research.

Syndicated 2011-11-21 14:35:00 (Updated 2011-11-21 14:55:21) from Dan Connolly
