Older blog entries for apenwarr (starting at number 619)

py-monotime, CLOCK_MONOTONIC, and time.monotonic()

An increasingly common problem nowadays is caused by the fact that the time() or gettimeofday() system calls do not always return monotonically increasing values. That is, sometimes they go backwards, and sometimes they take giant leaps.

In Unix, thankfully, timezones don't affect this: you don't have to worry about, say, daylight savings time messing up your time calculations, as long as you use time_t or struct timeval instead of struct tm for any long-lived storage. (Example of not doing it right: if your log file includes the current local time of each message, then every autumn when daylight savings causes a one-hour backwards jump, you will have log messages that appear out of order in the file, and if you sort them by date, it'll be a lie.)

However, there are other things that do make a mess, because Unix time was not entirely well thought out when it was invented. The most commonly known problem is NTP, which jumps your clock around sometimes. People mostly try to ignore this by just assuming that after boot, NTP won't *jump* the time, it'll only *slew* the time (which it tries very hard to do), and this mostly works, so people get away with this assumption 99.9% of the time, and their program goes slightly bananas the other 0.1% of the time, after which they typically reboot and are happy.

There's another annoyingly niggly one though, which is virtually impossible to work around: leap seconds. As implemented in Unix, a leap second literally causes the same time_t to occur *twice*. That turns out to be most definitely the wrong answer; the right answer would have been to include leap seconds in the localtime() calculation, not in the kernel's implementation of time. But sadly, mistakes were made, and we have what we have. There are many stories in computer lore of programs that work great except at the exact moment of a leap second, at which time all sorts of things (notably multimedia timing algorithms) go completely wrong. Once again, though, rebooting fixes it.

Anyway, it turns out there is already a solution for all these problems, and it's been out there for a long time, but not well standardized. The solution is called a "monotonic clock." In a monotonic clock, absolute times aren't very meaningful (they are usually "seconds since the system booted" or something like that) but *relative* times, which are almost always what you care about, always do what you want. So if you say, "give me an event 60 seconds from now" then you schedule it for time monotonic() + 60, and life is simple and good, and you don't need any crazy hackery to make sure you deal correctly with backwards-flowing time.

If you want access to this lovely thing on a system compliant with POSIX.1-2001, such as Linux, what you want is the clock_gettime(CLOCK_MONOTONIC) system call. Unfortunately, non-Linux systems seem to largely not support it. On MacOS, you can use this advice from Apple instead.

What's worse, if you're writing in python, there is no access to monotonic time even on systems that *do* support it. Well, there is, starting in python 3.3 apparently (which adds a time.monotonic() function), but nobody uses python 3, so that doesn't really help most of us. For the rest of the world, I just made a new python module called monotime. If you import it, then time.monotonic() suddenly appears, and you can write your program to use it, just like if it were supported internally in your copy of python. It's licensed under the Apache 2.0 license. You can get it through pypi and I also added Debian packaging scripts.

I should also mention python-monotonic-time, which I didn't use for two reasons. First, it's under the GPLv3 (not the LGPL), which would infect any program that uses it. Secondly, it's written using python's ctypes, which is great for hackery, but is very brittle (it depends tightly on system-dependent library names and struct formats, without actually including the system header files during build) and much slower than a C module, which is what my monotime module uses. Programs that need monotonic time typically need a *lot* of monotonic time, and you want it to be fast.

For more time-related trivia than you can possibly imagine, check out python's PEP 418, which introduced the time.monotonic() function and a few other things.

By the way, getting approval to release this code through Google's super-streamlined open source releasing process took only 47 (monotonic) minutes and almost no CPU time.

Syndicated 2012-08-10 22:27:53 from apenwarr - Business is Programming

port.py, portsh, and py-remoteexec

Although I haven't been posting a lot here in the last few months, and my various open source projects have been a little quiet (sorry), I am certainly not idle. I've been working on lots of cool stuff, and some of it I think you'll actually get to see eventually.

In the meantime, here's some work that's based on some of my already-released work. Back in September 2010, I wrote about traffic shaping on a Sheevaplug and introduced my little toy script for talking to serial ports, port.py. It's like minicom, but it doesn't try to re-emulate vt100 (yikes!) and you can understand the source code. Since then I've improved it a bit, adding lockfile support and transmit rate limiting, and making it a bit more modular. Oh, and it supports sending BREAK signals now. I used minicom for years, but well, now I don't!

By rearranging those modules, a couple weeks ago I wrote portsh, a program for automatically running arbitrary commands on a remote machine via a serial port. Let's say you've got a Sheevaplug, or one of the many random embedded devices available nowadays, and you want to run some automated tests (for example) on the device. Part of the test is to reconfigure the network interfaces, and if you do that, an ssh session (for example) would be disconnected, so the most reliable way to control the test is via the serial port. Sadly, using a serial port has a lot of problems that don't matter for interactive use, but which matter a lot when trying to automate stuff:

  • It's slow (usually 115kbps or less).
  • It doesn't separate stdout and stderr.
  • It isn't 8-bit binary clean by default (for example, \r is translated automatically to \n in the default tty "cooked" mode).
  • When you do set it to a binary-clean tty mode, you get into trouble if your script ever aborts halfway and you want to restart it (ctrl-c is disabled!)
  • You can only have one session at a time.
portsh is designed to solve most of those problems. (Currently it doesn't really try to solve the "one session at a time" problem, but it's one step toward a possible solution.) How does it work?

Well, the idea is to start a program on the remote end of the connection that runs a non-binary protocol capable of carrying binary data. For these purposes, we use base64 encoding, which is a bit wasteful but well-defined. To offset some of the wasted bytes, we also gzip all the data before sending it through, so for some kinds of data (copies of log files, long 'ps' or file lists) the net result is faster than a raw serial port.

The portsh command line is intended to work like ssh with a command line:

   portsh ttyUSB0 ps ax

...which gives a clue about why the separation of stdout and stderr is so important. Programs like bup and sshuttle rely on that separation in order to work.

Now, as it happens, so far I haven't tried to run bup or sshuttle over a portsh connection (but I think it would work). I did, however, send binary data successfully:

   tar -cf - . | portsh ttyS0 'cd /tmp && mkdir -p foo && cd foo && tar -xf -'

(Yes, the command is just passed verbatim to system(), and thus you can't avoid the shell mangling its escape sequences. That's pretty bad for security/predictability - especially with filenames containing spaces - but I wanted to be really compatible with ssh, and sadly that's what ssh does.)

I've also successfully used David Anderson's py-remoteexec (that's Mercurial; see my py-remoteexec clone on github) through portsh to upload and run arbitrary python files on the remote end, which is where things can really get interesting.

Historical trivia: py-remoteexec is actually based on my upload yourself for fun and profit code from sshuttle, but cleaned up and generalized so it's easy to use in your own projects. Then I stole back the py-remoteexec 1st and 2nd stage assemblers, modifying them for non-binary-clean serial port behaviour, and that's what became portsh. So if you run py-remoteexec over portsh, you're actually running *two* levels of python remote script assembly, both of which are derived from sshuttle.

Confused yet? Don't fret. Exactly how it works isn't that important, unless you're amused by such things, in which case you're best to just view the portsh.py source. It's pretty readable, if you're crazy. But if you, like most people, don't care how it works, all you need to know is you clone my repository, and run portsh, and it gives you something like ssh-over-serial-port semantics. And because it uploads itself, all you need on the remote machine is python - you don't need to install any other tools.

Enjoy!

Syndicated 2012-07-06 23:24:27 from apenwarr - Business is Programming

9 May 2012 (updated 9 May 2012 at 19:02 UTC) »

TCP doesn't suck, and all the proposed bufferbloat fixes are identical

Background: a not very clear article in ACM Queue led to a post by Bram Cohen claiming TCP sucks.

The first article is long and seems technically correct, although in my opinion it over-emphasizes unnecessary details and under-emphasizes some really key points. The second article then proceeds to misunderstand many of those key points and draw invalid conclusions, while attempting to argue in favour of a solution (uTP) that is actually a good idea. So I'm writing this, I suppose, to refute the second article in order to better support its thesis. That makes sense, right? No? Well, I'm doing it anyway.

First of all, the main problem we're talking about here, "bufferbloat," is actually two problems that we'd better separate. To oversimplify only a little, problem #1 is oversized upstream queues in your cable modem or DSL router. Problem #2 is oversized queues everywhere else on the Internet.

The truth is, for almost everyone reading this, you don't care even a little bit about problem #2. It isn't what makes your Internet slow. If you're running an Internet backbone, maybe you care because you can save money, in which case, go hire a consultant to figure out how to fine tune your overpriced core routers. Jim Gettys and others are on a crusade to get more ISPs to do this, which I applaud, but that doesn't change the fact that it's irrelevant to me and you because it isn't causing our actual problem. (Van Jacobson points this out a couple of times in the course of the first article, but one gets the impression nobody is listening. I guess "the Internet might collapse" makes a more exciting article.)

What I want to concentrate on is problem #1, which actually affects you and which you have some control over. The second article, although it doesn't say so, is also focused on that. The reason we care about that problem is that it's the one that makes your Internet slow when you're uploading stuff. For example, when you're running (non-uTP) BitTorrent.

This is where I have to eviscerate the second article (which happens to be by the original BitTorrent guy) a little. I'll start by agreeing with his main point: uTP, used by modern BitTorrent implementations, really is a very good, very pragmatic, very functional, already-works-right-now way to work around those oversized buffers in your DSL/cable modem. If all your uploads use uTP, it doesn't matter how oversized the buffers are in your modem, because they won't fill up, and life will be shiny.

The problem is, uTP is a point solution that only solves one problem, namely creating a low-priority uplink suitable for bulk, non-time-sensitive uploads that intentionally give way to higher priority stuff. If I'm videoconferencing, I sure do want my BitTorrent to scale itself back, even down to zero, in favour of my video and audio. If I'm waiting for my browser to upload a file attachment to Gmail, I want that to win too, because I'm waiting for it synchronously before I can get any more work done. In fact, if me and my next-door neighbour are sharing part of the same Internet link, I want my BitTorrent to scale itself back even to help out his Gmail upload, in the hope that he'll do the same for me (automatically of course) when the time comes. uTP does all that. But for exactly that reason, it's no good for my Gmail upload or my ssh sessions or my random web browsing. If I used uTP for all those things, then they'd all have the same priority as BitTorrent, which would defeat the purpose.

That gives us a clue to the problem in Cohen's article: he's really disregarding how different protocols interoperate on the Internet. (He discounts this as "But game theory!" as if using sarcasm quotes would make game theory stop predicting inconvenient truths.) uTP was *designed* to interact well with TCP. It was also designed for a world with oversized buffers. TCP, of course, also interacts well with TCP, but it never considered bufferbloat, which didn't exist at the time. Our bufferbloat problems - at least, the thing that turns bufferbloat from an observation into a problem - come down to just that: they couldn't design for it, because it didn't exist.

Oddly enough, fixing TCP to work around bufferbloat is pretty easy. The solution is "latency-based TCP congestion control," the most famous implementation of which is TCP Vegas. Sadly, when you run it or one of its even better successors, you soon find out that old-style TCP always wins, just like it always wins over uTP, and for exactly the same reason. That means, essentially, that if anyone on the Internet is sharing bandwidth with you (they are), and they're running traditional-style TCP (virtually everyone is), then TCP Vegas and its friends make you a sucker with low speeds. Nobody wants to be a sucker. (This is the game theory part.) So you don't want to run latency-based TCP unless everyone else does first.

If you're Bram Cohen, you decide this state of affairs "sucks" and try to single-handedly convince everyone on the Internet to simultaneously upgrade their TCP stack (or replace it with uTP; same undeniable improvement, same difficulty). If you co-invented the Internet, you probably gave up on that idea in the 1970's or so, and are thinking a little more pragmatically. That's where RED (and its punny successors like BLUE) come in.

Now RED, as originally described, is supposed to run on the actual routers with the actual queues. As long as you know the uplink bandwidth (which your modem does know, outside annoyingly variable things like wireless), you can fairly easily tune the RED algorithm to an appropriate goal queue length and off you go.

By the way, a common misconception about RED, one which VJ briefly tried to dispel in the first article ("mark or drop it") but which is again misconstrued in Cohen's article, is that if you use traditional TCP plus RED queuing, you will still necessarily have packet loss. Not true. The clever thing about RED is you start managing your queue before it's full, which means you don't have to drop packets at all - you can just add a note before forwarding that says, "If I weren't being so nice to you right now, I would have dropped this," which tells the associated TCP session to slow down, just like a dropped packet would have, without the inconvenience of actually dropping the packet. This technique is called ECN (explicit congestion notification), and it's incidentally disabled on most computers right now because of a tiny minority of servers/routers that still explode when you try to use it. That sucks, for sure, but it's not because of TCP, it's because of poorly-written software. That software will be replaced eventually. I assure you, fixing ECN is a subset of replacing the TCP implementation for every host on the Internet, so I know which one will happen sooner.

(By the way, complaints about packet dropping are almost always a red herring. The whole internet depends on packet dropping, and it always has, and it works fine. The only time it's a problem is with super-low-speed interactive connections like ssh, where the wrong pattern of dropped packets can cause ugly multi-second delays even on otherwise low-latency links. ECN solves that, but most people don't use ssh, so they don't care, so ECN ends up being a low priority. If you're using ssh on a lossy link, though, try enabling ECN.)

The other interesting thing about RED, somewhat glossed over in the first article, is VJ's apology for mis-identifying the best way to tune it. ("...it turns out there's nothing that can be learned from the average queue size.") His new recommendation is to "look at the time you've had a queue above threshold," where the threshold is defined as the long-term observed minimum delay. That sounds a little complicated, but let me paraphrase: if the delay goes up, you want to shrink the queue. Obviously.

To shrink the queue, you "mark or drop" packets using RED (or some improved variant).

When you mark or drop packets, TCP slows down, reducing the queue size.

In other words, you just implemented latency-based TCP. Or uTP, which is really just the same thing again, at the application layer.

There's a subtle difference though. With this kind of latency-self-tuning RED, you can implement it at the bottleneck and it turns all TCP into latency-sensitive TCP. You no longer depend on everyone on the Internet upgrading at once; they can all keep using traditional TCP, but if they're going through a bottleneck with this modern form of RED, that bottleneck will magically keep its latencies low and sharing fair.

Phew. Okay, in summary:

  • If you can convince everybody on the internet to upgrade, use latency-sensitive TCP. (Bram Cohen)
  • Else If you can control your router firmware, use RED or BLUE. (Jim Gettys and Van Jacobson)
  • Else If you can control your app, use uTP for bulk uploads. (Bram Cohen)
  • Else You have irreconcilable bufferbloat.
All of the above are the same solution, implemented at different levels. Doing any of them solves your problem. Doing more than one is perfectly fine. Feel happy that multiple Internet superheroes are solving the problem from multiple angles.

Or, tl;dr:

  • Yes. If you use BitTorrent, enable uTP and mostly you'll be fine.
Update 2012/05/09: Paddy Ganti sent me a link to Controlling Queue Delay, May 6, 2012, a much more detailed and interesting ACM Queue article talking about a new CoDel algorithm as an alternative to RED. It's by Kathleen Nichols and Van Jacobson and uses a target queue latency of 5ms on variable-speed links. Seems like pretty exciting stuff.

Syndicated 2012-05-09 03:49:06 (Updated 2012-05-09 19:02:00) from apenwarr - Business is Programming

A profitable, growing, useful, legal, well-loved... failure

Since before graduating from university and up until taking my current job (which is its own story I'll tell some other time), I've initiated several things that could be called startups. That is, we incorporated companies, we had a small number of people that got paid wages, we collected Canada SR&ED tax credits. Every one of these startups turned a profit. More than one had outside financing. One of them we sold to IBM.

I'm telling you this not to show off, but as a setup for the rest of this story. What I want to explain is that I fail strangely. Or at least, it feels like I do. Maybe it's not so strange; maybe you should just go read Paul Graham's How Not to Die article, where he advises us that "Startups rarely die in mid keystroke. So keep typing!"

Because that's really the moral of this story; or maybe it isn't. Maybe this story is about how that advice hasn't actually worked for me, because inside each of those successes is a story of failure. It's interesting, because for any of the companies I've started, by leaving out some details I can honestly make them sound like resounding successes or resounding messes. If I include all the details, then, well they're just confusing. So you'll usually hear just one side or the other, depending what point I'm trying to make.

Today I'll tell you both sides though, for just one of those companies. I'm not going to name the company here but it's still alive, it's still making money, my co-founder is still working his butt off to keep it from falling over. Given the details I'm about to share, it's trivially easy to find the company name with a little Googling, and I encourage you to do so. I just don't want to name it here because I really don't want this article to be the first one that comes up when you Google it. (This diary has way more Google Juice than the company does... though the company has a much more profitable sales funnel.)

So anyway, here's what happened. We started the company back in 2008. We wanted to do something in the world of databases, because we figured databases were ripe for disruption, what with SQL being SO VERY SUCKY in so many ways. We wanted to create a new variant of SQL based on the analogy that (our new thing) is to SQL as C is to assembly language. That is, C is little more than a portable assembly language. So we need a portable version of SQL. (If you've used more than one SQL variant, you know the analogy is apt.) Oh, and maybe we'll throw in functions and variable assignment and loop control structures while we're there. Yeah, I know, crazy. But if you've written stored procedures in MS SQL, those are the things you know you need.

Why did we want the C of database query languages, instead of something modern, like the python of database query languages? We thought this was the clever part of the analogy: it's because people already *tried* the high-level query languages. They're called ORMs (object relational mappings), and sure enough, they're just like high-level languages were in 1975: slow, bloated, wasteful, unreliable, non-portable, and nobody can agree which one is best. C changed all that. Sure, there were non-portable features in C (there still are), but dammit, + was just always +, and for loops were for loops, and the world made one big step forward. People still use C today. High level languages are much better now, but they're almost all still built on top of C. How much better could the world be if we could do that for SQL?

Anyway, that seemed really hard, and we were just two guys who wanted to get a minimal product launched in, say, 4 months. So we decided to trim down the idea. What's the minimal idea that will get us in that direction, but with a product in 4 months? Well, first of all, to invent C you don't need multiple assembly language variants; you just need one to start with. Let's pick one. Why not the simplest one we can find? A bit of searching around revealed the obvious candidate: Microsoft Access. It's even dumber than MySQL.

Okay then, what will we build on top of Access? Well, we want to make a portable, slightly-higher-level query language. What will be its initial use case? Forgetting about other databases for now, what do Access developers need most? ... Ah, to publish their data on the web, of course. Access totally sucks for web development. (Even now it does. They keep claiming to have finally added web support; Access 2002 had web support. But it's nearly useless every single time. Still is.)

So we would write code to let you easily query Access tables using web tools, like AJAX or json or whatever. Excellent, that justifies writing our query parser, but it doesn't have to be feature-complete on day 1. We can add more database engine plugins later. We can get a few customers, launch, and iterate. Perfect!

Just one little problem. You have to actually get that data to the web server. The reason Access sucks for web apps is Access databases are a single .mdb file on your desktop machine. Multi-user access means multiple clients accessing the .mdb file using a samba file share. (People do this with dozens of users at a time. It works.) But how do you get the data onto the web?

Well, the .mdb file format is undocumented. Reverse-engineering it will take forever. So we'll write a plugin for Access, that reads through your data, exports it to text, and uploads it to our server. That turned out to be a fair bit of work, of course, but whatever, I do love replicating data, and we figured the ability to replicate SQL databases could be a big deal, so it's certainly not a waste of time. (Trivia: Access has also supposedly had database replication features since, I think, Access 2000. Too bad it doesn't work ON THE INTERNET.)

Once we were well under way writing the replication system, we thought about it some more and realized that the minimal product for our 4-month launch target didn't have to include a query language at all; just replicating the databases was surely enough to please some user somewhere, as long as it would sync in two directions. Ta da, Internet-enabled Access replication! So we stopped after writing only the barest minimum query parser. (To this day you can still export your tables and search them using json queries; it's pretty cool, but we haven't done any more work on the query engine.)

We got the basic Access web replication engine working (which was a huge amount of work, don't get me wrong, and the code is singularly awesome, but I'm going to skip over it here). We gave it a convincing-sounding version number with the word BETA in it, put it up on a web site I designed with my super lame web design skills, and waited for the world to beat a path to our door.

Okay, you know how this goes, right? You can't just do that. Nobody will come.

Well, this time you're wrong. People came. We had stumbled into a huge unsolved problem and unaddressed market. There are lots, and lots, and lots, and lots of legacy Access databases in places you don't even want to think about. If you find our web site and go to Testimonials and scroll to the bottom, you'll see what I mean. The actual CIO of a huge pharmaceutical company called us out of the blue and asked us to solve their problem because they have thousands of Access databases they want to share across their tens of thousands of seats.

But I'm jumping ahead of myself. Not all those people called us on day 1. On day 1, our website sucked, because it was talking about Access Replication.

And what the bloody hell is replication? Most Access users with Access problems didn't have a clue. They certainly weren't searching for it.

That didn't stop some of them from finding us and calling anyway. See, we also had a couple of pages talking about our query engine, and they contained phrases like "Access on the Web." Turned out a lot of people were searching for that. They still are. Microsoft caught on with Access 2010 and marketed the heck out of that search phrase, so if you search for it now, you'll find them and not us. Which is funny, because Access 2010 is still basically useless for the web. But it shows what marketing dollars can do.

Now, I'm badmouthing Access 2010 a lot here, but here's how I know it's useless: because people keep on clicking, and searching, and I don't even know what keywords they search on anymore, and they find us. They use Access 2010. They're not dumb, they're real programmers, they know what features Access 2010 has. Even if they were dumb, God knows Microsoft has marketed them to death. And these people still want to pay us to put Access on the web.

Anyway, I've gotten ahead of myself again. The important part of the story is, we had a web site all about Access replication, and nobody had any clue what we were talking about, but they called and emailed and the message was clear: We want Access on the web. How much money can we pay you to provide it?

Um, well, look, the on-the-web part is kind of sucky and...

...and the customer is always right. So, back to the drawing board. One day, a customer called me and explained his very specific and immediate problem. He had just billed a customer many thousands of dollars over many months to build a custom Access application. Right at the end, the customer said they were happy. Now... he should just publish it on the web and they'll be done.

Oh. Crap. The guy was really in trouble. Serious trouble. They hadn't specified the requirement up front; he was an Access-only developer, so he couldn't rewrite it. Even if he knew how, it would be months more work. (People complain about Access, but it's still, in my opinion, the absolute fastest way in the universe to make powerful database-driven apps. Way faster than Ruby on Rails, and you don't even have to be able to code. I mean it. But... not on the web.)

So he had a serious problem, and let me tell you, our 5%-finished json query language was not going to solve it. Neither was "replication." But that day on the phone, we came up with an idea.

What if we could run Access on our servers and display it over VNC in a web browser? What if we ran Access under Wine on Linux so we could squeeze more instances onto a single box? What if changes to the database in these VNC sessions could be replicated back down to your desktop copy of Access using our plugin?

What if, indeed. Turns out there's a cool program called Flashlight-VNC that's an implementation of VNC in flash, which runs in virtually any web browser (this was before there was an iPad or Apple dropped Flash out of Safari). Turns out recent versions of Wine can actually run some versions of Access. Turns out... well, let's just say it worked. And that, my friends, is the product we have today, more or less. Sure, since then we've added performance optimizations, reliability improvements. We store the database contents in git and use a custom merge algorithm for resolving changes made while in disconnected mode. (It's neat; git can store the whole revision history in less space than the original .mdb.) But fundamentally, that's the product.

And people want it. No, I take that back; the product is a magnificent heap upon heaps of insane hackery. I mean, we are running Access in Wine in X11 on Linux in an isolated user account on our server slice that revision controls your Access database in git, and we're displaying it using VNC in your web browser in flash. People can't possibly want that. But they need it. Which is better.

That's the other neat thing. They need it, because nobody else has ever created something like this. I don't think anybody ever will. I mean, how many people know Linux, Flash, C++ (for the plugin), python (for the server), and Microsoft Access, of all things, and are willing to combine them all with a healthy knowledge of streaming network protocols and database replication? And even if you could find a whacko like that, would that person be willing to enter the market, starting from scratch, knowing someone else got there first?

Every month, we have more revenue. And our costs are tiny, so that means more profit.

Customers need this so badly that they're willing to pay a lot for it. Like $35/user/month/database, for the basic plan. In case you're counting, in a year, that's much more than a copy of Access. And just to be safe, because we want to avoid lawyers, we tell customers to make sure all their users already have an Access license on their desktop (in addition to the legally required ones we have for our servers). This isn't so bad; turns out big companies - the kind with lots of Access databases - pretty much all buy Microsoft Office Professional for everybody anyway, so they all have Access. So no, in case you were wondering, our business model is not about cheating on Access licensing. If anything, people are buying more licenses than they strictly need, and I don't feel like getting on Microsoft's bad side, and neither do they, so everybody wins.

No, it's not about cheating. It's just about providing something people want and are willing to pay for. What do they want? They want to not rewrite legacy apps. Please, please, let us just keep running the app we spent the last 10 years building, but let us run it outside our office, because we all have laptops now.

How much money will people pay to keep their app going? About as much as the cost of rewriting it in a web language. More, even, since it lowers their risk. You do the math. As a bonus, it's a small monthly expense, not a big capital expenditure.

And yes, every month, our profit is more than the last one.

...

But all that was the good news.

I've already given you a hint about the bad news. Remember when I asked what whacko, with all those skills, would want to do this? I now know one of the answers, and it's OH GOD NOT ME. Eventually I realized that there is no windfall big enough to rationalize spending 3-5 years of my life, working full time, writing compatibility layers for Microsoft Access. Where, in the ideal world, if we were successful, my days would involve on-site visits to huge bureaucratic companies of the sort that... well, let's be honest. The sort that would run mission critical Access databases.

Really, on a rational level, I know that's unfair. I know these are good people. I think Access developers are great, actually. I love the fact that they know a good thing when they see it. Access *is* the easiest, most rapid of rapid development environments I've ever seen. I think almost all database developers have terrible taste, because they can use Access and compare it to, say, MS SQL, and not see what makes Access great and MS SQL suck, even while they know perfectly well the development in MS SQL + C# or Java will take something like 10x as many man-hours. For some apps, it's worth it for the higher quality; for a random internal business process app, it's not, but people spend it anyway because they "heard Access isn't industrial strength."

So don't get me wrong. I like Access users. Access developers, in particular, are the anti-IT department, the rebels, the people who aren't willing to wait for the sysadmins to provision them a server, and they don't have to, because they can just share an Access file on the fileserver. IT departments hate them, which is how I know they're on to something. These are the kind of people I want to help. This is the sort of thing that's the reason I do the work that I do. No kidding.

But, Lord, no, don't make me actually code Access plugins. Don't make me work with Windows anymore. Just don't.

God. It's so lame when I write it down. Actually, it's been lame for months, every time I even think it. I can't believe I have that kind of lack of follow-through. I don't want to think that about myself. It's a travesty. A terrible embarrassment. Something that makes me question my self-worth. If I can't take something that's so obviously working, and milk it for all it's worth, then what kind of human am I, anyway? I think I suck at capitalism. Maybe that's it.

You know the truth? I don't know. I just don't know. I am a completely irrational human being, and I hate it, but deep inside me there's a voice that just says, "No. Get the hell out. If you continue doing this, you will die."

So I got the hell out. I "stopped typing," as Paul Graham might say. Nowadays I have a pretty great "real job" where I can spend all night hacking the Linux kernel, programming embedded systems, and working on highly parallel build systems. And even though the potential upside is much less, I like it. For now, at least. I'm happy.

And that's my failure. Every day, my co-founder keeps working away, keeping the systems running with as little effort as he can spare. He's got a day job now, for various reasons; among them, he's an extravert, he needs co-workers. I still own half the shares, but I told him to keep the operating profits; the least I could offer, literally, I guess. That huge pharma deal is still in the pipeline and needs another callback, but there's nobody willing to do it. We don't optimize the web site for Google anymore; we haven't updated the news page since 2010; even I can't find our site in Google using any generic keywords. But I guess I'm not looking hard enough, because new customers still find it, sign up, and subscribe. Virtually nobody ever cancels once they've started. There is no competition. Nothing to switch to. There never will be. Where would they go if they stopped?

I know I've let my co-founder down. If the company would just die - if it would only be so simple, and nobody would want the product, or the users got angry at us and quit, or it were impossible to run it at a profit and we finally ran out of cash - then stopping would be easy. But no. They love it instead. They need it. There's an opportunity cost in continuing, but there's a sentimental cost in shutting it down - to say nothing of the users who have no other options.

In short, I learned that I don't have what it takes. Someone probably does, now that the actual insane part has already been invented, but I don't know who.

What would you do?

Syndicated 2012-03-24 22:53:34 from apenwarr - Business is Programming

29 Feb 2012 (updated 23 Mar 2012 at 10:07 UTC) »

Another Bold iPad Prediction

Update 2012/03/09: Darn, I guessed wrong. Oh well, a 50% hit rate for a bool is pretty good, right?

...

After I was declared a "semiconductor industry veteran" based on my previous article predicting when the iPad will get a retina display, I suppose I have to try my luck again.

Today there was a press release from Apple that essentially announces a new iPad, but we don't yet know what its killer feature will be. Retina display, you think?

I don't think so. A key part of my analysis last time was that right up to the release of the iPhone 4, there had been a series of Android phones with ever-increasing pixel density around the iPhone's screen size. The iPhone 4's screen density was not at all revolutionary: it exactly coincided with the density trends based on Moore's Law. Apple was not *ahead* of technology; they were right on time. Or, the day before the iPhone 4 was released, they were actually a year or two late, because all the other phones had better screens. They were waiting until they could do a 2x density increase.

Samsung just announced new Galaxy Note and Galaxy Tab tablets at 10.1 inches, but they're only 142 dpi or so, which is "normal res," not "retina." It's slightly higher res than an iPad, but not much. That's what state of the art looks like, and a "retina" density 10 inch display isn't reasonable yet.

Since the 2x resolution increase on the iPhone was so incredibly gobsmackingly successful, it's unlikely Apple will make an increase in pixels on the iPad unless it's 2x.

However, if you predict using Moore's Law since the date of the iPhone 4 release, it looks like you *could* perhaps double the 1024x768 iPad resolution to 2048x1536... you just couldn't do it yet on a 10 inch screen. Let's do the math: the iPhone 4 came out in June 2010, about 20 months ago, which gives us a Moore's Law improvement of 2**(20/18) = 2.16. The iPhone 4 screen is 960 pixels long, and 2.16 * 960 = 2073. Startlingly close to 2048. The only catch is it would be more like 6 or 7 inches, not 10.

I'm not sure Apple would be willing to go with a "smaller, lighter, thinner, more compact" iPad - the current one is maybe just the right size - but I don't know for sure. A 7 inch tablet would be a great book reader, I suppose. So I predict it's either that, or the screen resolution doesn't change.

It's no fun unless I say it out loud. Let the games begin!

(By the way, this is personal opinion, not that of my employer. I work on projects that are not smartphone or tablet or Android development, and I have no insider information.)

(And yes, this is the same prediction I made last time, because math doesn't change. I'm just reiterating it :))

(Also, this analysis omits the possibility of using a lower-density process than the iPhone 4 uses, thus producing the same pixels at a bigger size, maybe 10 inches again. But I don't think this is going to happen; you can see people pushing for bigger and bigger high-density phones, but I haven't heard about any high-density tablets yet. Apple has been strictly a follower here; they don't invent entirely new display fabrication technologies.)

Syndicated 2012-02-29 00:00:35 (Updated 2012-03-23 10:07:39) from apenwarr - Business is Programming

6 Feb 2012 (updated 6 Feb 2012 at 03:06 UTC) »

What is the definition of success?

A product manager asked me a great question at work the other day:

    "How do we define success for our project?"

He said he had asked someone else that question, and the slightly scary answer was, "I don't know." I said that's crazy, the answer is easy. Success is getting our product working and into the hands of real users who actually want it. Right?

But a few days later, I've thought about it some more, and I don't like that definition after all. Here's my new version:

    Success is making a product better than anybody else's product, that users absolutely love, and that's able to sustain itself indefinitely.

That definition is a real challenge, something that you'd be proud to live up to. It's something that deserves to be called "success." Unfortunately, using that definition, I've never been successful (unless you count wvdial).

You might be dismayed by the first criteria. A product better than anybody else's product? It seems obvious that most products can't be the best, but I think there's more room at the top than you think. There's no single ranking system from 1 to 10. Every user judges differently, so if you satisfy one person better than any other product, then you win, in some small way. And if you can be successful once, you can expand that success a bit at a time.

Users loving your product is kind of cliche, since Apple has made that philosophy famous. But as obvious as it now is, most developers still don't aim for it. If they did, you'd know.

By the way, your product can be the best in the world without users actually loving it. That's why I listed it explicitly. Before Apple, few people loved their phones or laptops or music players, but surely one of them was still the best - the best of a bad lot. Lovability is orthogonal to usefulness.

The last point - sustaining itself indefinitely - is where I've had trouble. At NITI, we made ridiculously awesome products that were quite obviously the best out there, if I do say so myself. And our users loved the heck out of them. But in the end it didn't sustain. (IBM bought the company and eventually killed the product.) The lack of sustainability was less about the product and more about organizational issues, but that's no excuse; it just tells you that the organizational issues are part of the product, whether you like it or not.

Sustainability can be about money (if you make enough money, you can afford to pay people to maintain it indefinitely) or quality (if, like wvdial, it's so good that you don't need to maintain it) or love (if developers are willing to work for lower or zero wages). But however you aim to achieve sustainability, it matters. You can have the greatest, loveliest product in the world, but if it dies, then you've failed.

That definition of success is my new benchmark for everything I do. If I can't see a path to success, based on those three criteria, then I shouldn't be wasting my time. If you notice me wasting my time anyway, please club me over the head as a reminder.

(Also, the project should be something I actually enjoy working on. I've made that mistake too. I don't think enjoyment is necessarily part of the definition of a successful project, but a "happy life" is also a nice-to-have :))

Syndicated 2012-02-06 01:57:36 (Updated 2012-02-06 03:06:01) from apenwarr - Business is Programming

Stuff I said at Kansas City StartupWeekend that sounded smart

I rarely get the chance to try out words of wisdom on real people before I present them to you here. So when I post something, it might turn out to be a dud, or pure gold, and I never know which. Not this time! This time you get pure, unadulterated, gold-coloured brilliance.

1. People miss the point of the "minimum viable product" (MVP... no, the *other* MVP) for startups. It does *not* mean, "release the first version with less features and then add more features later." No, we want a *minimum* viable product. The absolutely smallest set of features needed in order to get useful market information. How many features is that? Usually... zero. An MVP can be just a slide presentation, a sales pitch, a web site, a Google ad, or a customer conversation. The best MVPs let you objectively measure customer response *fast* and then tweak. One quick way to start is to make a web site that *claims* to offer the product you'd eventually want to build, and then gives a signup form, and then (oops!) crashes when people try to buy it (or sign up). Then make some web ads to send people there based on certain keywords. No, *not* a page that says "Coming Soon!" and asks for an email address. You want a real, live, signup page for what looks like a real, live product. You can add the "it works" feature later. In the meantime, since your MVP is so cheap and fast to build, you can try lots of different ones, add and remove advertised features, and see how that changes user responses. Once you have some input like that, you can make something slightly less minimal. Doing an MVP this way requires incredible self-control. Most people fail.

2. Speaking of terminology, "pivot" is misused too. People seem to think pivot is a happy-sounding word for "give up and do something different." But it's not. It has a very specific meaning based on very specific imagery. If you're running down the street, you have momentum. If you then plant one foot hard in the ground in front of you and turn, you can actually redirect that momentum in a new direction. *That* is what we mean by "pivot." When you give up and start over, you lose your position and all your momentum. But when you pivot, you keep all the stuff that's working, and you keep going from where you were before, but in a new direction. You have the same team, the same money, the same corporation, the same already-built features, and (hopefully) the same users as you did before. You use what you've already have in order to head somewhere new. Most importantly, the energy lost during a pivot is proportional to the angle of your pivot. If you only rotate by a little, you only waste a bit of your momentum. If you turn around 180 degrees, then your progress so far is actually an impediment - like when you've gone way into debt working on one idea, then start to pursue a totally different one. Pivoting is the art of choosing small rotations that let you maintain most of your speed and take advantage of your current position, while still admitting you've been running in the wrong direction.

3. No startup ever actually does what they thought they would do on day 1. Everybody pivots. "Except [company x]," said one person, "They're doing exactly what they planned." "Are they profitable?" I asked. "No." "Oh, then they just haven't pivoted *yet*."

4. The definition of a market niche. This is one of the most important lessons I learned from reading "Crossing the Chasm." It has a somewhat complicated definition of a niche, but since then I've had a lot of luck just taking the gist, roughly: If you can name a conference attended by a particular group of people, that group is a market niche. If there isn't such a conference, it's almost certainly not a niche. For example, let's say you were making a web site to help people find a lawyer. "People looking for lawyers" is a market segment, right? Wrong. There's no "I'm looking for a lawyer" conference. Lawyers are probably a market segment (although arguably, not *all* types of lawyers go to the same conferences). But *everybody* needs a lawyer eventually, and that's not a niche, that's everybody. "Startups who need lawyers" (lots of startups need lawyers and go to the same conferences, eg. StartupWeekend) are a market segment, as are building contractors and organized crime lords. Maybe you can help *them* find lawyers.

5. Your competition is whatever customers would do if you didn't exist. Let's say you're making software for producing cool graphs of statistical data. There's already really powerful software that does this, but nobody in your market segment uses it for some reason; maybe it's too hard to use or too expensive. That software is your competitor, right? Wrong! That software is irrelevant. Your customers don't want it, so even if it's competing with you, it's already lost. Your customers are probably using either Microsoft Excel's horrible chart features, or giving up and just not making charts at all. So your competitors are Microsoft and apathy, respectively. Apathy is probably going to be the tougher one. To find your list of competitors, just ask yourself what options your customers think they're choosing between. Ignore everything else.

Bonus: When presenting at a StartupWeekend-type conference... remember that the judges see a lot of businesses, and they're expecting you to have a business plan (or at least an idea of your target market and where you'll get revenue from). However, like I said in #3 above, no startup ever actually does what they originally set out to do. The judges all know that too. So your business plan is kind of a farce, and they know it, but if you don't have one, you look unprepared. So I suggest this: have a "grand scheme" and an "ideal first customer." Present them both, and where revenue comes from in both cases. Admit outright that your grand scheme will probably turn out to be wrong, and your real first customer might not be exactly like your ideal one. Basically, prove that you care about business, but you know you have to be flexible, and you're not scared of it. For a team two days into a new startup, that's all anyone can hope for.

Syndicated 2011-11-08 10:03:37 from apenwarr - Business is Programming

Avery @ StartupWeekend Kansas City, Nov 12-13

I'm planning to hang out next weekend at StartupWeekend Kansas City. I won't be starting any startups this time around, but if you're a faithful reader of my diary that hasn't unsubscribed out of boredom and you live in Kansas, let me know and maybe we can say hello while I'm in town.

Kansas City, by the way, is the site of the first major installation of Google Fiber, a project that I've been occasionally contributing to in my copious spare time.

My wife hastens to point out that it is not, however, the setting of The Wizard of Oz. Who knew Kansas City was in Missouri?

Syndicated 2011-11-04 13:49:46 from apenwarr - Business is Programming

2 Nov 2011 (updated 5 Nov 2011 at 01:01 UTC) »

bup.options

optspec = """
bup save [-tc] [-n name] <filenames...>
--
r,remote=  hostname:/path/to/repo of remote repository
t,tree     output a tree id
c,commit   output a commit id
n,name=    name of backup set to update (if any)
d,date=    date for the commit (seconds since the epoch)
v,verbose  increase log output (can be used more than once)
q,quiet    don't show progress meter
smaller=   only back up files smaller than n bytes
bwlimit=   maximum bytes/sec to transmit to server
f,indexfile=  the name of the index file (normally BUP_DIR/bupindex)
strip      strips the path to every filename given
strip-path= path-prefix to be stripped when saving
graft=     a graft point *old_path*=*new_path* (can be used more than once)
"""
o = options.Options(optspec)
(opt, flags, extra) = o.parse(sys.argv[1:])

I'm proud of many of the design decisions in bup, but so far the one with the most widespread reusability has been the standalone command-line argument parsing module, options.py (aka bup.options). The above looks like a typical program --help usage message, right? Sure. But it's not just that: it's also the code that tells the options.py how to parse your command line!

As with most of the best things I've done lately, this was not my idea. I blatantly stole the optspec format from git's little known "git rev-parse --parseopt" feature. The reimplementation in python is my own doing and includes some extra bits like [default] values in square brackets and the "--no-" prefix for disabling stuff, plus it wordwraps the help output to fit your screen. And it all fits in 233 lines of code.

I really love the idea of an input file that's machine-readable, but really looks like what a human expects to see. There's just something elegant about it. And it's *much* more elegant than what you see with most option parsing libraries, where you have to make a separate function call or data structure by hand to represent each and every option. Tons of extra punctuation, tons of boilerplate, every time you want to write a new quick command-line tool. Yuck.

options.py (and the git code it's blatantly stolen from) is designed for people who are tired of boilerplate. It parses your argv and gives you three things: opt, a magic (I'll get to that) dictionary of options; flags, a sequence of (flag,parameter) tuples; and extra, a list of non-flag parameters.

So let's say I used the optspec that started this post, and gave it a command line like "-tcn foo -vv --smaller=10000 hello --bwlimit 10k". flags would contain a list like -t, -c, -n foo, -v, -v, --smaller 10000, --bwlimit 10k. extra would contain just ["hello"]. And opt would be a dictionary that can be accessed like opt.tree (1 because -t was given), opt.commit (1 because -c was given), opt.verbose (2 because -v was given twice), opt.name ('foo' because '-n foo' was given and the 'name' option in optspec ends in an =, which means it takes a parameter), and so on.

The "magic" of the opt dictionary relates to synonyms: for example, the same option might have both short and long forms, or even multiple long forms, or a --no-whatever form. opt contains them all. If you say --no-whatever, it sets opt.no_whatever to 1 and opt.whatever to None. If you have an optspec like "w,whatever,thingy" and specify --thingy --whatever, then opt.w, opt.whatever, and opt.thingy are all 2 (because the synonyms occurred twice). Because python is great, 2 means true, so there's no reason to *not* just make all flags counters.

If you write the optspec to have an option called "no-hacky", then that means the default is opt.hacky==1, and opt.no_hacky==None. If the user specifies --no-hacky, then opt.no_hacky==1 and opt.hacky==None. Seems needlessly confusing? I don't think so: I think it actually reduces confusion. The reason is it helps you write your conditions without having double negatives. "hacky" is a positive term; an option --hacky isn't confusing, you would expect it to make your program hacky. But if the default should be hacky - and let's face it, that's often true - then you want to let the user turn it off. You could have an option --perfectly-sane that's turned off by default, but that's a bit unnatural and overstates it a bit. So we write the option as --no-hacky, which is perfectly clear to users, but write the *program* to look at opt.hacky, which keeps your code straightforward and away from double negatives, while letting you use the word that naturally describes what you're doing. And all this is implicit. It's obvious to a human what --no-hacky means, and obvious to a programmer what opt.hacky means, and that's all that matters.

What about --verbose (-v) versus --quiet (-q)? No problem! "-vvv -qq" means opt.verbose==3 and opt.quiet==2. The total verbosity is just always "(opt.verbose or 0) - (opt.quiet or 0)". (If an option isn't specified, it's "None" rather than 0, so you can tell the difference with options that take arguments. That's why we need the "or 0" trick to convert None to 0.)

Sometimes you want to provide the same option more than once and not just have it override or count previous instances. For example, if you want to have --include and --exclude options, you might want each --include to extend, rather than overwrite, the previous one. That's where the flags list comes from; it contains all the stuff in opt, but it stays in sequence, so you can do your own tricks. And you can keep using opt for all the options that don't need this special behaviour, resorting to the flags array only where needed. See a flag you don't recognize? Just ignore it, it's in opt anyway.

Options that *don't* show up in the optspec will give a KeyError when you try to look them up in opt, whether they're set or not. So given the --no-hacky option above, if you tried to look for opt.hackyy (typo!) it would crash when you try checking for the option, not just silently always return False or something.

Oh yeah, and *of course* options.py handles clusters of short options (-abcd means -a -b -c -d), equals or space (--name=this is the same as --name this), doubledash to end option parsing (-x -- -y doesn't parse the -y as an option), and smooshing of arguments into short options (-xynfoo means -x -y -n foo, if -n takes an argument and -x and -y don't).

Best of all, though, it just makes your programs more beautiful. It's carefully designed to not rely on any other source files. Please steal it for your own programs with the joy of copy-and-paste (leaving the copyright notice please) and make the world a better place!

Update 2011/11/04: The license has been updated from LGPL (like the rest of bup) to 2-clause BSD (options.py only), in order to ease copy-and-pasting into any project you want. Thanks to the people who suggested this.

Syndicated 2011-11-02 03:10:37 (Updated 2011-11-05 01:01:59) from apenwarr - Business is Programming

Vortex Update: 7 months later

Previously I wrote about my upcoming trip through the Vortex. It's no longer upcoming and I'm still on the other side, so I'm severely biased and you can't trust anything I say about it. But I thought I'd give a quick status update on my stated goals from last time:

  • Work on customer-facing real technology products. Success. Not released yet, but you'll see.
  • Help solve some serious internet-wide problems, like traffic shaping, etc. Yes, a bit, on my 20% project for now. You'll see that too. :) Must try harder.
  • Keep coding, maybe manage a small team. Yes, with more of the latter and less of the former, but the ratio is under my control.
  • Keep working on my open source projects. Sort of. Spreading myself so thin with cool projects that these are suffering, but that's nobody's fault but mine.
  • Eat a *lot* of free food. Yes, though in fact I've lost weight, giving lie to the so called "Google 20."
  • Avoid the traps of long release cycles and ignoring customer feedback. Total fail. The mechanics of this (totally under my control, but with lots of pressure to do it "wrong") are kind of interesting and I might be able to post about it later. For now let's just admit that I wanted to say my team had a product out by now (at least in public invite-only testing), and we don't, and that's mostly my fault.
  • Avoid switching my entire life to Google products. Partial success, I have an Apple TV instead. But they gave me a free Android phone that (as far as I can tell) has actual garbage collection freezes *while I'm typing on the virtual keyboard*. So... fail.
  • Produce more valuable software, including revenue, inside Google than I would have by starting a(nother) startup. Won't know until we release something.
Conclusion: mixed results. But the good news is that where things aren't as good as I'd like, the root cause can be traced to me. Does that sound like a bad thing? No! It's pretty much the ideal case when it comes to motivating me to learn fast and produce more. I'm working on the right stuff in the right ways, and the environment is well configured for me to do some amazing work. There is effectively no management interference (or input) at all. I just need to correct some of my own methodological flaws, especially trimming and prioritizing what I work on.

More later.

Syndicated 2011-10-16 07:12:00 from apenwarr - Business is Programming

610 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!