Older blog entries for kr (starting at number 21)

Introducing Doozer

We need a consistent, highly-available data store.1

There is no shortage of tools that provide one of these properties or the other. For example, mysql, postgres, and redis provide consistency but not high availability; riak, couchdb, and redis cluster (when it’s ready) provide high availability but not consistency. The only freely-available tool that attempts to provide both of these is zookeeper, but zookeeper is specialized, focused on locking and server management. We need a general-purpose tool that provides the same guarantees in a clean, well-designed package.

That’s why I’m excited to present doozer, a consistent, highly-available data store.

Background

Soon after I started working at Heroku, Blake Mizerany got me involved in designing a “distributed init” system, something that could manage processes across multiple machine instances and recover gracefully from single-instance failures and network partitions. One necessary building block for this is a network service that can provide synchronization for its clients – in other words, locking.

When we started building this distributed process monitor, we quickly realized that the lock service should be packaged as a separate tool. In true Unix style, it strives to do one thing well. In fact, doozer doesn’t actually provide locks directly. Its “one thing” is consistent, highly-available storage; locks are a separate thing that clients can implement on top of doozer’s primitives. This makes doozer both simpler and more powerful, as we’ll see in a moment.

Locks Are Not Primitive Enough

Other similar systems (for example Chubby, Zookeeper2) provide locks as a primitive operation, alongside data storage. But that requires the lock service to decide when the holder of a lock has died, in order to release the lock. In practice this means that every client that wants to obtain a lock must establish a session and send periodic heartbeat messages to the server. This works well enough for some clients, but it’s not always the most appropriate way to determine liveness. It’s particularly troublesome when dealing with off-the-shelf software that wasn’t designed to send heartbeat messages.

Doozer takes a different approach. It provides only data storage and a single, fundamental synchronization primitive, compare-and-set. This operation is complete (you can build any other synchronization operation using it), but it’s simpler than a higher-level lock, and it doesn’t require the server to have any notion of liveness of its clients.
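
To make that concrete, here is a minimal sketch of a lock built on nothing but compare-and-set. The Store interface, paths, and helper functions are hypothetical stand-ins (this is not the doozer client API); the point is that once you have CAS over consistent storage, a lock is just a key that is empty when free and holds the owner’s name when taken.

    // A toy demonstration: a lock is a key that is "" when free and holds
    // the owner's name when taken. The Store interface below is a
    // hypothetical stand-in for any consistent store with compare-and-set;
    // it is NOT the doozer client API.
    package main

    import (
        "errors"
        "fmt"
        "sync"
    )

    // Store is a hypothetical consistent store exposing compare-and-set.
    // CompareAndSet writes val to key only if key currently holds old,
    // and reports whether the write happened.
    type Store interface {
        CompareAndSet(key, old, val string) (bool, error)
    }

    // TryLock attempts to take the lock named key for owner. Note that the
    // server never needs to know whether owner is still alive.
    func TryLock(s Store, key, owner string) (bool, error) {
        return s.CompareAndSet(key, "", owner)
    }

    // Unlock releases the lock, but only if owner still holds it.
    func Unlock(s Store, key, owner string) error {
        ok, err := s.CompareAndSet(key, owner, "")
        if err != nil {
            return err
        }
        if !ok {
            return errors.New("not the lock holder")
        }
        return nil
    }

    // memStore is an in-memory Store, just so this sketch runs on its own.
    type memStore struct {
        mu sync.Mutex
        m  map[string]string
    }

    func (s *memStore) CompareAndSet(key, old, val string) (bool, error) {
        s.mu.Lock()
        defer s.mu.Unlock()
        if s.m[key] != old {
            return false, nil
        }
        s.m[key] = val
        return true, nil
    }

    func main() {
        s := &memStore{m: make(map[string]string)}
        ok, _ := TryLock(s, "/locks/migrate", "worker-1")
        fmt.Println("worker-1 acquired:", ok) // true
        ok, _ = TryLock(s, "/locks/migrate", "worker-2")
        fmt.Println("worker-2 acquired:", ok) // false; worker-1 holds it
        _ = Unlock(s, "/locks/migrate", "worker-1")
        ok, _ = TryLock(s, "/locks/migrate", "worker-2")
        fmt.Println("worker-2 acquired:", ok) // true
    }

Nothing here asks the server whether worker-1 is still alive; deciding when a lock should expire or be stolen is the client’s policy, which is exactly the flexibility described above.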

In the future, doozer might ship with companion tools that provide higher-level synchronization for users that want it, but these tools will operate as regular doozer clients, and they’ll be completely optional. If you don’t need their services, just don’t run them; if you need something slightly different from what they provide, you are free to make your own.

What is it good for?

How does doozer compare to other data stores out there? Redis is amazingly fast, with lots of interesting data structures to play with. HBase is amazingly scalable. Doozer isn’t particularly fast or scalable; its claim to fame is high availability.

Doozer is where you put the family jewels.

The doozer readme has a few concrete examples, but consider this one: imagine you have three redis servers – one master and two slaves. If the master goes down, wouldn’t it be nice to promote one of the slaves to be the new master? Imagine trying to automate that promotion. You’ll have to get all the clients to agree which one to use. It doesn’t sound easy, and it’s even harder than it sounds. If there is a network partition, some of the clients might disagree about which slave to promote, and you’d wind up with the classic “split-brain” problem. If you want this promotion to work reliably, you have to use a tool like doozer, which guarantees consistency, to coordinate the fail-over.
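
As a sketch of how compare-and-set settles that argument (again with a hypothetical CAS interface and a made-up /redis/master path, not the real doozer client), every client reads one well-known key to find the master, and promotion is a single conditional write that at most one contender can win:

    // A sketch of the promotion protocol. CAS is again a hypothetical
    // stand-in for doozer's compare-and-set; the /redis/master path and
    // host:port values are made up for illustration.
    package failover

    // CAS is a consistent store: Set(key, old, val) succeeds only if key
    // currently holds old.
    type CAS interface {
        Set(key, old, val string) (bool, error)
        Get(key string) (string, error)
    }

    // CurrentMaster is what every client consults before connecting.
    func CurrentMaster(s CAS) (string, error) {
        return s.Get("/redis/master")
    }

    // Promote records candidate as the new master, but only if the store
    // still lists deadMaster. At most one contender can win this write;
    // everyone else gets ok == false and simply re-reads CurrentMaster.
    // Clients cut off from a majority of the store can't complete the
    // write at all, so there is no split brain.
    func Promote(s CAS, deadMaster, candidate string) (ok bool, err error) {
        return s.Set("/redis/master", deadMaster, candidate)
    }

Whichever client’s Promote succeeds becomes the single source of truth; every other client’s attempt fails cleanly, and it simply re-reads /redis/master.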

Using Doozer

We have client drivers for Go (doozer) and Ruby (fraggle), with an Erlang driver in the works. The doozer protocol documentation gives the nitty-gritty of talking to doozer, but most of you will be interested in the interface provided by the driver you’re using; they each have documentation.


  1. The words “consistency” and “high availability” have several reasonable definitions, and people too often use them without saying exactly what they mean. When I speak of consistency, I mean absolute prevention of inconsistent writes. The phrase “eventual consistency” is a synonym for “inconsistency”. When I speak of high availability, I mean primarily the ability to face a network partition and continue providing write service to all clients transitively connected to a majority of servers. Secondarily, I mean the ability to continue providing read-only service to all clients connected to any server. Update: This is not the same as being “available” in the sense of the frequently-cited and almost-as-frequently-misunderstood CAP theorem. Coda Hale has a good discussion of the CAP theorem’s applicability to real-world systems. In those terms, we choose consistency over availability.

  2. Updated: Zookeeper treats locks similarly to doozer.

Syndicated 2011-04-22 04:42:20 from Keith Rarick

Joining Heroku

I’m happy to report that I’ll be joining the team at Heroku, starting tomorrow.

I’m thrilled to be able to work on such exciting technology alongside people this smart, talented, and accomplished, inside a small, fast-moving company. This is set to be an amazing learning experience. In many ways, this is my dream job.

This will also be the first time I work for a company where open source is so well-understood and well-integrated in the company culture. Heroku breathes open-source software, both inward and outward. I’m excited at what this means for my ability to contribute to beanstalkd and other projects in the future.

Syndicated 2010-07-18 07:00:00 from Keith Rarick

About Distributed Social Networking

So, Diaspora, DiSo, Appleseed, and a bunch of others. Despite the incredible attractiveness of this solution to the problem of Facebook/Twitter centralization, I find it hard to get excited about any of these projects. They are all doing it wrong, but not exactly for that reason.

I suspect this is both easier and harder than they think:

  • Harder – design. People. Product. Design matters. At all levels. Details matter. At all levels.
  • Easier – technology. Web pages, feed, pub-sub. Done.
  • Harder – improvements, updates, protocol changes, specs, consensus.
  • Easier – avoid specs and consensus; dictate that shit. Just leave room for add-ons.
  • Harder – end-to-end model is broken. Network 10 considered harmful. Plus, end nodes (aka laptops) are not always connected.

If you have opinions on this, join one of these projects or, better, start your own.

Yeah, there are a bunch already out there, but they mostly suck, so you have a good chance of beating them all if yours can be excellent.

Syndicated 2010-06-27 07:00:00 from Keith Rarick

How to Handle Job Failures

There’s a discussion on the beanstalkd mailing list right now about queue introspection and handling failures. My response got a little long, and it could be interesting to users of other queueing systems as well, so here’s a blog post instead.

When we first started using beanstalkd at Causes, some things in our worker development and deployment process took a while to iron out, but our strategy for handling job failures worked quite well right from the start. In hindsight, I’m happy about it. This is what we did.

The Basic Rule

Never clean up jobs by hand. If a failure happens once, it can happen again. Always write code to handle newly-discovered failure types automatically, then run the new code to do the cleanup.

Procedure

Before you begin, note that your workers will be numerous, possibly even more so than your web front-ends. I assume you have good logging infrastructure and analysis tools for your web front-ends. Use the same infrastructure for the workers, too. It will make your life easier to see all failures and performance data in one place.

  1. Start by having your workers bury any failed jobs.

  2. See what sorts of failures happen in production (by using the high-quality logging that you have to do anyway).

  3. You will see some failures where the job can simply be deleted, others where it’s better to retry the job, and possibly some rare cases where you want to save the job to be inspected by a human (though this sort of hand-holding does not scale and should be avoided). It might also make sense to retry some jobs only a limited number of times before deleting them.

  4. Add unit tests and update the code to deal with these known failure types appropriately (i.e. delete or retry the job), but continue to bury unanticipated failures. For retries, don’t bother with changing the priority, but do add a time delay with exponential backoff, as in the sketch following this list. Of course, you must also fix the business logic to recover from these failures or avoid them entirely whenever possible.

  5. Redeploy your application.

  6. When the new code is in production, kick all buried jobs. They will be handled correctly, and you won’t lose any jobs.

  7. Now look at your worker logs again. This process will have removed a lot of noise from your production logs, and new failure types will float to the surface (though the total volume will of course be much smaller). So repeat.
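
Putting those steps together, here is a minimal sketch of such a worker loop in Go. The Queue interface, Job type, error classification, and retry limit are all hypothetical stand-ins rather than a real beanstalkd client; what matters is the delete / release-with-backoff / bury decision.

    // The Queue interface and Job type below are hypothetical stand-ins
    // mirroring beanstalkd's reserve/delete/release/bury operations; they
    // are not a real client library.
    package worker

    import (
        "errors"
        "time"
    )

    // Job is a reserved job: an id, an opaque body, and a retry count that
    // the handler (or job body) is assumed to track.
    type Job struct {
        ID      uint64
        Body    []byte
        Retries int
    }

    // Queue mirrors the operations used in the procedure above.
    type Queue interface {
        Reserve() (Job, error)                        // block until a job is ready
        Delete(id uint64) error                       // done with this job for good
        Release(id uint64, delay time.Duration) error // put it back, to retry later
        Bury(id uint64) error                         // set it aside for new code to kick
    }

    // Sentinel errors a handler returns to classify known failures.
    var (
        ErrDiscard   = errors.New("known failure: delete the job")
        ErrTransient = errors.New("known failure: retry with backoff")
    )

    const maxRetries = 5

    // Work is the loop from steps 1, 3, and 4: delete successes and
    // known-worthless jobs, release transient failures with exponential
    // backoff up to a limit, and bury anything unanticipated.
    func Work(q Queue, handle func(Job) error) error {
        for {
            job, err := q.Reserve()
            if err != nil {
                return err
            }
            switch err := handle(job); {
            case err == nil, errors.Is(err, ErrDiscard):
                q.Delete(job.ID)
            case errors.Is(err, ErrTransient) && job.Retries < maxRetries:
                q.Release(job.ID, time.Second<<uint(job.Retries))
            case errors.Is(err, ErrTransient):
                q.Delete(job.ID) // retried enough; give up
            default:
                q.Bury(job.ID) // step 1: unanticipated failure
            }
        }
    }

Logging every failure before classifying it (step 2) is what feeds the next iteration of this loop’s rules.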

After a couple of iterations, true failures will be very rare indeed. Your system will be running smoothly and it won’t need much attention.

Syndicated 2010-05-02 07:00:00 from Keith Rarick

The Closed iPad is a Moral Problem

At issue here is control. Apple wants to control what you can and can’t do with your computer. (To my knowledge no one has claimed this is false. Speculate all you like on Apple’s motivation for wanting this control; that’s beside the point.) I happen to find this morally objectionable.

Cory Doctorow and others have astutely noticed that people don’t respond much to arguments based on morality, so they framed their complaints differently, emphasizing practical effects. That was a smart strategy, because it let them be more persuasive, but make no mistake, this is a moral issue.

Unfortunately, some have failed to see past the surface of these arguments, causing them to write a bunch of increasingly irrelevant rebuttals.

Ultimately, I think both sides of this “debate” are falling victim to a massive confirmation bias. If you read a statement like this:

What makes products great is their innovation, their creativity, other ineffable qualities. Not the applicability of the first-sale doctrine.

You may just nod in agreement, or you may say, “hold on there, bucko, that’s a hefty assertion, but an assertion is not an argument (or even evidence).” Same goes for something like this:

Buying an iPad for your kids isn’t a means of jump-starting the realization that the world is yours to take apart and reassemble; it’s a way of telling your offspring that even changing the batteries is something you have to leave to the professionals.

A hundred little implicit (dis)agreements get strung together when you read one of these essays, and determine whether you find it convincing or repulsive.

The confirmation bias is especially strong here because everyone dances around the real issue without saying it outright: the closed nature of the iPad is morally wrong. As with any moral issue, it isn’t something you can argue for or against effectively without a groundwork of shared values. Either you recognize this issue or not. Either you consider it important or not.

Folks, of course the iPad will sell lots of units, because, in spite of its moral bankruptcy, it appeals to mass-market consumerism, and because it is backed by Apple’s powerful marketing machine. This may or may not qualify as “success”, depending on your point of view.

Untouchable Design

Why are Apple fans so worked up about this device, really? Because of its revolutionary design?

The bet is roughly that the future of computing:

  • has a UI model based on direct manipulation of data objects
  • completely hides the filesystem from the user
  • favors ease of use and reduction of complexity over absolute flexibility
  • favors benefit to the end-user rather than the developer or other vendors
  • lives atop built-to-specific-purpose native applications and universally available web apps

Thing is, that describes the litl spot-on. I think excitement about the iPad is much less about its design, and much more about the simple fact of Apple’s market position. If these radical design principles were really so important, folks would have been just as excited about the litl’s launch way way back in November.

This especially undermines all those put-up-or-shut-up arguments about how nobody else competes with Apple’s design and that’s why the iPad is great despite its closed nature. I have yet to see a single thoughtful comment claiming that the iPad is good while the litl simultaneously is not. If someone manages to do this, not through speculation, but having actually used the litl (and even if we may disagree on the details or the conclusion), then great. Until then, you can’t credibly claim that no-one but Apple produces good design.

Syndicated 2010-04-06 07:00:00 from Keith Rarick

Don’t Copy the Call Stack

Some runtimes claim to provide first-class continuations, but implement this by copying the entire call stack. This implementation strategy makes continuations totally unusable in production code, and it should be outlawed. Or maybe such runtimes should be required to call them “shitty continuations” instead of just “continuations”.

Syndicated 2010-02-06 08:00:00 from Keith Rarick

What to Look For in a Programming Language

Occasionally, I get asked what I look for in a programming language, what makes a good language, or what I would do to improve an existing language. Mainstream programming languages (which are almost always applicative with call-by-value evaluation) vary surprisingly little in their abilities, but there are a few significant differences. Aside from all the usual, obvious aspects that don’t need to be repeated, I look for two big things:

  • Powerful flow-control semantics including tail recursion and continuations. For example, scheme.
  • Asynchronous API, especially with a built-in event loop. For example, node.js.

These features are even more useful when combined, yet I’ve never seen a language with both. (So, to my knowledge, sodium will be the first!)

Syndicated 2010-01-14 08:00:00 from Keith Rarick
