Older blog entries for Stevey (starting at number 766)

We're all about storing objects

Recently I've been experimenting with camlistore, which is yet another object storage system.

Camlistore gains immediate points because it is written in Go, and is a project initiated by Brad Fitzpatrick, the creator of Perlbal, memcached, and Livejournal of course.

Camlistore is designed exactly how I'd like to see an object storage-system - each server allows you to:

  • Upload a chunk of data, getting an ID in return.
  • Download a chunk of data, by ID.
  • Iterate over all available IDs.

It should be noted more is possible, there's a pretty web UI for example, but I'm simplifying. Do your own homework :)

With those primitives you can allow a client-library to upload a file once, then in the background a bunch of dumb servers can decide amongst themselves "Hey I have data with ID:33333 - Do you?". If nobody else does they can upload a second copy.

In short this kind of system allows the replication to be decoupled from the storage. The obvious risk is obvious though: if you upload a file the chunks might live on a host that dies 20 minutes later, just before the content was replicated. That risk is minimal, but valid.
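To make the decoupling concrete, here is a minimal sketch of those three primitives plus a background replication pass. It is not Camlistore's code or API: the `BlobServer` class, the `replicate` function, the in-memory dictionary and the use of SHA-256 for IDs are all assumptions made for illustration.

```python
import hashlib


class BlobServer:
    """A toy in-memory server offering only the three primitives above."""

    def __init__(self, name):
        self.name = name
        self._blobs = {}

    def put(self, data):
        # The ID is derived from the content; SHA-256 is an assumption here.
        blob_id = hashlib.sha256(data).hexdigest()
        self._blobs[blob_id] = data
        return blob_id

    def get(self, blob_id):
        return self._blobs[blob_id]

    def list_ids(self):
        return sorted(self._blobs)


def replicate(servers, copies=2):
    """Copy any blob held by fewer than `copies` servers to further servers."""
    for source in servers:
        for blob_id in source.list_ids():
            # "Hey I have this ID - do you?"
            holders = [s for s in servers if blob_id in s.list_ids()]
            for target in servers:
                if len(holders) >= copies:
                    break
                if target not in holders:
                    target.put(source.get(blob_id))
                    holders.append(target)
```

The client only ever calls `put` on one server; `replicate` can run later, which is exactly where the "host dies before replication" window comes from.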

There is also the risk that sudden rashes of uploads leave the system consuming all the internal-bandwidth constantly comparing chunk-IDs, trying to see if data is replicated that has been copied numerous times in the past, or trying to play "catch-up" if the new-content is larger than the replica-bandwidth. I guess it should be possible to detect those conditions, but they're things to be concerned about.

Anyway the biggest downside with camlistore is the lack of documentation about rebalancing, replication, or anything other than simple single-server setups. Some people have blogged about it, and I got it working between two nodes, but I didn't feel confident it was as robust as I wanted it to be.

I have a strong belief that Camlistore will become a project of joy and wonder, but it isn't quite there yet. I certainly don't want to stop watching it :)

On to the more personal .. I'm all about the object storage these days. Right now most of my objects are packed in a collection of boxes. On the 6th of next month a shipping container will come pick them up and take them to Finland.

For pretty much 20 days in a row we've been taking things to the skip, or the local charity-shops. I expect that by the time we've relocated the amount of possessions we'll maintain will be at least a fifth of our current levels.

We're working on the general rule of thumb: "If it is possible to replace an item we will not take it". That means chess-sets, mirrors, etc, will not be carried. DVDs, for example, have been slashed brutally such that we're only transferring 40 out of a starting collection of 500+.

Only personal, one-off, unique, or "significant" items will be transported. This includes things like personal photographs, family items, and similar. Clothes? Well I need to take one jacket, but more can be bought. The only place I put my foot down was books. Yes I'm a kindle-user these days, but I spent many years tracking down some rare volumes, and though it would be possible to repeat that effort I just don't want to.

I've also decided that I'm carrying my complete toolbox. Some of the tools I took with me when I left home at 18 have stayed with me for the past 20+ years. I don't need this specific crowbar, or axe, but I'm damned if I'm going to lose them now. So they stay. Object storage - some objects are more important than they should be!

Syndicated 2015-06-21 00:00:00 from Steve Kemp's Blog

I'm still moving, but ..

Previously I'd mentioned that we were moving from Edinburgh to Newcastle, such that my wife could accept a position in a training-program, and become a more specialized (medical) doctor.

Now the inevitable update: We're still moving, but we're no longer moving to Newcastle, instead we're moving to Helsinki, Finland.

Me? I care very little about where I end up. I love Edinburgh, I always have, and I never expected to leave here, but once the decision was made that we needed to be elsewhere the actual destination does/didn't matter too much to me.

Sure Newcastle is the home of Newcastle Brown Ale, and has the kind of proper-Northern accents I both love and miss, but Finland has Leipäjuusto, Saunas, and lovely people.

Given the alternative - My wife moves to Finland, and I do not - Moving to Helsinki is a no-brainer.

I'm working on the assumption that I can keep my job and work more-remotely. If that turns out not to be the case that'll be a real shame given the way the past two years have worked out.

So .. 60 days or so left in the UK. Fun.

Syndicated 2015-06-13 00:00:00 from Steve Kemp's Blog

A brief examination of tahoe-lafs

Continuing the theme from the last post I made, I've recently started working my way down the list of existing object-storage implementations.

tahoe-LAFS is a well-established project which looked like a good fit for my needs:

  • Simple API.
  • Good handling of redundancy.

Getting the system up and running, on four nodes, was very simple. Set up a single/simple "introducer" which is a well-known node that all hosts can use to find each other, and then set up four daemons for storage.

When files are uploaded they are split into chunks, and these chunks are then distributed amongst the various nodes. There are some configuration settings which determine how many chunks files are split into (10 by default), how many chunks are required to rebuild the file (3 by default) and how many copies of the chunks will be created.
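As a rough worked example of what those defaults imply, the arithmetic can be sketched as below. The `erasure_stats` helper is hypothetical, not part of tahoe-LAFS, and it ignores any per-chunk metadata overhead.

```python
import math


def erasure_stats(needed, total, file_size):
    """Rough numbers for 'any `needed` of `total` chunks rebuild the file'."""
    chunk_size = math.ceil(file_size / needed)
    return {
        "chunk_size": chunk_size,               # bytes per chunk
        "stored_bytes": chunk_size * total,     # bytes across all nodes
        "expansion": total / needed,            # storage cost multiplier
        "losses_tolerated": total - needed,     # chunks that may vanish
    }
```

With the defaults mentioned above, `erasure_stats(3, 10, 3000)` describes ten chunks of 1000 bytes each, so roughly 3.3 times the original size is stored, and any seven chunks can be lost.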

The biggest problem I have with tahoe is that there is no rebalancing support: Set up four nodes, and the space becomes full? You can add more nodes, new uploads go to the new nodes, while old ones stay on the old. Similarly if you change your replication-counts because you're suddenly more/less paranoid this doesn't affect existing uploads.

In my perfect world you'd distribute blocks around pretty optimistically, and I'd probably run more services:

  • An introducer - To allow adding/removing storage-nodes on the fly.
  • An indexer - to store the list of "uploads", meta-data, and the corresponding block-pointers.
  • The storage-nodes - to actually store the damn data.

The storage nodes would have the primitives "List all blocks", "Get block", "Put block", and using that you could ensure that each node had sent its data to at least N other nodes. This could be done in the background.

The indexer would be responsible for keeping track of which blocks live where, and which blocks are needed to reassemble upload N. There's probably more that it could do.
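A minimal sketch of that indexer's bookkeeping might look like the following. Everything here (the `Indexer` class, its method names, the in-memory dictionaries) is hypothetical; a real one would need persistent storage.

```python
from collections import defaultdict


class Indexer:
    """Tracks which blocks make up each upload, and which nodes hold them."""

    def __init__(self):
        self.uploads = {}                  # upload name -> ordered block IDs
        self.locations = defaultdict(set)  # block ID -> names of holding nodes

    def record_upload(self, name, block_ids):
        self.uploads[name] = list(block_ids)

    def record_block(self, block_id, node):
        self.locations[block_id].add(node)

    def plan_fetch(self, name):
        """Return (block ID, sorted node names) pairs in reassembly order."""
        return [(b, sorted(self.locations[b])) for b in self.uploads[name]]

    def under_replicated(self, minimum):
        """Blocks used by some upload but known on fewer than `minimum` nodes."""
        needed = {b for blocks in self.uploads.values() for b in blocks}
        return sorted(b for b in needed if len(self.locations[b]) < minimum)
```

The `under_replicated` list is what a background job could walk to decide which blocks still need sending to another node.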

Syndicated 2015-05-28 00:00:00 from Steve Kemp's Blog

On de-duplicating uploaded file-content.

This evening I've been mostly playing with removing duplicate content. I've had this idea for the past few days about object-storage, and obviously in that context if you can handle duplicate content cleanly that's a big win.

The naive implementation of object-storage involves splitting uploaded files into chunks, storing them separately, and writing database-entries such that you can reassemble the appropriate chunks when the object is retrieved.

If you store chunks on-disk, by the hash of their contents, then things are nice and simple.

The end result is that you might upload the file /etc/passwd, split that into four-byte chunks, and then hash each chunk using SHA256.

This leaves you with some database-entries, and a bunch of files on-disk:

/tmp/hashed/ef267892ee080862c96a8d2d05de62f48e20f0875f27379e7d58c73ea4455bf1
/tmp/hashed/a378977155fb42bb006496321cbe31f74cbda803c3f6ca590f30e76d1afad921
..
/tmp/hashed/3805b0245bc8375be7125ae228eef711552ac082ffb9bf8756e2964a2393a9de
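The split-hash-store step, and the matching reassembly, can be sketched roughly like this. It is Python rather than the Ruby of the toy server, the function names `store` and `reassemble` are made up, and the returned list of hashes stands in for the database-entries.

```python
import hashlib
from pathlib import Path


def store(data, root, chunk_size=4):
    """Write each chunk to root/<sha256 of chunk>; return the ordered hashes."""
    root.mkdir(parents=True, exist_ok=True)
    hashes = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        path = root / digest
        if not path.exists():  # an identical chunk is only written once
            path.write_bytes(chunk)
        hashes.append(digest)
    return hashes


def reassemble(hashes, root):
    """Rebuild the original bytes by reading the chunks back in order."""
    return b"".join((root / h).read_bytes() for h in hashes)
```

For example `store(Path("/etc/passwd").read_bytes(), Path("/tmp/hashed"))` would produce files named like the ones listed above, though the exact hashes depend on the file contents.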

In my toy-code I wrote out the data in 4-byte chunks, which is grossly inefficient. But the value of using such small pieces is that there are liable to be a lot of collisions, and that means we save-space. It is a trade-off.

So the main thing I was experimenting with was the size of the chunks. If you make them too small you lose I/O due to the overhead of writing out so many small files, but you gain because collisions are common.

The rough testing I did involved using chunks of 16, 32, 128, 255, 512, 1024, 2048, and 4096 bytes. As sizes went up the overhead shrank, but also so did the collisions.

Unless you could handle the case of users uploading a lot of files like /bin/ls, which are going to collide 100% of the time with prior uploads, using larger chunks just didn't win as much as I thought they would.
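The kind of measurement described here can be approximated with a few lines of Python. This is a sketch, not the original test: `dedup_stats` is a made-up helper, and the sample input is an invented, deliberately repetitive string rather than real uploads.

```python
import hashlib


def dedup_stats(data, chunk_size):
    """Return (total chunks, unique chunks) for one chunk size."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    unique = {hashlib.sha256(c).digest() for c in chunks}
    return len(chunks), len(unique)


if __name__ == "__main__":
    # Made-up input; real files will be far less repetitive than this.
    sample = b"root:x:0:0:root:/root:/bin/bash\n" * 64
    for size in (16, 32, 128, 255, 512, 1024, 2048, 4096):
        total, unique = dedup_stats(sample, size)
        print(f"{size:5d} bytes: {total:4d} chunks, {unique:4d} unique")
```

The gap between the two numbers is the space saved by duplicate chunks (the "collisions" above), while the total is a proxy for the per-file I/O overhead.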

I wrote a toy server using Sinatra & Ruby, which handles the splitting and hashing, and stores block-IDs in SQLite. It's not so novel given that it took only an hour or so to write.

The downside of my approach is also immediately apparent. All the data must live on a single machine - so that reassembly works in the simple fashion. That's possible, even with lots of content if you use GlusterFS, or similar, but it's probably not a great approach in general. If you have large capacity storage available locally then this might work well enough for storing backups, etc, but .. yeah.

Syndicated 2015-05-07 00:00:00 from Steve Kemp's Blog

A weekend of migrations

This weekend has been all about migrations:

Host Migrations

I've migrated several more systems to the Jessie release of Debian GNU/Linux. No major surprises, and now I'm in a good state.

I have 18 hosts, and now 16 of them are running Jessie. One of them I won't touch for a while, and the other is a KVM-host which runs about 8 guests - so I won't upgrade that for a while (because I want to schedule the shutdown of the guests for the host-reboot).

Password Migrations

I've started migrating my passwords to pass, which is a simple shell wrapper around GPG. I generated a new password-managing key, and started migrating the passwords.

I dislike that account-names are stored in plaintext, but that seems known and unlikely to be fixed.

I've "solved" the problem by dividing all my accounts into "Those that I wish to disclose post-death" (i.e. "banking", "amazon", "facebook", etc, etc), and those that are "never to be shared". The former are migrating, the latter are not.

(Yeah I'm thinking about estates at the moment, near-death things have that effect!)

Syndicated 2015-05-04 00:00:00 from Steve Kemp's Blog

Validating puppet manifests via git hooks.

It looks like I'll be spending a lot of time working with puppet over the coming weeks.

I've setup some toy deployments on virtual machines, and have converted several of my own hosts to using it, rather than my own slaughter system.

When it comes to puppet some things are good, and some things are bad, as expected, and as with any similar tool (even my own). At the moment I'm just aiming for consistency and making sure I can control all the systems - BSD, Debian GNU/Linux, Ubuntu, Microsoft Windows, etc.

Little changes are making me happy though - rather than using a local git pre-commit hook to validate puppet manifests I'm now doing that checking on the server-side via a git pre-receive hook.

Doing it on the server-side means that I can never forget to add the local hook and future-colleagues can similarly never make this mistake, and commit malformed puppetry.

It is almost a shame there isn't a decent collection of example git-hooks, for doing things like this puppet-validation. Maybe there is and I've missed it.

It only crossed my mind because I've had to write several of these recently - a hook to rebuild a static website when the repository has a new markdown file pushed to it, a hook to validate syntax when pushes are attempted, and another hook to deny updates if the C-code fails to compile.
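For the puppet case, a server-side hook might look roughly like the sketch below. This is not the hook described above, just one way to do it: the function names are made up, it assumes `puppet` is on the server's PATH, and it relies on `puppet parser validate` plus the standard pre-receive input format of "old new ref" lines on stdin.

```python
#!/usr/bin/env python3
"""Sketch of a git pre-receive hook rejecting invalid Puppet manifests."""
import os
import subprocess
import sys
import tempfile


def is_null(sha):
    """git passes an all-zero ID when a ref is being created or deleted."""
    return set(sha) == {"0"}


def parse_updates(lines):
    """Turn pre-receive stdin lines ("<old> <new> <ref>") into tuples."""
    return [tuple(line.split()) for line in lines if line.strip()]


def changed_manifests(old, new):
    """List the *.pp files present in `new` that a push could have changed."""
    if is_null(old):  # new branch: check every manifest in the tree
        cmd = ["git", "ls-tree", "-r", "--name-only", new]
    else:
        cmd = ["git", "diff", "--name-only", "--diff-filter=ACMR", old, new]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [p for p in out.splitlines() if p.endswith(".pp")]


def is_valid(new, path):
    """Validate one file, as it exists in commit `new`."""
    content = subprocess.run(["git", "show", f"{new}:{path}"],
                             capture_output=True, check=True).stdout
    with tempfile.NamedTemporaryFile(suffix=".pp") as tmp:
        tmp.write(content)
        tmp.flush()
        result = subprocess.run(["puppet", "parser", "validate", tmp.name])
        return result.returncode == 0


def main(lines):
    ok = True
    for old, new, ref in parse_updates(lines):
        if is_null(new):  # branch deletion: nothing to validate
            continue
        for path in changed_manifests(old, new):
            if not is_valid(new, path):
                print(f"{ref}: {path} failed validation", file=sys.stderr)
                ok = False
    return 0 if ok else 1  # a non-zero exit makes git refuse the push


# git exports GIT_DIR when it runs hooks; checking for it is a choice made
# in this sketch so the file can also be imported without reading stdin.
if __name__ == "__main__" and os.environ.get("GIT_DIR"):
    sys.exit(main(sys.stdin))
```

Saved as an executable `hooks/pre-receive` in the server-side repository, it runs on every push, so nobody has to remember to install anything locally.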

Syndicated 2015-04-27 00:00:00 from Steve Kemp's Blog

skx-www upgraded to jessie

Today I upgraded my main web-host to the Jessie release of Debian GNU/Linux.

I performed the upgrade by changing wheezy to jessie in the sources.list file, then ran:

apt-get update
apt-get dist-upgrade

For some reason this didn't upgrade my kernel, which remained the 3.2.x version. That failed to boot, due to some udev/systemd issues (lots of "waiting for job: udev /dev/vda", etc, etc). To fix this I logged into my KVM-host, chrooted into the disk image (which I mounted via the use of kpartx), and installed the 3.16.x kernel, before rebooting into that.

All my websites seemed to be OK, but I made some changes regardless. (This was mostly for "neatness", using Debian packages instead of gems, and installing the attic package rather than keeping the source-install I'd made to /opt/attic.)

The only surprise was the significant upgrade of the Net::DNS perl-module. Nothing that a few minutes work didn't fix.

Now that I've upgraded, the SSL-issue I had with redirections is no longer present. So it was a worthwhile thing to do.

Syndicated 2015-04-18 00:00:00 from Steve Kemp's Blog

Subject - Verb Agreement

There's pretty much no way that I can describe the act of cutting a live, 240V mains-voltage, wire in half with a pair of scissors which doesn't make me look like an idiot.

Yet yesterday evening that is exactly what I did.

There were mitigating circumstances, but trying to explain them would make little sense unless you could see the scene.

In conclusion: I'm alive, although I almost wasn't.

My scissors? They have a hole in them.

Syndicated 2015-04-14 00:00:00 from Steve Kemp's Blog

Some things get moved, some things get doubled in size.

Relocation

We're about three months away from relocating from Edinburgh to Newcastle and some of the immediate panic has worn off.

We've sold our sofa, our spare sofa, etc, etc. We've bought a used dining-table, chairs, and a small sofa, etc. We need to populate the second-bedroom as an actual bedroom, do some painting, & etc, but things are slowly getting done.

I've registered myself as a landlord with the city council, so that I can rent the flat out without getting into trouble, and I'm in the process of discussing the income possibilities with a couple of agencies.

We're still unsure of precisely which hospital, from the many choices, in Newcastle my wife will be stationed at. That's frustrating because she could be in the city proper, or outside it. So we need to know before we can find a place to rent there.

Anyway moving? It'll be annoying, but we're making progress. Plus, how hard can it be?

VLAN Expansion

I previously had a /28 assigned for my own use, now I've doubled that to a /27 which gives me the ability to create more virtual machines and run some SSL on some websites.
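The "doubled" part is simple prefix arithmetic, which Python's standard `ipaddress` module can confirm. The 192.0.2.0 range below is a documentation-only stand-in, not the real allocation.

```python
import ipaddress

# 192.0.2.0 is reserved for documentation; the real range is not shown here.
old = ipaddress.ip_network("192.0.2.0/28")
new = ipaddress.ip_network("192.0.2.0/27")

print(old.num_addresses)                       # 16
print(new.num_addresses)                       # 32
print(new.num_addresses // old.num_addresses)  # 2
```

Each bit removed from the prefix length doubles the block, so going from /28 to /27 turns 16 addresses into 32.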

Using SNI I've actually got the ability to run SSL on almost all sites. So I configured myself as a CA and generated a bunch of certificates for myself. (Annoyingly few tutorials on running a CA mentioned SNI so it took a few attempts to get the SAN working. But once I got the hang of it it was simple enough.)

So if you have my certificate authority file installed you can browse many, many of my interesting websites over SSL.

SSL

I run a number of servers behind a reverse-proxy. At the moment the back-end is lighttpd. Now that I have SSL setup the incoming requests hit the proxy, get routed to lighttpd and all is well. Mostly.

However redirections break. A request for:

  • https://lumail.org/docs

Gets rewritten to:

  • http://lumail.org/docs/

That is because lighttpd generates the redirection and it only sees the HTTP connection. It seems there is mod_extforward which should allow the server to be aware of the SSL - but it doesn't do so in a useful fashion.

So right now most of my sites are SSL-enabled, but sometimes they'll flip to naked and unprotected. Annoying.

I don't yet have a solution..

Syndicated 2015-04-11 00:00:00 from Steve Kemp's Blog

Moving to Newcastle

Although things are not 100% certain it seems highly likely we'll be moving to Newcastle in five months time.

If I seem distracted/absent/busy over the next month or two this will be a good excuse!

Syndicated 2015-03-14 00:00:00 from Steve Kemp's Blog
