Older blog entries for Pizza (starting at number 110)

dmraid sucks. mdraid sucks less. 3Ware FTW.

About two hours ago I was happily puttering around in a shell on the server that hosts most of my digital existence -- until, after a routine software update, I was suddenly presented with some disturbing sights:

[pizza@stuffed ~]$ yum
Bus error
[pizza@stuffed ~]$ dmesg
-bash: /bin/dmesg: Input/output error

It seems that something had gone very, very wrong. I tromped home to find the console full of disk error messages. But that shouldn't have brought the system down; the OS was installed on a pair of drives set up in a RAID1 mirror. There's no excuse for a read failure -- if one drive failed, the other should have picked up the slack and kept the system working.

Well, *should*. I made the mistake of setting up the RAID1 mirror using the "dmraid" tools, which piggyback on the motherboard's "fakeraid" metadata. Unfortunately, it seems that this mode of operation doesn't handle failures worth a damn -- and there's no easy way to migrate from the "dmraid" stuff to the very mature and robust native Linux "mdraid" tools.

The mdraid tools have one major disadvantage though -- they are much more difficult to boot from, and from the motherboard's perspective, if the primary drive fails, the array becomes unbootable. So by using dmraid instead of mdraid, I ended up trading one (minor) failure case for a much more serious one.

How does one work around this quandary? By going for a real hardware RAID controller. The ones with onboard processors that do all the heavy lifting. The ones that hide the messy details from the OS. The ones that are designed to JustWork(tm) and NotFail(tm).

The ironic thing is that this server already has a hardware RAID controller in it, a 3Ware 9550SXU-8LP, but it's maxed out with two 4-drive RAID5 arrays totalling about 7TB. This server's predecessors both had 3Ware RAID controllers -- and in my decade-long experience with 3Ware controllers, I've not had a single controller-related failure, ever. They JustWork(tm), handling too-numerous-to-count drive failures cleanly and transparently.

So, I just ordered another 3Ware 9550SXU-4LP controller to plug my OS array into. It's used so I got it for cheapcheapcheap, and like the one I already have it's two generations older than their current top-line models, but it'll be more than good enough for a simple RAID1 array. Migrating the dmraid array over to the new controller will be an interesting proposition even without the flaky drive complicating things, but since downtime is unavoidable thanks to dmraid's failings, I might as well make sure things are done right.

If I hadn't picked up another hardware controller, I'd have migrated to mdraid and booted off of a USB stick. That solution has worked great at work, and those USB sticks are trivial to back up and replicate in case of failures.
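One consolation with mdraid is that a degraded mirror is at least trivial to detect from userspace, so a monitoring script can scream before the second drive goes. A minimal sketch (the mdstat sample used in testing is synthetic, not from this server):

```shell
# Return success (0) if an mdstat-format file shows an array with a
# missing member. A healthy two-disk RAID1 reads "[UU]"; a degraded
# one reads "[U_]" or "[_U]". Pass /proc/mdstat on a real system.
mdstat_degraded() {
    grep -q '\[[U_]*_[U_]*\]' "$1"
}
```

Hook that into cron with a mail command and you'll hear about a dead mirror half before it becomes a dead array.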

In other news, the tally from this weekend is 5581 RAW photos taking up some 62Gigs of space. I've been too busy to post anything, but as my backlog empties out, you'll be seeing more here again.

Syndicated 2011-02-07 18:31:29 from Solomon Peachy

On 802.11

apenwarr waxed poetic about various aspects of the 802.11 standard, and as someone who knows far more about the specs than any supposedly-sane person ever should, I think I can clarify a few things.

First, if you're trying to get up to speed with it, I suggest you start with the original 802.11-1999 spec. Everything else is built on top of this, and the current 1200 pages or so of the (incomplete!) rollup spec is due to the numerous amendments, which generally just add complexity and confusion when you don't know what the core protocol is about.

Second, you have to remember that 802.11 is a *radio* protocol, not a wired one, and consequently its fundamental Physical (PHY) layer is completely different from a wired PHY. That said, your two basic questions actually stem from the 802.2/3 heritage that 802.11 builds on. But first, your bit about antennae.

Radio waves aren't constrained to nice neat cables. They radiate outwards in all directions. They bounce off of things, and these bounced signals may make it to the receiver too. Since these bounced signals have to travel further, they arrive later than the original signal, causing the ghosts some of you may remember from the analog TV days. This phenomenon is called multipath interference, and there's a ton of complexity in receivers to detect and deal with it.

802.11n gets its speedups over 802.11g from three things: better modulation (65Mbps vs 54Mbps); wider bandwidth (40MHz channels vs 20MHz, doubling throughput to 130Mbps); and finally support for multiple simultaneous streams (which takes us up to 600Mbps with four streams, though nothing I've seen supports more than 300Mbps with two streams). The problem is that you can't transmit multiple streams from the same antenna; they'll interfere with each other. That same multipath mess that caused problems before is instead deliberately harnessed -- but to do that, you need multiple antennae, one for each spatial stream. Similarly, the receiver needs multiple antennae, at least one per spatial stream.
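The stream scaling is just multiplication. A quick back-of-the-envelope using the short-guard-interval per-stream rate (150Mbps at 40MHz), which is where the 300 and 600Mbps headline figures come from:

```shell
# 802.11n headline PHY rates: per-stream rate times spatial streams.
# 150 Mbps assumes a 40 MHz channel, the top modulation (MCS 7
# equivalent), and the short guard interval.
per_stream=150
for n in 1 2 3 4; do
    echo "$n stream(s): $((per_stream * n)) Mbps"
done
```

(With the long guard interval the per-stream figure is 135Mbps, and real-world throughput is of course well below either.)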

Meanwhile, wired ethernet is not considered "reliable", but compared to wireless it is bulletproof. Wired ethernet can detect collisions as they happen, since every transmitter shares the same wire (CSMA/CD), but wireless transmitters have no way of knowing whether the receiver is being locally interfered with. This is the reason for adding positive acknowledgements and retransmissions at such a low level.

Similarly, stations may be (and often are) highly mobile, and may drop off of a network at any time. If the station connects to a different access point, how is the rest of the network to know that it's moved? By making association an explicit action, the AP knows to send a notification to the rest of the network to update their MAC address tables. Which brings us to "why can't we join networks simultaneously?" Fundamentally, it's because each radio only has one MAC address. If the station supported using multiple MAC addresses, then it could join multiple networks. There are other factors in play (mainly synchronization/timing; a STA is slaved to the AP's clocks), but that's one of the big ones. Oh, and the 802.11 spec can't just assume everyone's using IP, and there are 802.11 chipsets that support multiple MAC addresses.

Disassociate messages are there to explicitly tell the AP to free up the resources that the STA is using. It's not strictly necessary, but instead a highly useful optimization when you consider the bigger picture of multiple APs servicing the same logical network and that a single AP can only handle so many STAs before they interfere themselves into oblivion or simply run out of resources. (Anyone who's been to tradeshows with public wifi has seen this for themselves). Also keep in mind that the AP can also send out disassociations to force the STA to hand off to a different AP or if the AP has to go away for some reason (such as switching channels due to radar interference). Explicit notifications are always preferable to implicit ones, especially on a highly unreliable medium.

QoS stuff is (unfortunately) here to stay, and provides tangible throughput improvements by adding additional mechanisms to reduce collisions and minimize round trips (and their latencies, the real throughput killer) -- something that "over-provisioning" simply can't deal with. Remember, you can bond more wires together ad nauseam, but you can't just add more RF spectrum to achieve the same thing. Again, radio, being a shared medium, is completely different. The nodes all have to be smart enough not to step on each other's toes, or the whole house of cards collapses.

802.11 as it stands now is actually pretty well designed; it's complicated because it is trying to solve some very complicated problems. Trying to grok the whole thing at once is migraine-inducing, but if you start from the original 802.11-1999 spec, work your way through the amendments chronologically, and keep in mind its ethernet heritage (and the fundamental differences RF brings over a hardline connection) it'll make more sense more quickly.

Anyway, I'll shut back up now..

Syndicated 2010-11-09 16:44:57 from Solomon Peachy

17 Apr 2010 (updated 9 Nov 2010 at 18:09 UTC) »

Asus G50Vt quirks with Linux

Over the months I've owned this laptop, I've run into quite a few little quirks, and have been slowly knocking them out one by one. Tonight I managed to get the last of 'em quashed, and I thought I'd write everything up in case anyone else finds this stuff useful.

Among the quirks: (and note these are current as of 2.6.32)

Buggy IOMMU. If you enable virtualization in the BIOS, the kernel spews out boatloads of iommu warnings/errors due to some kind of glitch. It's presumably a kernel bug, but until it's fixed, you gotta add 'intel_iommu=off' to the kernel command line.

Keyboard lagging. This was particularly annoying, but adding 'i8042.noloop i8042.nopnp' to the kernel command line made quite a difference. It ain't perfect, but it's at least no longer infuriating. It's worth noting that the laggy keyboard also afflicts operation under Vista -- only I can't do anything about that!

ExpressCard slot not working post-suspend. Due to a kernel bug, the PCIExpress hotplug driver won't work unless the 'pciehp.pciehp_force=1' option is added to the kernel command line. Now I can hotplug ExpressCards left and right, yay.

Post-suspend, the ethernet interface doesn't come back up at GigE speeds. Apparently the network chipset somehow decides to stop advertising 1000Mbps support to the switch, so the link isn't autonegotiated properly. The workaround is pretty simple: as root, run:

'ethtool -s eth0 speed 1000M'
and the hardware re-enables 1000M operation and negotiates with the switch properly.

Headphone jack not working. When something's plugged into the headphone jack, the speakers are muted -- and so is the headphone jack. If you plug it in loosely, you hear both the speakers and the headphones simultaneously. Something's screwy with the HDA codec routings! This was a particularly annoying one to solve, and has been broken since the 2.6.28 kernel. There's been a patch rotting in ALSA's bugtracker for more than a year now.

Tonight I got sick of the headphone jack screwiness and started poking around, which led me to the 'hda-analyzer' tool; while toggling the various routing lines referenced by the patch, I suddenly started hearing output from the headphones. Plugging and unplugging them JustWorks(tm). Apparently the default routing for the headphone jack is incorrect. A bit more digging led me to the hda-verb tool, which can then be invoked with:

hda-verb /dev/snd/hwC0D0 0x21 SET_CONNECT_SEL 0x0d
And voila, everything JustWorks(tm). Yay!

I added a rule to the pm-utils scripts that kicks both ethtool and hda-verb to fix things up after a suspend, and I'm as happy as a clam now.
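That pm-utils rule is simple enough to sketch. Something like the following, dropped into the sleep.d directory (the filename is my invention; the two commands are the ones described above, so don't copy them verbatim onto different hardware):

```shell
#!/bin/sh
# Hypothetical /etc/pm/sleep.d/99-g50-fixups -- re-apply the two
# post-resume fixes this laptop needs. pm-utils calls hooks with
# "suspend"/"hibernate" on the way down and "resume"/"thaw" coming back.
case "$1" in
    resume|thaw)
        # Re-advertise 1000Mbps so GigE renegotiates with the switch
        ethtool -s eth0 speed 1000M
        # Restore the headphone jack's connection selector
        hda-verb /dev/snd/hwC0D0 0x21 SET_CONNECT_SEL 0x0d
        ;;
esac
```

Remember to mark the hook executable, or pm-utils will silently skip it.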

When combined with the 'asusg50oled' app that does something useful with the little OLED display above the keyboard, this laptop is actually more functional in Linux than it is under Vista -- It won't come out of a suspend properly there -- and I can even directly control the LEDs around the touchpad. oooo.

Other joy I encountered included a slightly buggy ExpressCard CompactFlash adapter. Modern CF cards, being essentially full PATA implementations, can operate at UDMA speeds -- current high-end cards can sustain 90MB/s throughput, well in excess of the ~25-ish that the best USB-based readers can accomplish. I picked up a 'SIIG ExpressCard/54 R/W' adapter, which promises to let me run all-out. So, once I got the hotplug problem fixed up, I slapped in my UDMA-capable CF card.. and.. barely managed 20MB/s, actually worse than my USB reader.

Further investigation showed that the kernel PATA layer was saying that the '80-wire detection' was failing, forcing the card to revert to at most UDMA-33 speeds. Apparently the CF adapter wasn't reporting the proper cable status -- CF cards are by definition 40-pin, but the cables are nonexistent, so they can operate at full UDMA/133 speeds if they support it. Unfortunately, SIIG used the same PCI subvendor/model ID as the reference design for the PATA chipset... so there was no way to hack in a kernel quirk to work around this buggy hardware. But not all was lost!

There was a way to force an override, but only on a per-adapter basis -- and since the ExpressCard is removable, each hotplug event assigns it a new adapter ID. I was able to work around this by forcing the kernel to default to a "short 40-pin cable" mode, and explicitly setting the hard drive and DVD drive to full sata operation -- adding 'libata.force=short40c,1:sata,2:sata' to my kernel command line. And now my "300X" CF cards sustain 40MB/s. Whee!

So what's my cmdline after all of this?

"ro root=/dev/sda7 rhgb pciehp.pciehp_force=1 intel_iommu=off SYSFONT=ter-u16b LANG=en_US.UTF-8 KEYTABLE=us libata.force=short40c,1:sata,2:sata i8042.noloop i8042.nopnp"
Yeah, it's a mouthful..

Syndicated 2010-04-17 04:33:17 (Updated 2010-11-09 18:09:47) from Solomon Peachy

13 Oct 2008 (updated 18 Nov 2008 at 04:11 UTC) »

To PostGIS Or Not To PostGIS, that is the question...

Development of Photo Organizer has slowed down lately, in part thanks to RealLife(tm) getting in the way, but mostly due to the remaining feature requests becoming increasingly more invasive. This isn't to say that these features aren't a good idea, but rather that due to PO's craptastic code structure, a seemingly "minor feature" would require a major internal overhaul.

Features like replacing the internal permission model with a finer-grained group-based model. Moving to a real templating engine. Better "social" features. Adding an external RPC API. Adding some sort of caching of search results or other complex queries that involve permission tests. And so on.

One deceptively simple feature request is to integrate PostGIS support. While PO currently extracts GPS data out of images and stores it in the database, it doesn't really do anything useful with that data. Integrating PostGIS support would instantly give PO access to a very powerful geospatial backend that can tie in to all sorts of other spatially-aware systems. There is a near-endless list of upsides, even if PO never uses anything more advanced than spatially-aware searching.
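As a concrete taste of what that buys you, here's roughly what a spatially-aware search collapses into. This is entirely hypothetical -- the table and column names are invented for illustration, not PO's actual schema:

```shell
# Emit a sample PostGIS query: photos within 5 km of a point.
# ST_DWithin on geography values takes its distance in meters;
# 4326 is plain WGS84 lon/lat, i.e. what comes out of EXIF GPS tags.
query=$(cat <<'SQL'
SELECT id, title
  FROM photos
 WHERE ST_DWithin(geography(geom),
                  geography(ST_SetSRID(ST_MakePoint(-80.60, 28.08), 4326)),
                  5000)
SQL
)
echo "$query"
```

One indexed WHERE clause, versus a pile of hand-rolled bounding-box arithmetic in PHP.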

The downsides, however, are doozies -- from an administration perspective rather than a code perspective. First, due to the level of effort it would take to make PostGIS support optional, we'd have to require it across the board. PostGIS is not part of the standard PostgreSQL distribution, and would consequently make setting up a PO installation more difficult. It would greatly complicate upgrading an existing PO installation to a newer version of PostgreSQL and/or PostGIS, and upgrading to newer PO releases could also get more complex.

So all of that said, PostGIS support would be interesting and cool, but is it necessarily the right direction to take? I know PO is already used by at least one municipality to hold photos relating to their tax rolls, but without a better idea of real-world workflows, I don't know what PO can do to better tie in to the rest of their (or anyone else's) systems.

Meanwhile, regardless of PO's support for PostGIS, more user-visible features like "pull up a google map with locations of this set of photos marked" can be implemented, and now that I have a GPS widget for my camera, I'm actually interested in such things. :)

I get nearly no feedback from PO users; indeed aside from the freshmeat subscriber stats I really have no idea how many folks actually use PO. My best efforts with Google show a few dozen public PO installations, including at least two which the admins have independently translated into Russian. Come on folks, send me patches so all users can benefit from this work!

So, peanut gallery, any thoughts?

Syndicated 2008-10-13 15:46:41 (Updated 2008-11-18 04:11:58) from Solomon Peachy

Photo Organizer 2.36 is (finally) out

It's been stuck in -rc status for four months. There was much less feedback this time around, which can be attributed to less interest -- or perhaps the code's just been more robust. We'll see.

There are many more user-visible changes than usual this time around, including a nice dark theme, pretty URLs, and per-folder/album thumbnails. Oh, and a 40x speed improvement on a hot-path SQL query. Yikes.

Each release has made PO's internals less obnoxious and easier to change, but I've hit another brick wall and the next set of internal improvements will be pretty invasive, with no real user-visible benefit.

Unfortunately, development has slowed down considerably lately, in part due to RealLife(tm).. but as always, it's nice to get feedback.

I also just switched PO over to using git. Due to differences in the usage model (from svn), there was no easy way to migrate the old history into the same repo and still follow git's best practices. C'est la vie.

Syndicated 2008-08-18 00:41:35 from Solomon Peachy

Photo Organizer 2.35

Yeah, Photo Organizer 2.35 came out two weeks ago, but I'd figure I should toot my own horn a little bit.

A lot of work went into making client/event management more, well, manageable. Multi-day events and the ability to directly tie clients to events tie into date-based searching, making it easy to find out just what you shot at any given point in time.

Also new is pluggable authentication, two-step registration, sortable folder/album listings, much (much) faster exporting, plus a large pile of under-the-hood changes to facilitate future features. Oh, and an Italian translation.

v2.35a will probably be released this week with a small pile of bugfixes. Most of these bugs were found while testing out changes made to the development trunk.

On that note, there are a lot of cool things in the pipeline for v2.36; the most visible of which is a new theme! Rickard Olsson got the ball rolling and contributed a dark theme, which I then mangled a bit and committed. When combined with pretty URLs and per-folder thumbnails, things look pretty slick. It's funny just how effective superficial changes can be.

Syndicated 2008-02-20 03:00:46 from Solomon Peachy

More ES1 gutenprint goodness

Gutenprint has accepted my second patch, so it now has a working Selphy ES1 raster driver. Unfortunately, it still requires a custom print spooler, but I'm now one step closer.

es_print_assist.c is now updated to properly poll the printer status, so it can now take the raw dump from gutenprint and shove it out to the printer with minimal delay.

The third step will be to rework it so that it can deal with an arbitrary file on stdin, properly parsing the dumpfile to determine length and paper type.. and for step four, adapting it into a proper CUPS backend. Yay.

Syndicated 2007-11-24 00:15:28 from Solomon Peachy

One patch accepted, one more to go..

The fine folks behind Gutenprint accepted my patch to support the Canon Selphy ES series, but thanks to a boneheaded mistake on my part, what got committed didn't actually work. So there's a fixup patch pending.

The real fun, however, is the need to write a custom CUPS backend to properly spool data to the printer. I have a little helper app (es_print_assist.c) that batches the writes properly, but it dumbly waits instead of properly polling the printer for its status. CUPS is a lot more complicated to figure out than gutenprint, so further progress will be much slower.

Meanwhile, Photo Organizer 2.35 is coming along nicely; I'm at the point where I have to decide whether to go into -rc stabilization now, and save the next round of invasive changes for 2.36, or go ahead and make one or more of those changes now.

In particular, I want to be able to have PO auto-generate full-resolution JPEGs from the source RAW images. On the surface this is straightforward, but I want to implement this properly, by genericizing the "generate a down-scaled image and apply this set of transforms to it" code. This way additional sizes would be trivially easy to add, as would some of the changes I have in mind to make watermarking much more useful. Progress has been slow, but I'm almost done getting the low-level bits in place.

Anyway. Tons of stuff to do, never enough time..

Syndicated 2007-11-15 18:25:43 from Solomon Peachy

11 Nov 2007 (updated 11 Nov 2007 at 17:53 UTC) »

The joy of photo printers (and free software)

For some time now, I've wanted to pick up a compact photo printer to take with me on assignment, with the blessings of those I am taking photographs for. A little under two weeks ago, I finally did, purchasing a Canon SELPHY ES1.

It's a sweet little printer, using the old technique of dye-sublimation to create true continuous-tone prints, rather than the glorified halftoning that even the best inkjet printers use. Not only do the prints come out looking indistinguishable from what a photo lab would produce -- they're water- and smudge-proof.

I did my homework; apparently the majority of Canon's dyesub printers were supported under Linux via the gutenprint drivers, but not the ES1 specifically. No big deal, it should just work. Even in the absence of direct Linux printing, I could print from the camera directly or shove a memory card into the printer. All in all, things should Just Work.

They didn't.

My first test involved taking a few converted-from-RAW JPEG images out of my archives, copying them to a CF card, and trying to print that. I got a rather crass Incompatible JPEG Format error message out of the printer. Interestingly, my camera also errored out on those images, complaining that The image could not be displayed. After some heavy digging, it turns out the printer makes heavy use of the EXIF data, and if it's not present (or in many cases, simply modified!) the printer gives up. WTF? Why can't Canon document what it needs in a JPEG rather than just displaying a useless error message?

As I shoot RAW images, not being able to convert, crop, tweak, then print a random image via a CF card seriously sucked. So, I'll try Plan B: Print the images from the camera via the universal PictBridge interface.

No good.

Apparently my Nikon D200 camera can't print RAW images. WTF? Even if the camera could only print JPEGs, the NEFs have a full-res embedded JPEG image in the file that would print just fine. Sigh. Onto Plan C: Print directly from my laptop.

No good.

Apparently the SELPHY ES1 is incompatible with Canon's older dyesubs. To some extent I expected this, as it uses a different ribbon/dye pack, but that's mostly because the printer's physical engine is oriented differently -- and it's also why I bought this model over the others. Thanks to this incompatibility, I can't print from Linux either. Onto Plan D: Print from Windows. Surely that will work, right?

Sort of.

The printer worked just fine from Windows... but the prints were all quite dark. Too dark. After some digging, I found the driver's options panel and knocked the brightness up a few notches.. and while not perfect (yellow-ish color balance, mostly) the images were finally acceptable. But this would mean I'd need to boot into Windows to print, which really sucks as the rest of my RAW workflow is Linux-based.

Fortunately, the printer is USB-based, which means that thanks to a wonderful tool called Snoopy2, it's trivial to get a full dump of the entire communications chain between the printer and its driver. Armed with this dump, I could figure out the protocol and hack support into gutenprint.

After an initial learning curve, I succeeded. I was able to generate a binary dump indistinguishable from what Windows generated (except, of course, for the image data). So, cackling with glee, I proceeded to dump this out to the printer.

No good.

The device write() apparently blocked on the very first chunk of data. After much experimentation, I discovered that the logical chunks of data needed to be broken apart and written separately. The initialization sequence and the Yellow, Magenta, and Cyan image data all needed to have pauses between them (the printer sends a status message when it's ready) or the printer's USB interface locks up altogether. Sigh. So I split apart my dump file into its logical chunks, and dump them separately to the printer.
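For anyone retracing this, the manual spool step boils down to something like the sketch below. The device path and chunk filenames are placeholders, and a real spooler should poll for the printer's ready status instead of sleeping -- which is exactly what the eventual CUPS backend needs to do:

```shell
# Write each logical chunk of the dump (init sequence, then the
# Yellow/Magenta/Cyan planes) as a separate write, pausing between
# them; slamming it all out in one go wedges the printer's USB
# interface. The sleep is a crude stand-in for waiting on the
# printer's ready message.
spool_chunks() {
    dev="$1"; shift
    for chunk in "$@"; do
        cat "$chunk" >> "$dev"
        sleep 1
    done
}
# Hypothetical usage:
#   spool_chunks /dev/usb/lp0 init.bin yellow.bin magenta.bin cyan.bin
```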

Success!

Not only did it print, but the brightness and color balance looked great. Yes, the images look much better than what their Windows driver manages to put out.

Ah, I love Free Software. When it doesn't JustWork(tm), you can fix it so it does.

All that remains is getting my patch integrated into upstream gutenprint, and figuring out a way to intelligently spool the printer data in a CUPS-compatible manner.

Oh, this was the first image I printed:

I took it last weekend at Paradise Beach. I have no idea who this guy is, but he was out kite-surfing on a windy but otherwise beautiful day.

Oh, as a footnote -- about a month before I ordered my ES1, Canon announced its successor models, the ES2 and ES20. Same basic specs, but when untethered the printers have fancier (and faster) feature sets. I needed a printer for next weekend (November 16-18) and nobody had a useful ETA on when they'd show up, so I bought the ES1 at a discount. On the 9th, four days after I received my ES1, everyone suddenly got them in stock. Sigh.

Syndicated 2007-11-11 14:38:31 from Solomon Peachy
