Recent blog entries for berend

Just had an ancient HP ProLiant die on me: simultaneous failure of a memory bank and a hard disk. Luckily the hard disks were mirrored, phew.

As it was urgent, I bought a server I could pick up: an IBM Express x3100 M4. Warning: do not get this server. On a good day the thing takes 5 minutes to get to disk boot! Very time consuming to get that going. Also needed to apply an urgent firmware update, but had no clue how that worked. Tried to run the IBM UpdateXpress ibm_utl_uxspi tool, but somehow that didn't seem to want to work on Ubuntu. Tried everything to get that going: booting an openSUSE disk, booting a SUSE Linux Enterprise Server 11 rescue disk; it simply didn't want to work.

In the end I discovered that I had to use the IBM Bootable Media Creator (BoMC) utility. It doesn't run on Ubuntu. So I installed a supported OS, SUSE Linux Enterprise Server 11, in VMware Workstation, and could then run BoMC. Once you know how updating firmware works, it works quite well, but boy is this hard to figure out. Took me a day of trying things before I hit on this. And five-minute boot times don't help. The IBM readmes are written by lawyers, not for people in a hurry.

This IBM X Server is apparently nice with something called Integrated Management Module II, but I couldn't get that going, because I needed an activation key which I didn't have for some reason. I'll email IBM support, see if that helps.

Pity I was in a hurry, else I would have gotten a System76 or iXsystems box; they just work without getting in the way.

Always had lots of issues playing back HD content over minidlna or mediatomb to basically any player, i.e. my Android tablet/phone or Chromecast: just stuttering all the time. First thought it might have been BubbleUPnP, or MX Player, but once I got a phone that supported 5GHz WiFi I could confirm that the problem disappeared on 5GHz. Had read people saying it was WiFi interference, but didn't believe them, as that didn't appear to be the case when checking WiFi networks with Wifi Analyzer. But what do I know: once I switched my channel from 1 to 6 all my issues disappeared. Just weird, throughput on channel 1 was usually not a bottleneck (could download at 9Mbit/s easily), but probably there was indeed enough congestion to cause stutter. Very happy to have this fixed, makes DLNA so much more enjoyable.

Weird issue with Apache 2.4 and the geoip module: geoip didn't properly detect the remote address, even though the remoteip module was enabled and working. Had to recompile the geoip module to get it to work. Perhaps a bug in an earlier version, or an issue with Ubuntu 14.04.

16 Nov 2015 (updated 3 Dec 2015 at 00:55 UTC) »

Weirdest error today with printing. Upgraded my FreeBSD box (recompiled from source, 10.2p6), and after that wanted to print from my recently upgraded Ubuntu 15.10 desktop. It didn't want to print; got "Filter failed" in the cups jobs screen. First I thought it was due to the FreeBSD upgrade, but after a while I figured out that an Ubuntu 12.04 laptop still printed fine, as did a PC-BSD 10.2 laptop.

Error messages on FreeBSD cups server were not very helpful, with things like:

(/usr/local/libexec/cups/filter/rastertopdf) stopped with status 1.
Or
prnt/hpcups/HPCupsFilter.cpp 530: cupsRasterOpen failed, fd = 0
Anyway, once I figured out it must be my recently upgraded Ubuntu 15.10 desktop, which I probably hadn't printed from since the upgrade, I tried removing and reinstalling the printer. That didn't help. The final magic was a suggestion:
lpadmin -p HP-LaserJet-cm1415fnw -m raw
After that things worked. Weird, weird, weird.

Continually see this in svn 1.8.11 server log currently:

Provider encountered an error while streaming a REPORT response.  [500, #0]
A failure occurred while driving the update report editor [500, #103]

Get this when doing a checkout/update. Client says:

svn: E120190: Error retrieving REPORT: An error occurred during authentication

Many reports of the same problem all over the internet, going back many years. No solution.

The only thing that works is svn 1.6 client. Go figure.

Setup is an Apache 2.2 server with the latest WANdisco svn 1.8, and the svn repository is NFS mounted. But that doesn't matter: even when making the repository local it doesn't work.

Upgraded my Ubuntu 14.04 Trusty Tahr from WANdisco 1.8.8 to 1.8.11; doesn't help.

Another update on the Linux NFS server problems I had: massively high i/o by process jbd2/xvda1-8.

Problems came back. So switched from paravirtualisation to hvm. And initially that appeared to work.

But now, any time we launch another php5-fpm server, jbd2/xvda1-8 immediately jumps to its max. Tearing my hair out.

5 Jul 2014 (updated 20 Aug 2014 at 23:53 UTC) »

An update on the Linux NFS server problems I had: massively high i/o by process jbd2/xvda1-8.

The initial explanation I settled on was a file system corruption issue. Repairing that didn't actually work, even though initially it looked like it did.

Then I thought it might be a Linux kernel issue. But even changing that didn't permanently fix it.

Issue just appeared again: added one more php5-fpm server to the mix, an identical copy of another one, and bang problem came back.

My latest solution: I was using an lvm2 stripe, switched to an md stripe. Seems to work so far.

Update: didn't work, see http://www.advogato.org/person/berend/diary/401.html

26 Jun 2014 (updated 26 Jun 2014 at 23:33 UTC) »

Suffered from very slow scp to a FreeBSD server. rootbsd said the hpn patch should help, but that it was already applied in FreeBSD 10. That was the pointer I needed.

By default FreeBSD has some kind of scp transfer performance improvement patch:


# Disable HPN tuning improvements.
#HPNDisabled no


When I changed this to:

HPNDisabled yes

my scp performance jumped from about 300KB/s to 8MB/s in this particular instance. Now applying this to all my FreeBSD boxes; on my local network performance jumped from 58MB/s to 71MB/s.

I think the HPN tuning should be off by default!
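To put those numbers side by side, here is a quick back-of-the-envelope calculation. It is only a sketch: the function name is made up for this post, and I'm assuming decimal units, so 8MB/s is taken as 8000KB/s.

```python
def speedup(before, after):
    """Ratio of the new transfer rate to the old one (same units for both)."""
    return after / before


# Remote transfer: about 300KB/s before, 8MB/s after (assumed 8000KB/s).
remote = speedup(300, 8000)

# Local network: 58MB/s before, 71MB/s after.
local = speedup(58, 71)

print("remote: %.1fx faster" % remote)
print("local:  %.1f%% faster" % ((local - 1) * 100))
```

So the gain is large over the slow link and comparatively modest on the local network, which fits the "in this particular instance" caveat.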

As an update to my last post on weird Ubuntu 12.04 NFS4 server load: repairing the ext file system actually didn't really work.

Redid the test, problems came right back.

Next thing I did was turn the root volume into xfs: better, but still writing 1MB/s, with 50% i/o utilisation for the root disk.

So perhaps a Linux kernel thing. Created a 13.10 NFS server, problem disappeared.

28 Mar 2014 (updated 3 Apr 2014 at 03:28 UTC) »

Really weird issue yesterday trying to move a customer to AWS. Testing was all fine, but when we switched the ip address, the system ground to a halt.

The cause was the NFS server, which became unresponsive, so the web and php5 farms stopped. Using iotop I found out that this was caused by the jbd2 process, jbd2/xvda1-8 in my case. jbd2 basically was at 100% i/o. Initially I thought perhaps the instance was faulty, so I built a new NFS server (simply replaying my ansible script). Got exactly the same behaviour on the new server: all i/o ground to a halt as jbd2 took over as soon as I did even simple things like checking out a Drupal repository (so a single client, doing an svn co).

But why would jbd2 kick in? With iostat -x 1 I determined that we were writing 5MB/s to the root file system. That made no sense. There is nothing on this NFS server that would do that. All data is on separately mounted EBS disks. The root file system is ext4, but all the other file systems were xfs! And the clients only mount the xfs file systems.
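A rough way to cross-check the iostat figure is to take two snapshots of /proc/diskstats and compute the write rate yourself. The sketch below assumes the standard Linux layout, where the tenth whitespace-separated field is the count of sectors written, in 512-byte units. The function names and the sample counters are made up for illustration; xvda1 is simply the root device from this post.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors


def sectors_written(diskstats_text, device):
    """Return the 'sectors written' counter for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major minor name reads reads_merged sectors_read ms_reading
        #         writes writes_merged sectors_written ...
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise ValueError("device not found: " + device)


def write_rate_mb_per_s(before_text, after_text, device, interval_s):
    """Write throughput in MB/s (10**6 bytes) between two snapshots."""
    delta = sectors_written(after_text, device) - sectors_written(before_text, device)
    return delta * SECTOR_BYTES / interval_s / 1e6


# Hypothetical snapshots taken 5 seconds apart (numbers invented for the demo).
before = "202 1 xvda1 1000 0 80000 500 2000 0 400000 900 0 700 1400\n"
after = "202 1 xvda1 1000 0 80000 500 2600 0 450000 990 0 760 1490\n"
print("xvda1: %.2f MB/s written" % write_rate_mb_per_s(before, after, "xvda1", 5.0))
```

On a real box you would read /proc/diskstats, sleep for the interval, read it again, and pass both strings in.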

Using a suggestion to debug what's going on, I tried:

echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable

Waited for a minute and then did:

cat /sys/kernel/debug/tracing/trace

Got a lot of lines like:

nfsd-943   [000] 8559086.521147: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0
nfsd-942 [000] 8559086.527871: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0

OK, clearly the NFS daemon causes a lot of fsync() calls (datasync 0 in the trace means a full sync rather than fdatasync()). But why would this have an effect on the root file system?
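With a minute's worth of trace it helps to tally the lines instead of eyeballing them. Below is a minimal sketch that counts ext4_sync_file_enter events per task, device and inode. The regular expression is my own and only assumes the line format shown above (plus an optional flags column that some kernels print before the timestamp); tally_syncs is a made-up name.

```python
import re
from collections import Counter

# Matches lines such as:
# nfsd-943   [000] 8559086.521147: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0
TRACE_RE = re.compile(
    r"^\s*(?P<task>\S+)-(?P<pid>\d+)\s+\[\d+\]\s+(?:\S+\s+)?[\d.]+:\s+"
    r"ext4_sync_file_enter:\s+dev\s+(?P<dev>\d+,\d+)\s+ino\s+(?P<ino>\d+)\s+"
    r"parent\s+(?P<parent>\d+)\s+datasync\s+(?P<datasync>\d+)"
)


def tally_syncs(trace_text):
    """Count sync events per (task name, device, inode); other lines are ignored."""
    counts = Counter()
    for line in trace_text.splitlines():
        m = TRACE_RE.match(line)
        if m:
            counts[(m.group("task"), m.group("dev"), int(m.group("ino")))] += 1
    return counts


# The two sample lines from this post.
sample = (
    "nfsd-943   [000] 8559086.521147: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0\n"
    "nfsd-942 [000] 8559086.527871: ext4_sync_file_enter: dev 202,1 ino 30703 parent 30666 datasync 0\n"
)
for (task, dev, ino), n in tally_syncs(sample).most_common():
    print(task, dev, ino, n)
```

Once one inode stands out, something like find / -xdev -inum 30703 should map it back to a path on that file system.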

After more googling I found this comment:
Problem vanished after fsck'ing my ext4 partitions

Huh? Worth a try. Stopped NFS server, mounted root disk on another server, ran fsck:

# fsck /dev/xvdf
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
cloudimg-rootfs: clean, 62399/524288 files, 378952/2097152 blocks

and reattached. Problem solved!!

I have no explanation for this behaviour, except that maybe the latest Ubuntu 12.04 LTS AMI has a bad disk.

PS: I now believe I was wrong, see this update.
