19 Mar 2013 mjg59   » (Master)

Using pstore to debug awkward kernel crashes

The problem with Samsung laptops bricking themselves turned out to be down to the UEFI variable store becoming more than 50% full and Samsung's firmware being dreadful, but the trigger was us writing a crash dump to the nvram. I ended up using this feature to help someone get a backtrace from a kernel oops during suspend today, and realised that it's not been terribly well publicised, so.

First, make sure pstore is mounted. If you're on 3.9 then do:

mount -t pstore /sys/fs/pstore /sys/fs/pstore

For earlier kernels you'll need to find somewhere else to stick it. If there's anything in there, delete it - we want to make sure there's enough space to save future dumps. Now reboot twice[1]. Next time you get a system crash that doesn't make it to system logs, mount pstore again and (with luck) there'll be a bunch of files there. For tedious reasons these need to be assembled in reverse order (part 12 comes before part 11, and so on) but you should have a crash log. Report that, delete the files again and marvel at the benefits that technology has brought to your life.

[1] UEFI implementations generally handle variable deletion by flagging the space as reclaimable rather than immediately making it available again. You need to reboot in order for the firmware to garbage collect it. Some firmware seems to require two reboot cycles to do this properly. Thanks, firmware.

comment count unavailable comments

Syndicated 2013-03-19 18:31:24 from Matthew Garrett

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!