Older blog entries for mjg59 (starting at number 375)

Using pstore to debug awkward kernel crashes

The problem with Samsung laptops bricking themselves turned out to be down to the UEFI variable store becoming more than 50% full and Samsung's firmware being dreadful, but the trigger was us writing a crash dump to the nvram. I ended up using this feature to help someone get a backtrace from a kernel oops during suspend today, and realised that it's not been terribly well publicised, so.

First, make sure pstore is mounted. If you're on 3.9 then do:

mount -t pstore /sys/fs/pstore /sys/fs/pstore

For earlier kernels you'll need to find somewhere else to stick it. If there's anything in there, delete it - we want to make sure there's enough space to save future dumps. Now reboot twice[1]. Next time you get a system crash that doesn't make it to system logs, mount pstore again and (with luck) there'll be a bunch of files there. For tedious reasons these need to be assembled in reverse order (part 12 comes before part 11, and so on) but you should have a crash log. Report that, delete the files again and marvel at the benefits that technology has brought to your life.

[1] UEFI implementations generally handle variable deletion by flagging the space as reclaimable rather than immediately making it available again. You need to reboot in order for the firmware to garbage collect it. Some firmware seems to require two reboot cycles to do this properly. Thanks, firmware.

comment count unavailable comments

Syndicated 2013-03-19 18:31:24 from Matthew Garrett

Supporting third-party keys in a Secure Boot world

It's fairly straightforward to boot a UEFI Secure Boot system using something like Shim or the Linux Foundation's loader, and for distributions using either the LF loader or the generic version of Shim that's pretty much all you need to care about. The physically-present end user has had to explicitly install new keys or hashes, and that means that you no longer need to care about Microsoft's security policies or (assuming there's no exploitable flaws in the bootloader itself) fear any kind of revocation.

But what about if you're a distribution that cares about booting without the user having to install keys? There's several reasons to want that (convenience for naive users, ability to netboot, that kind of thing), but it has the downside that your system can now be used as an attack vector against other operating systems. Do you care about that? It depends how you weigh the risks. First, someone would have to use your system to attack another. Second, Microsoft would have to care enough to revoke your signature. The first hasn't happened yet, so we have no real idea how likely the second is. However, it doesn't seem awfully unlikely that Microsoft would be willing to revoke a distribution signature if that distribution were being used to attack Windows.

How do you avoid that scenario? There's various bits of security work you need to do, but one of them is to require that all your kernel modules be signed. That's easy for the modules in the distribution, since you just sign them all before shipping them. But how about third party modules? There's three main options here:

  1. Don't support third party modules on Secure Boot systems
  2. Have the distribution sign the modules
  3. Have the vendor sign the modules

The first option is easy, but not likely to please users. Or hardware vendors. Not ideal.

The second option is irritating for a bunch of reasons, and a pretty significant one is license-related. If you sign a module, does that mean you're endorsing it in some way? Does signing the nvidia driver mean that you think there's no license concerns? Even ignoring that, how do you decide whose drivers to sign? We can probably assume that companies like AMD and nvidia are fairly reputable, but how about Honest John's Driver Emporium? Verifying someone's identity is astonishingly expensive to do a good job of yourself, and not hugely cheaper if you farm it out to a third party. It's also irritating for the driver vendor, who needs a separate signature for every distribution they support. So, while possible, this isn't an attractive solution.

The third option pushes the responsibility out to other people, and it's always nice to get other people to do work instead of you. The problem then is deciding whose keys you trust. You can push that off to the user, but it's not the friendliest solution. The alternative is to trust any keys that are signed with a trusted key. But what is a trusted key? Having the distribution sign keys just pushes us back to option (2) - you need to verify everyone's identity, and they need a separate signing key for every distribution they support. In an ideal world, there'd be a key that we already trust and which is owned by someone willing to sign things with it.

The good news is that such a key exists. The bad news is that it's owned by Microsoft.

The recent discussion on LKML was about a patchset that allowed the kernel to install new keys if they were inside a PE/COFF binary signed by a trusted key. It's worth emphasising that this patchset doesn't change the set of keys that the kernel trusts - the kernel trusts keys that are installed in your system firmware, so if your system firmware trusts the Microsoft key then your kernel already trusts the Microsoft key. The reasoning here is pretty straightforward. If your firmware trusts things signed by Microsoft, and if a bad person can get things signed by Microsoft, the bad person can already give you a package containing a backdoored bootloader. Letting them sign kernel modules doesn't alter the power they already have over your system. Microsoft will sign PE/COFF binaries, so a vendor would just have to sign up with Microsoft, pay $99 to Symantec to get their ID verified, wrap their key in a PE/COFF binary and then get it signed by Microsoft. The kernel would see that this object was signed by a trusted key and extract and install the key.

Linus is, to put it mildly, unenthusiastic about this idea. It adds some extra complexity to the kernel in the form of a binary parser that would only be used to load keys from userspace, and the kernel already has an interface for that in the form of X509 certificates. The problem we have is that Microsoft won't sign X509 certificates, and there's no way to turn a PE/COFF signature into an X509 signature. Someone would have to re-sign the keys, which starts getting us back to option (2). One way around this would be to have an automated service that accepts PE/COFF objects, verifies that they're signed by Microsoft, extracts the key, re-signs it with a new private key and spits out an X509 certificate. That avoids having to add any new code to the kernel, but it means that there would have to be someone to run that service and it means that their public key would have to be trusted by the kernel by default.

Who would that third party be? The logical choice might be the Linux Foundation, but since we have members of the Linux Foundation Technical Advisory Board saying that they think module signing is unnecessary and that there's no real risk of revocation, it doesn't seem likely that they'll be enthusiastic. A distribution could do it, but there'd be arguments about putting one distribution in a more privileged position than others. So far, nobody's stood up to do this.

A possible outcome is that the distributions who care about signed modules will all just carry this patchset anyway, and the ones who don't won't. That's probably going to be interpreted by many as giving too much responsibility to Microsoft, but it's worth emphasising that these patches change nothing in that respect - if your firmware trusts Microsoft, you already trust Microsoft. If your firmware doesn't trust Microsoft, these patches will not cause your kernel to trust Microsoft. If you've set up your own chain of trust instead, anything signed by Microsoft will be rejected.

What's next? It wouldn't surprise me too much if nothing happens until someone demonstrates how to use a signed Linux system to attack Windows. Microsoft's response to that will probably determine whether anyone ends up caring.

comment count unavailable comments

Syndicated 2013-02-27 14:38:33 from Matthew Garrett

Linux Foundation Secure Boot support released - what does it mean?

The UEFI bootloader that the Linux Foundation have been working on has just been released. That means we now have two signed bootloaders available - this one and shim.

Does this mean Linux distributions can now support Secure Boot?

They've actually been able to for a while. Ubuntu shipped with Secure Boot support last October, and Fedora shipped with Secure Boot support in January. Both used Shim rather than the Linux Foundation loader, and Shim's also being used by a variety of smaller distributions. The LF loader is a different solution to the same problem.

Is the Linux Foundation the preferred loader for distributions?

Probably not in most cases. One of the primary functional differences between Shim and the LF loader is that the LF loader is based around cryptographic hashes rather than signing keys. This means that the user has to explicitly add a hash to the list of permitted binaries whenever a distribution updates their bootloader or kernel. Doing that involves being physically present at the machine, so it's kind of a pain.

Why use it at all, then?

Being hash based means that you don't need to maintain any signing infrastructure. This means that distributions can support Secure Boot without having to change their build process at all. Shim already supports this use case (and some distributions are using it), but the LF loader has nicer UI for managing it.

Any other reasons?

Actually, yes. Shim implements Secure Boot loading in a less than entirely ideal way - it duplicates the firmware's entire binary loading, validation, relocation and execution code. This is necessary because the UEFI specification doesn't provide any mechanism for adding additional authentication mechanisms. The main downside of this is that the standard UEFI LoadImage() and StartImage() calls don't work under Shim. The LF loader hooks into the low-level security architecture and installs its own handlers, which means the standard UEFI interfaces work. The upshot is that you can use bootloaders like Gummiboot or efilinux without having to modify them to call out to Shim.

Why doesn't Shim do the same?

The UEFI architecture is slightly complicated. The UEFI specification itself defines the upper layers of the firmware, basically covering everything that UEFI applications and operating systems need. It doesn't define the lower layers of a UEFI implementation. Those are contained in the UEFI Platform Initialization spec, and that's what defines the security architecture interfaces that the LF loader hooks into. The problem is that it's completely valid to implement the UEFI specification without implementing the Platform Initialization specification, and if anyone does that then the LF loader will fail.

Can't you try both approaches?

Yes, and that's actually pretty much the plan now. I'm working on integrating the LF loader's UI and security code into Shim with the aim of producing one loader that'll satisfy the full set of use cases, and James is happy with this.

Which should I use?

Depends. If you want to support gummiboot (and aren't willing to patch it to call out to Shim), you'll need to use the LF loader. If you want to use key-based signing setups to avoid forcing re-enrolment on updates, you'll need to use Shim. If you're somewhere in the middle, you can probably use either. Once we've got the code merged, you won't have to make a choice.

comment count unavailable comments

Syndicated 2013-02-10 17:16:19 from Matthew Garrett

Samsung laptop bug is not Linux specific

I bricked a Samsung laptop today. Unlike most of the reported cases of Samsung laptops refusing to boot, I never booted Linux on it - all experimentation was performed under Windows. It seems that the bug we've been seeing is simultaneously simpler in some ways and more complicated in others than we'd previously realised.

So, some background. The original belief was that the samsung-laptop driver was doing something that caused the system to stop working. This driver was coded to a Samsung specification in order to support certain laptop features that weren't accessible via any standardised mechanism. It works by searching a specific area of memory for a Samsung-specific signature. If it finds it, it follows a pointer to a table that contains various magic values that need to be written in order to trigger some system management code that actually performs the requested change. This is unusual in this day and age, but not unique. The problem is that the magic signature is still present on UEFI systems, but attempting to use the data contained in the table causes problems.

We're not quite sure what those problems are yet. Originally we assumed that the magic values we wrote were causing the problem, so the samsung-laptop driver was patched to disable it on UEFI systems. Unfortunately, this doesn't actually fix the problem - it just avoids the easiest way of triggering it. It turns out that it wasn't the writes that caused the problem, it was what happened next. Performing the writes triggered a hardware error of some description. The Linux kernel caught and logged this. In the old days, people would often never see these logs - the system would then be frozen and it would be impossible to access the hard drive, so they never got written to disk. There's code in the kernel to make this easier on UEFI systems. Whenever a severe error is encountered, the kernel copies recent messages to the UEFI variable storage space. They're then available to userspace after a reboot, allowing more accurate diagnostics of what caused the crash.

That crash dump takes about 10K of UEFI storage space. Microsoft require that Windows 8 systems have at least 64K of storage space available. We only keep one crash dump - if the system crashes again it'll simply overwrite the existing one rather than creating another. This is all completely compatible with the UEFI specification, and Apple actually do something very similar on their hardware. Unfortunately, it turns out that some Samsung laptops will fail to boot if too much of the variable storage space is used. We don't know what "too much" is yet, but writing a bunch of variables from Windows is enough to trigger it. I put some sample code here - it writes out 36 variables each containing a kilobyte of random data. I ran this as an administrator under Windows and then rebooted the system. It never came back.

This is pretty obviously a firmware bug. Writing UEFI variables is expressly permitted by the specification, and there should never be a situation in which an OS can fill the variable store in such a way that the firmware refuses to boot the system. We've seen similar bugs in Intel's reference code in the past, but they were all fixed early last year. For now the safest thing to do is not to use UEFI on any Samsung laptops. Unfortunately, if you're using Windows, that'll require you to reinstall it from scratch.

comment count unavailable comments

Syndicated 2013-02-09 04:00:15 from Matthew Garrett

The Samsung laptop issue is not fixed

The recent Linux kernel commits avoid one mechanism by which Samsung laptops can be bricked, but the information we now have indicates that there are other ways of triggering this. It also seems likely that it's possible for a userspace application to cause the same problem under Windows. We're still trying to figure out the full details, but until then you're safest ensuring that you're using BIOS mode on Samsung laptops no matter which operating system you're running.

comment count unavailable comments

Syndicated 2013-02-07 18:41:23 from Matthew Garrett

Don't like Secure Boot? Don't buy a Chromebook.

People are, unsurprisingly, upset that Microsoft have imposed UEFI Secure Boot on the x86 market. A situation in which one company gets to determine which software will boot on systems by default is obviously open to abuse. What's more surprising is that many of the people who are upset about this are completely fine with encouraging people to buy Chromebooks.

Out of the box, Chromebooks are even more locked down than Windows 8 machines. The Chromebook firmware validates the kernel, and the kernel verifies the filesystem. Want to run a version of Chrome you've built yourself? Denied. Thankfully, Google have provided a way around this - you can flip a physical switch hidden somewhere in the machine to disable the validation. Doing so deletes all your data in the process, in order to avoid the situation where a physically present attacker wants to steal your data or backdoor your system unnoticed, but after that it'll boot any OS you want. The downside is that you've lost the security that you previously had. If a remote attacker manages to replace your kernel with a backdoored one, the firmware will boot it anyway. Want the same level of security as the stock firmware? You can't. There's no way for you to install your own signing keys, and Google won't sign third party binaries. Chromebooks are either secure and running Google's software, or insecure and running your software.

Much like Chromebooks, Windows 8 certified systems are required to permit the user to disable Secure Boot. In contrast to Chromebooks, Windows 8 certified systems are required to permit the user to install their own keys. And, unlike Google, Microsoft will sign alternative operating systems. Windows 8 certified systems provide greater user freedom than Chromebooks.

Some people don't like Secure Boot because they don't trust Microsoft. If you trust Google more, then a Chromebook is a reasonable choice. But some people don't like Secure Boot because they see it as an attack on user freedom, and those people should be willing to criticise Google's stance. Unlike Microsoft, Chromebooks force the user to choose between security and freedom. Nobody should be forced to make that choice.

comment count unavailable comments

Syndicated 2013-02-04 16:58:42 from Matthew Garrett

The current state of UEFI and Linux

Executive summary: Most things work fine.

Things we know are broken:

  • Some Samsung laptops. The samsung-laptop driver is a slightly weird thing. By 2010 (when it first appeared) most vendors had moved over to using some level of firmware abstraction, either using ACPI or WMI. Samsung still seemed to be stuck around a decade earlier - they were providing a region of memory at a known address, and you'd read that address to find a bunch of offsets. Then you'd write magic values based on those offsets to magic system IO ports based on those offsets and something would happen. Those writes were triggering System Management Mode, a special x86 CPU mode where the processor executes code from memory that the OS can't see, without telling the OS that it's doing so. There's nothing especially new in this (SMM first appeared in the 386sl back in 1990), but it also means that you depend on the system vendor not changing the interface without telling you. Turns out that Samsung apparently changed their platform interface when they moved to UEFI, but didn't actually do anything to prevent old drivers from breaking things - performing exactly the same series of accesses on some modern Samsung laptops gives an uncorrectable machine check exception (in the best case) or destroys your firmware (in the worst case). Given that the driver was written to Samsung's specifications, this is pretty obviously Samsung's fault, but that's probably little consolation to anyone who ended up with a dead laptop. Although, given Samsung's track record, this may not be surprising.

    On the bright side, some of the machines that are affected by this predate Secure Boot, so at least it's not a Secure Boot bug.
  • Some Toshibas won't boot Linux. This turns out to be some staggering incompetence on the part of Toshiba (or, more likely, their third-party vendor) - they managed to leave the signing key out of the database that's used to validate binaries, and managed to leave the signature database signing key out of the database that's used to provide whitelist or blacklist updates. The good news is that this is a blatant violation of Microsoft's Windows 8 certification guidelines, and that seems to have encouraged Toshiba to actually fix their BIOS. The bad news is that any of the affected machines that are currently available are still broken, and Toshiba don't seem to be willing to actually give you the firmware update yet.
  • Some Lenovos will only boot Windows or Red Hat Enterprise Linux. I recommend drinking, because as far as I know they haven't actually got around to doing anything useful about this yet.

Not an amazingly positive list, but as far as I know that's about it - other than some Samsungs, one range of Toshibas and one range of Lenovo desktops, Linux should be fine. If you have any other UEFI system that's unable to install Fedora 18, let me know and we'll do our best to work out what's going on.

comment count unavailable comments

Syndicated 2013-02-01 05:48:37 from Matthew Garrett

ITWire still has problems with basic accuracy

ITWire ran a story on the 12th of January entitled "Fedora still has problems with secure boot". It ends up discussing two issues with the author's experience with the Fedora installer - that he can't reclaim any free space on existing drives, and that the installer didn't automatically add entries for Windows to the grub menu. These are certainly legitimate issues, and I don't want to suggest that it's reasonable for people to have to manually alter their configuration in order to support dual boot. But they're not issues with Fedora's support for secure boot, despite the enthusiasm with which certain people have jumped on the story. We've received one credible report of a secure boot related problem with Fedora on a couple of Toshiba laptops, which appears (and I want to stress that we're still working on diagnosing it) to be a firmware bug rather than any kind of problem with Fedora.

Mistakes happen in journalism, and sometimes there are differences of opinion. But this story is simply wrong. When asked about it in the comments, the author failed to support his position. When contacted, the editor in chief was willing to add a note saying that I disputed the arguments but was unwilling to remove the incorrect claims. As a result, the internet remains full of links and reposts of an article that unashamedly tells users that the current Linux distribution with the best UEFI hardware support has issues with something it has no issues with.

For reasons that are unclear to me, ITWire seems to have some sort of well regarded status in the Australian technical industry. It seems entirely undeserved.

comment count unavailable comments

Syndicated 2013-01-22 21:29:24 from Matthew Garrett

Upcoming conferences

Unfortunately I'm not going to make it to LCA this year, but there's still plenty of UEFI coverage - Dong Wei, Vice President of the UEFI forum, will be giving an introduction to UEFI, and James Bottomley, who's been working on the Linux Foundation's UEFI bootloader solution, will be talking about Secure Boot.

For those of you on this side of the Pacific (or who just can't get enough of listening to my melodious voice), I'll be keynoting the South California Linux Expo next month. This will be a discussion of the more social aspects of our work on Secure Boot over the past year and a half - an epic tail of astonishment, terror, politics, confusion and hard won victories, and the remaining issues that are yet unsolved. Clearly not to be missed. If California's too far for you, then come March I'll be at the North East Linux Fest. I'll be talking about how to make sure that your Linux distribution works with Secure Boot, and, assuming I've got the code finished by then, a little about how to break it.

comment count unavailable comments

Syndicated 2013-01-20 04:02:24 from Matthew Garrett

Experiences with depression

I've been lucky in life. I've known several people who've killed themselves, but I've been friendly with them rather than friends with them. Their deaths have saddened me, but haven't left giant gaping holes that I have trouble imagining filling. I've also been lucky in that I'm not especially predisposed towards depression, and even luckier that the one serious episode I had ended well. That luck means I'm not qualified to speak authoritatively on the topic, but this seems like a good time to share.

Most of 2002 wasn't great for me. Many of my friends had graduated and moved away. My PhD, which was what I'd spent the past few years of my life working towards, was going horribly wrong. The girl I liked didn't like me. Lots of individual small things that didn't fundamentally matter, but which build up into overall feelings of isolation and failure at life. I was spending an increasing number of days not leaving bed and not talking to people. It wasn't that I couldn't have fun - socialising was still pleasurable, and I didn't actively avoid people, but it was always tempered with the knowledge that this was temporary and I'd be returning to the unhappiness afterwards. I couldn't see any sequence of events that would turn my life around and restore my happiness. This was how life was, and this was how life was going to be. I'd irrevocably fucked up, and this was my future.

Looking back, what strikes me is how reasonable this seemed. I could point at specific things that were making me unhappy, and nobody could argue that it was unreasonable to be unhappy about them. There wasn't any point in speaking to people about it, because what would they be able to do except agree that I was justifiably unhappy? I thought about suicide. Not seriously, because overall life still seemed worthwhile, but I could conceive of a level of further unhappiness that would make it seem like the best choice. I don't think it would ever have occured to me to speak to someone about it first. It seemed like it would be the same argument - I'm justifiably unhappy, I already feel like I'm letting my friends down, what could they do other than tell me that my feelings are wrong or make me feel even more guilty? So, when I saw exhortions for people to speak to someone if they felt suicidal, it seemed like they were talking to people who hadn't thought this through as well as me. It felt like I'd thought this all through carefully and rationally and come to a completely logical decision. If changes in circumstances and further consideration made it seem like suicide was the better choice, that would be because it was the better choice. Maybe other people weren't thinking about this as logically as I was. Maybe they'd have their minds changed by speaking to a friend or a professional. I wouldn't. Of course, with hindsight I was rationalising the way I already felt rather than making entirely rational decisions. I could have rationalised myself to death even though there were (in my case) straightforward ways to make my life better.

In any case, I've no idea how close I ever got to that point. Things were at their worst in August - by September I had a new job and new house, and things just got better from there. In the end, the friends I was convinced could do nothing for me ended up giving me the opportunity to find gainful employment and made sure I had somewhere to live. Without them, things might have been different. As it was, I spent less than nine months depressed and it was still the most hellish experience of my life. The thought of returning to that state is terrifying. I was lucky. I might not be again.

There's no terribly good moral here. If I'd known more about depression beforehand, I might have been able to identify what was happening to me and seek professional help. Other than that, I didn't learn anything about how to avoid or deal with depression. The experience didn't make me a better person. I've no advice for people in the same situation. The only thing I gained from it all was the realisation that if I'd felt any worse and knew that this was what I faced for the rest of my life, death might not have been the worst choice I had.

Depression is a huge social problem and we deal with it badly. We refuse to talk about it, and when we do talk about it we mostly limit ourselves to platitudes about how things will get better or placing the blame on depressed people for not wanting to talk to those around them. Sometimes it doesn't get better. Sometimes talking to those around you will make things worse. People need to be aware of what the effects of depression are and get better at identifying it in others, rather than assuming that they'll be able to ask for help themselves. Society as a whole needs to be better at ensuring that professional support is there for people who need it. And, unless we can make massive improvements in our understanding of the causes of depression and effective mechanisms for countering it, we need to accept that it will cost us friends. Let's redirect the anger we feel at their choice to avoid a lifetime of misery into anger at the society that still hasn't done everything it can to help them.

comment count unavailable comments

Syndicated 2013-01-17 15:45:50 from Matthew Garrett

366 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!