[personal profile] mjg59
The Linux kernel lockdown patches were merged into the 5.4 kernel last year, which means they're now part of multiple distributions. For me this was a 7-year journey, which means it's easy to forget that others aren't as invested in the code as I am. Here's what these patches are intended to achieve, why they're implemented in the current form and what people should take into account when deploying the feature.

Root is a user - a privileged user, but nevertheless a user. Root is not identical to the kernel. Processes running as root still can't dereference addresses that belong to the kernel, are still subject to the whims of the scheduler and so on. But historically that boundary has been very porous. Various interfaces make it straightforward for root to modify kernel code (such as loading modules or using /dev/mem), while others make it less straightforward (being able to load new ACPI tables that can cause the ACPI interpreter to overwrite the kernel, for instance). In the past that wasn't seen as a significant issue, since there were no widely deployed mechanisms for verifying the integrity of the kernel in the first place. But once UEFI secure boot became widely deployed, this was a problem. If you verify your boot chain but allow root to modify that kernel, the benefits of the verified boot chain are significantly reduced. Even if root can't modify the on-disk kernel, root can just hot-patch the kernel and then make this persistent by dropping a binary that repeats the process on system boot.

Lockdown is intended as a mechanism to avoid that, by providing an optional policy that closes off interfaces that allow root to modify the kernel. This was the sole purpose of the original implementation, which maps to the "integrity" mode that's present in the current implementation. Kernels that boot in lockdown integrity mode prevent even root from using these interfaces, increasing assurances that the running kernel corresponds to the booted kernel. But lockdown's functionality has been extended since then. There are some use cases where preventing root from being able to modify the kernel isn't enough - the kernel may hold secret information that even root shouldn't be permitted to see (such as the EVM signing key that can be used to prevent offline file modification), and the integrity mode doesn't prevent that. This is where lockdown's confidentiality mode comes in. Confidentiality mode is a superset of integrity mode, with additional restrictions on root's ability to use features that would allow the inspection of any kernel memory that could contain secrets.
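
To make the relationship between the two modes concrete, here's a minimal Python sketch that reads the active mode on a running system. It assumes securityfs is mounted at /sys/kernel/security and that the kernel exposes a lockdown file there listing the available modes with the active one in square brackets (for example "none [integrity] confidentiality"). The function names are hypothetical, chosen for illustration, and aren't part of any kernel or library API.

```python
from pathlib import Path

# Assumed location of the lockdown state file (needs securityfs mounted
# and a kernel built with lockdown support).
LOCKDOWN_PATH = Path("/sys/kernel/security/lockdown")

# Ordered from least to most restrictive; confidentiality is a superset
# of integrity.
MODES = ("none", "integrity", "confidentiality")


def parse_lockdown(text):
    """Return the bracketed (active) mode from the file's contents."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            mode = token[1:-1]
            if mode in MODES:
                return mode
    raise ValueError(f"no active lockdown mode found in {text!r}")


def blocks_kernel_modification(mode):
    """True when root is barred from interfaces that modify the kernel."""
    return MODES.index(mode) >= MODES.index("integrity")


def blocks_kernel_memory_reads(mode):
    """True when root is also barred from inspecting kernel memory."""
    return mode == "confidentiality"


def current_lockdown_mode():
    """Return the active mode, or None if the file can't be read."""
    try:
        return parse_lockdown(LOCKDOWN_PATH.read_text())
    except OSError:
        return None
```

Calling current_lockdown_mode() returns None on kernels without lockdown support, which is distinct from "none" (lockdown available but not active).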

Unfortunately right now we don't have strong mechanisms for marking which bits of kernel memory contain secrets, so in order to achieve that we end up blocking access to all kernel memory. Unsurprisingly, this compromises people's ability to inspect the kernel for entirely legitimate reasons, such as using the various mechanisms that allow tracing and probing of the kernel.

How can we solve this? There's a few ways:
  1. Introduce a mechanism to tag memory containing secrets, and only restrict accesses to this. I've tried to do something similar for userland and it turns out to be hard, but this is probably the best long-term solution.
  2. Add support for privileged applications with an appropriate signature that implement policy on the userland side. This is actually possible already, though not straightforward. Lockdown is implemented in the LSM layer, which means the policy can be imposed using any other existing LSM. As an example, we could use SELinux to impose the confidentiality restrictions on most processes but permit processes with a specific SELinux context to use them, and then use EVM to ensure that any process running in that context has a legitimate signature. This is quite a few hoops for a general purpose distribution to jump through.
  3. Don't use confidentiality mode in general purpose distributions. The attacks it protects against are mostly against special-purpose use cases, and they can enable it themselves.

My recommendation is for (3), and I'd encourage general purpose distributions that enable lockdown to do so only in integrity mode rather than confidentiality mode. The cost of confidentiality mode is just too high compared to the benefits it provides. People who need confidentiality mode probably already know that they do, and should be in a position to enable it themselves and handle the consequences.

Live Kernel Patching

Date: 2020-04-21 10:00 pm (UTC)
From: (Anonymous)
Have you taken into account Live Kernel Patching? It sounds like Kernel Lockdown will break or prevent live kernel patching.

Benefit of the integrity mode

Date: 2020-04-22 03:35 am (UTC)
From: (Anonymous)
Forgive me if you've been asked this question before, but what is the actual benefit of having lockdown with the integrity mode?

Root can not modify the kernel anymore, but it still can modify the userland. How does it help you that the kernel is pristine if your shell is compromised?

What about hibernation in the lockdown mode?

Date: 2020-04-22 11:25 am (UTC)
From: (Anonymous)
I really wanted to use the lockdown mode, but it doesn't allow my machine to hibernate. Are there any plans to address this issue? I'm asking because I use LUKS+LVM setup, and the SWAP partition is encrypted.

Also, since I can't use the lockdown mode in its current form, is there a way to configure the kernel so that it behaves as if it were in lockdown mode? For instance, by setting the following kernel options to off:

CONFIG_DEVMEM:
CONFIG_DEVKMEM:
CONFIG_DEVPORT:
CONFIG_ACPI_APEI_EINJ:
CONFIG_X86_MSR:
CONFIG_KEXEC:
CONFIG_KEXEC_FILE:
CONFIG_ACPI_CUSTOM_METHOD:
CONFIG_PROC_KCORE:
CONFIG_KPROBES:
CONFIG_EFI_TEST:
CONFIG_MMIOTRACE:
CONFIG_DEBUG_KERNEL:

Are there other options that can be set/unset to achieve what the lockdown mode does?
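
As a rough way to audit a kernel build against the list above, here's a minimal Python sketch that reports which of those options are still enabled in a kernel config. It assumes the config text comes from a file such as /boot/config-<version> (the location varies by distribution), and the function name is hypothetical. It only reports build options; it doesn't establish that disabling them is equivalent to what lockdown enforces.

```python
import re

# Options taken from the list in the comment above.
OPTIONS = (
    "CONFIG_DEVMEM", "CONFIG_DEVKMEM", "CONFIG_DEVPORT",
    "CONFIG_ACPI_APEI_EINJ", "CONFIG_X86_MSR", "CONFIG_KEXEC",
    "CONFIG_KEXEC_FILE", "CONFIG_ACPI_CUSTOM_METHOD", "CONFIG_PROC_KCORE",
    "CONFIG_KPROBES", "CONFIG_EFI_TEST", "CONFIG_MMIOTRACE",
    "CONFIG_DEBUG_KERNEL",
)


def enabled_options(config_text, options=OPTIONS):
    """Return the listed options that are built in (y) or modular (m)."""
    enabled = []
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_FOO is not set" and
        # don't match this pattern.
        match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line.strip())
        if match and match.group(1) in options and match.group(2) in ("y", "m"):
            enabled.append(match.group(1))
    return enabled
```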

another integrity solution

Date: 2020-04-22 12:51 pm (UTC)
From: (Anonymous)
Hi all,

I wonder if a 2nd physical CPU which can monitor the 'main' CPU might be a MUCH better solution. Kinda like an 'executive' CPU that notices strange things about the main CPU: parts that, when scanned, turn out to have been modified, or CPU usage that has gotten bigger.

In many smaller systems I put an executive module in to monitor the 'state' of the various system modules. When a module is in a funny state, or in a transient state for too long, the 'executive' kills it.

Given a 2nd and 3rd CPU there might even be a chance to have some real encryption done by the assembled system?

Lots of fun.

Undervolting is impossible with lockdown enabled

Date: 2020-04-22 04:16 pm (UTC)
From: (Anonymous)
I build custom kernels, and short of patching the lockdown feature to allow writes to the MSR device (/dev/cpu/*/msr) it's impossible to actually undervolt the CPU...

Other than that I use it on my general purpose machines without trouble.

read-only Volumes

Date: 2020-04-27 01:31 pm (UTC)
From: (Anonymous)
Hi, let's assume kernel lockdown is active and running in integrity or confidentiality mode, and the filesystem is split into several volumes (e.g. /, /var, /boot, /home) that are mounted with different options (e.g. ro, noexec). Is it possible for root to remount a volume that is mounted read-only? It would be great if lockdown could prevent that (and protect immutable bits too).

kernel_lockdown.7 man page?

Date: 2020-05-06 05:58 pm (UTC)
From: (Anonymous)
On my Ubuntu 20.04 laptop I've enabled 'integrity' mode -- it's remarkable how many web articles mention kernel lockdown without telling you how to turn it on -- and see

[ 537.405854] Kernel is locked down from securityfs; see man kernel_lockdown.7

in dmesg, but I can't seem to find that man page. It's not in latest man-pages repo, and Ubuntu didn't ship anything for it.

Just being afraid

Date: 2020-05-25 07:38 pm (UTC)
From: (Anonymous)
I am afraid that, ultimately, this thing will be employed against the user to prohibit them from running their code on their devices.

Shame on Google for not allowing the user to root Android devices easily. In the past it was much simpler than it is now.

Migration of features from debugfs

Date: 2021-06-16 02:47 am (UTC)
From: (Anonymous)

Unfortunately right now we don't have strong mechanisms for marking which bits of kernel memory contain secrets,

By analogy the block on debugfs in integrity mode doesn't take into account the security or integrity status of individual debugfs nodes. Is there a project to look at how to define a level of safety on these interfaces, or to migrate them from debugfs to other parts of /sys?

I've just come across scenarios where we are developing external hardware but are unable to toggle dynamic_debug messages in debugfs. Disabling lockdown and/or secure boot, or avoiding distribution kernels, seems like overkill. In general, out-of-the-box desktop environments with the usual security are ideal for our developers, except in some of these frustrating edge cases. I'm sure in a month I'll find another one ;)

Profile

Matthew Garrett

About Matthew

Power management, mobile and firmware developer on Linux. Security developer at Aurora. Ex-biologist. [personal profile] mjg59 on Twitter. Content here should not be interpreted as the opinion of my employer. Also on Mastodon.
