Pre-RFC: TrustZone support on AArch64

The purpose of this thread is to discuss what TrustZone support for the AArch64 seL4 kernel might look like, and whether that’s worth the cost.

TrustZone background

The security of a TrustZone-enabled system is achieved by partitioning all of the SoC’s hardware and software resources so that they exist in one of two worlds - the secure world for the security subsystem, and the normal world for everything else. From the point of view of an ARMv8-A processor, the system has two physical address spaces: one for secure transactions and another for non-secure transactions. Page table entries contain a field called the NS (non-secure) bit, which determines whether a page is mapped from the secure or non-secure physical address space. Furthermore, the processor state contains a global NS bit, which can only be modified by software executing at the highest exception level, EL3. When the NS processor state bit is set, the processor is said to be executing in the non-secure state, and can’t issue secure bus transactions, regardless of the values of the NS bits in its translation tables.
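As a rough illustration of the rule described above, the following Python sketch models which physical address space a stage-1 access targets. The function and parameter names are hypothetical, and this is a simplified model rather than architecture pseudocode.

```python
def targets_secure_pas(state_ns: int, pte_ns: int) -> bool:
    """Return True if an access goes to the secure physical address space.

    state_ns: the global NS processor state bit (1 = non-secure state).
    pte_ns:   the NS bit of the page table entry used for the access.
    """
    if state_ns == 1:
        # Non-secure state: secure bus transactions cannot be issued,
        # regardless of the NS bits in the translation tables.
        return False
    # Secure state: the page table entry selects the address space.
    return pte_ns == 0
```

Only the combination of secure state and a page table entry with NS=0 reaches secure memory.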

A typical TrustZone software stack hosts a TEE (Trusted Execution Environment) on secure-world resources, and an REE (Rich Execution Environment) on normal-world resources. A so-called “secure monitor” runs in EL3, and uses the NS processor state bit to implement a coarse context switch called a “world switch”. The simplicity of this coarse context switch minimizes the attack surface of the secure monitor.

seL4 + TrustZone

seL4 would be useful as a secure-world operating system (known as a Trusted Operating System or TOS) running in secure-EL1 (S-EL1), as a secure-world hypervisor (known as a Secure Partition Manager or SPM) running in S-EL2, or as a secure-world guest operating system running in S-EL1 on top of an SPM.

In any case, the AArch64 kernel would have to be extended with the distinction between the secure and non-secure physical address spaces so that it could construct translation structures with mappings from both address spaces, and so that it could enforce a policy which restricts kernel resources to secure memory.

Goals of this thread

  • Survey the applications of TrustZone support for seL4 on AArch64, in order to determine the value of this proposed extension.
  • Identify stakeholders.
  • Explore the range of possible implementations of this proposed extension.
  • Understand the costs to existing configurations of the kernel.

I think that a first pass over some implementation ideas would be a good way to start this discussion. This proposed extension might entail no more than a small patch. If this were true, that would affect the context for the rest of the discussion.

TrustZone logically extends physical addresses with the NS bit. Physical addresses on ARMv8-A are only 48 bits, so it’s sound to actually extend physical addresses at the kernel API level with the NS bit (now as the 49th bit of “augmented” physical addresses). With this approach, seL4 could run in S-EL1 as a TOS without any API changes.
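A minimal sketch of the “augmented” physical address idea, assuming the 48-bit address width stated above and NS stored in bit 48. The helper names are hypothetical and are not taken from the patch.

```python
from typing import Tuple

PADDR_BITS = 48                      # physical address width assumed above
NS_SHIFT = PADDR_BITS                # NS becomes bit 48, i.e. the 49th bit
PADDR_MASK = (1 << PADDR_BITS) - 1


def make_augmented_paddr(paddr: int, ns: int) -> int:
    """Pack a 48-bit physical address and an NS bit into one integer."""
    if paddr < 0 or paddr & ~PADDR_MASK:
        raise ValueError("paddr does not fit in 48 bits")
    if ns not in (0, 1):
        raise ValueError("ns must be 0 or 1")
    return (ns << NS_SHIFT) | paddr


def split_augmented_paddr(aug: int) -> Tuple[int, int]:
    """Return (paddr, ns) from an augmented physical address."""
    return aug & PADDR_MASK, (aug >> NS_SHIFT) & 1
```

Because the NS bit sits above every real address bit, existing address arithmetic and alignment checks on the low 48 bits are unaffected.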

Internally, we would need to extend the kernel’s internal mappings to map both physical address spaces. Furthermore, we would need to implement a policy which restricts kernel resources to secure memory. Logically, untyped memory, which currently distinguishes between kernel memory and device memory, would be augmented with an axis distinguishing between secure and non-secure memory. Non-secure kernel memory would be subject to similar restrictions as device memory.

In order to understand the scope of this proposed implementation for the S-EL1 kernel, I’ve drafted a patch [1]. It doesn’t deal with conditional compilation, and it’s also not quite complete. Nevertheless, at ++202/--79, it does illustrate that this approach doesn’t require much change.

We hit a bit of a snag at S-EL2 (i.e. adding support for running seL4 as a secure-world hypervisor). In AArch64, stage-1 and stage-2 translation tables are actually distinct structures. Currently the AArch64 seL4 kernel does not make this distinction, either internally or in the API. This distinction comes into play at S-EL2, because stage-2 translation tables lack the NS bit. Instead, a translation context consists of two stage-2 tables, one for secure IPAs and one for non-secure IPAs [2]. So, supporting S-EL2 may require a change or conditional extension to the AArch64 hypervisor API. An API change is a big deal, and would be difficult to justify with such a niche application alone. However, such an API change wouldn’t necessarily be S-EL2-specific.
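To make the S-EL2 difference concrete, here is a toy model of a translation context that holds two stage-2 tables, one per IPA space. The class and method names are hypothetical, and each table is flattened into a dictionary from IPA page number to PA page number rather than a real multi-level structure.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Stage2Context:
    # One table per IPA space, since stage-2 entries carry no NS bit.
    secure_ipa_table: Dict[int, int] = field(default_factory=dict)
    nonsecure_ipa_table: Dict[int, int] = field(default_factory=dict)

    def _table(self, ipa_is_secure: bool) -> Dict[int, int]:
        if ipa_is_secure:
            return self.secure_ipa_table
        return self.nonsecure_ipa_table

    def map_page(self, ipa_page: int, pa_page: int, ipa_is_secure: bool) -> None:
        """Install a mapping in the table for the given IPA space."""
        self._table(ipa_is_secure)[ipa_page] = pa_page

    def translate(self, ipa_page: int, ipa_is_secure: bool) -> Optional[int]:
        """Look up an IPA page; None models a translation fault."""
        return self._table(ipa_is_secure).get(ipa_page)
```

The same IPA page number can therefore resolve differently depending on which IPA space it belongs to, which is the distinction an S-EL2 API would have to expose.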

[1] https://gitlab.com/arm-research/security/icecap/sel4/-/tree/nspin/wip/trustzone
[2] https://developer.arm.com/architectures/learn-the-architecture/aarch64-virtualization/secure-virtualization

There was a brief question about this topic previously: Secure mode seL4 builds for ARM

I’m trying to understand why any changes are needed to the kernel. Untypeds can be sorted into secure/insecure by the init process, which hands secure untypeds only to things running in the secure world?

For now, let’s focus on the (Secure-)EL1 kernel. The code implementing AArch64 virtual memory management (mostly in src/arch/arm/64/kernel/vspace.c) would require modification to support the NS bit in page table entries, both in its own virtual address space and in userspace virtual address spaces [1]. The kernel would map the secure physical address space into its own virtual address space. It would also set the NS bit in each userspace page table entry according to whether the frame being mapped belongs to the secure or non-secure physical address space.
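A hedged sketch of the second step, assuming a 4 KiB granule, a level-3 stage-1 page descriptor with the NS bit at bit 5 and the output address in bits [47:12], and an augmented physical address that carries NS in bit 48. All other attribute bits (permissions, memory type, access flag) are omitted, and the function name is hypothetical rather than taken from vspace.c.

```python
PADDR_MASK = (1 << 48) - 1
PAGE_OFFSET_MASK = (1 << 12) - 1          # 4 KiB pages
PTE_VALID_PAGE = 0b11                     # valid level-3 page descriptor
PTE_NS_BIT = 1 << 5                       # NS bit in a stage-1 descriptor


def make_page_pte(aug_paddr: int) -> int:
    """Build a bare stage-1 page descriptor from an augmented paddr."""
    ns = (aug_paddr >> 48) & 1            # 49th bit selects the address space
    paddr = aug_paddr & PADDR_MASK        # real 48-bit physical address
    if paddr & PAGE_OFFSET_MASK:
        raise ValueError("frame address is not page aligned")
    pte = paddr | PTE_VALID_PAGE
    if ns:
        pte |= PTE_NS_BIT                 # map from the non-secure space
    return pte
```

Note that the NS bit and the address end up in different parts of the descriptor, so the augmented address has to be split before it is written into the entry.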

At a higher level, if we wanted to place restrictions on non-secure memory similar to the existing restrictions on device memory, we would have to extend the relevant logic in, for example, the Untyped_Retype handler.

[1] This patch suggests one such modification:
https://gitlab.com/arm-research/security/icecap/sel4/-/commit/044096bc9b51dfd397b09851ab09209002d1bb89#f03944faac7874cfabc61e6243fafb8e2eb683a5

I think adding kernel support for running in S-EL1 is worth pursuing. My understanding is that some hardware platforms never allow certain devices to be accessed from the non-secure world, and so would require a secure-world kernel even if the secure/non-secure partitioning mechanisms aren’t needed.

Logically, untyped memory, which currently distinguishes between kernel memory and device memory, would be augmented with an axis distinguishing between secure and non-secure memory. Non-secure kernel memory would be subject to similar restrictions as device memory.

Wouldn’t non-secure kernel memory be subject to the same restrictions as device memory? Kernel memory has confidentiality and integrity requirements, so when seL4 is operating in S-EL1, non-secure memory can no longer be used for it, as there are no hardware mechanisms that prevent a non-secure OS from accessing that memory from EL2 or EL1, right? However, if no non-secure OS were ever running, could NS memory then be used for kernel memory in a secure-world kernel?

We hit a bit of a snag at S-EL2

Are all references to S-EL2 talking about what is introduced in Armv8.4-A?

In your seL4 summit presentation, you claimed that by running in S-EL2 it was possible to stay in NS=0 and use seL4 for all separation protections. Does this also assume that there isn’t any software running in non-secure EL2?

I apologize for the delayed response.

I propose that non-secure non-device memory should be subject to the same restrictions as device memory, with the exception that it can be used for IPC buffers. This exception would enable a protection domain to run with an address space that only maps non-secure memory. If seL4 were to be used just as a trusted OS in the secure world in a traditional TrustZone firmware stack, there would be no need to enable protection domains running entirely in non-secure memory. However, in a firmware stack like the one I described in my seL4 Summit talk, it would be nice. I can’t think of any confidentiality or integrity implications of allowing non-secure non-device memory to be used as IPC buffers.
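The proposed policy can be sketched as two predicates. This is only an illustration under the assumption that the existing device-memory restrictions amount to “may only be retyped into frames or smaller untypeds, and may not back an IPC buffer”. The function names and object-kind strings are hypothetical.

```python
# Object kinds that restricted untyped memory may still be retyped into.
RESTRICTED_OK = ("untyped", "frame")


def may_retype(is_device: bool, is_secure: bool, object_kind: str) -> bool:
    """Decide whether an untyped may be retyped into object_kind."""
    if is_device or not is_secure:
        # Device memory and non-secure memory: frames and untypeds only,
        # so no CNodes, TCBs, etc. ever live outside secure memory.
        return object_kind in RESTRICTED_OK
    # Secure non-device memory: any kernel object is allowed.
    return True


def may_back_ipc_buffer(is_device: bool, is_secure: bool) -> bool:
    """Decide whether a frame may be used as an IPC buffer."""
    # The proposed exception: security state does not matter here,
    # only device frames are excluded.
    return not is_device
```

Under this sketch a non-secure, non-device frame can hold an IPC buffer, while a CNode can only be created from secure non-device untyped.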

Even if there is no non-secure OS running alongside seL4, you would still want to keep the kernel entirely within secure memory. Unlike secure memory, non-secure memory isn’t protected from non-secure devices on the system.

Yes.

Yes. The idea is for seL4 to use translation tables alone to protect the secure world, in contrast to a typical TrustZone firmware stack where an unverified EL3 monitor uses the NS processor state bit to do so. In the seL4-based firmware stack, the non-secure world now runs on top of seL4 (rather than just on top of the EL3 monitor) with NS=0, but with only non-secure resources mapped into its address space. So, the non-secure world runs in S-EL1 and S-EL0. seL4 occupies EL2, so this design doesn’t permit a non-secure hypervisor.

I’m happy to clarify any of these points if necessary (and in a timely manner).

I apologize for the delayed response.

No problem, I’m sorry my original response was much more delayed :).

I can’t think of any confidentiality or integrity implications of allowing non-secure non-device memory to be used as IPC buffers.

Ok I see your point. It would just mean that an untrusted device could access the contents of the IPC buffer. But this shouldn’t negatively impact the kernel.

Even if there is no non-secure OS running alongside seL4, you would still want to keep the kernel entirely within secure memory. Unlike secure memory, non-secure memory isn’t protected from non-secure devices on the system.

I think that this could be a policy/configuration choice. If there were an additional device memory protection mechanism, like an SMMU, then these devices could still be access controlled, which would allow the kernel to use non-secure memory. A specific example is a system where the non-secure memory space is much larger than the secure memory space, and memory pressure arises from the large number of page table and CNode objects that can only be stored in secure memory.

I can’t think of any confidentiality or integrity implications of allowing non-secure non-device memory to be used as IPC buffers.

If we wanted to support this with minimal changes, then marking all non-secure memory as device untyped would let us avoid creating extra untyped subtypes. How important is it to support protection domains that use only non-secure mappings if they are executing in the secure world? They would already be using secure memory for their page tables, so what are the limitations of using secure memory for the IPC buffer frame too?

You’re right about these kernel structures consuming precious secure memory resources. However, even a single CNode in non-secure memory subjects the entire kernel to the vulnerabilities of non-secure memory. An attacker who manages to write to a CNode in non-secure memory can synthesize a capability for any secure untyped they want. Translation structures, on the other hand, can be placed in non-secure memory without degrading the security of the entire kernel. Neither a translation structure in non-secure memory nor any of its descendants can map secure frames (that is, contain entries with NS=0).
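The property of translation structures can be sketched as follows, assuming the hierarchical NSTable behaviour of stage-1 table descriptors: once a walk from the secure state passes through a table descriptor with NSTable=1, the rest of the walk and the final mapping are treated as non-secure. The function name is hypothetical and this is a simplified model.

```python
from typing import Sequence


def effective_leaf_ns(table_ns_bits: Sequence[int], leaf_ns: int) -> int:
    """Return the NS value that actually applies to the final mapping.

    table_ns_bits: NSTable bits of the table descriptors visited during
                   the walk, root first.
    leaf_ns:       NS bit written in the leaf (block or page) descriptor.
    """
    # Any non-secure table on the path forces the mapping to non-secure,
    # so a leaf NS=0 below such a table has no effect.
    if any(bit == 1 for bit in table_ns_bits):
        return 1
    return leaf_ns
```

This is why corrupting a page table that lives in non-secure memory cannot yield a mapping of a secure frame, whereas corrupting a CNode can yield an arbitrary capability.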

As you mentioned before, there may be cases where seL4 is running with NS=0 not to protect it from non-secure software and hardware resources, but rather just so that it can access secure hardware resources. In such cases CNodes and other kernel objects can live in non-secure memory. So, I agree that this policy should be configurable.

Not very important, I think. While it may be nice in principle for the virtual address space of a non-secure component to be backed entirely by the non-secure physical address space, I haven’t come up with a concrete security concern to justify the addition of a third policy for retyping within the kernel.

So far, I’m thinking that marking all non-secure memory as device untyped may not be the best approach to distinguishing between secure and non-secure memory. On a system implementing TrustZone, the secure and non-secure physical address spaces are actually two distinct 48-bit address spaces. So, the NS translation table entry bit can be thought of as a 49th bit augmenting the usual 48-bit physical address space. Sure, for any 48-bit address addr on a given system at a given point in time, at most one of (addr, NS=0) or (addr, NS=1) is actually backed by memory or another hardware resource. However, on a system with a TrustZone Address Space Controller or TrustZone Protection Controller, memory and hardware resources can be moved between the two physical address spaces at runtime. On such a system, one might need untyped for both (addr, NS=0) and (addr, NS=1). I should note that system features similar to the TZASC and TZPC will become more common and more useful (with details to be released to the public in the coming months).

The need to have untyped covering both physical address spaces doesn’t necessarily complicate our design or make it more invasive. The patch I linked in the first post of this thread simply augments 48-bit physical addresses within the kernel with the NS bit as a 49th bit. There are only a few cases where we need to treat that bit as anything other than a 49th address bit. The main ones are when retyping untyped (we must determine whether the untyped is secure or non-secure in order to exercise the correct policy) and when mapping frames (the NS bit and 48-bit physical address are both present in translation table entries, but are not adjacent).

However, I’ve recently realized a problem with this approach. Not all capability types have 1 bit of padding to spare as an additional address bit! For example, cnode_cap already uses all 128 bits. If only untyped and frames are permitted to be non-secure, then I guess that’s not as much of a problem, but it does complicate things a bit. However, as discussed above, that might not be an appropriate universal policy.

Another concern related to this suggestion has occurred to me. Device untyped is not cleared upon initialization or reuse. I would think that non-secure untyped backed by memory resources (what I proposed to be non-secure non-device untyped) should be cleared upon initialization or reuse like today’s non-device memory.

We already deal with a similar problem on 32-bit platforms that have more than 512 MiB of RAM. The kernel can’t address all of RAM in its virtual map. RAM that is not addressable is considered device untyped, and it is the responsibility of a user-level resource manager to clear it when reusing it in a different security/safety domain.

Is this still the most up-to-date version of your patches? I would be interested in a patch I could run on QEMU’s virt machine without the EL2/virtualization changes (just what is required for EL1 support). I could probably put something together myself based on your branch if you think that would be straightforward.

Delegating this responsibility to user level sounds fine then, especially given the precedent.

Yes. This branch contains my current WIP patch for TrustZone support in the EL1 kernel.

I’ve just now put together a minimal demo based on that branch where capDL, running on secure resources, spawns two protection domains, one on secure resources and the other on non-secure resources. This README contains instructions for building and running it:

https://gitlab.com/arm-research/security/icecap/icecap/-/blob/nspin/wip/trustzone/README.trustzone.md

Here are the accompanying patches for capDL and elfloader:

https://gitlab.com/arm-research/security/icecap/capdl/-/tree/nspin/wip/trustzone

https://gitlab.com/arm-research/security/icecap/minor-patches/sel4/sel4_tools/-/tree/nspin/wip/trustzone

Hi @nspin,

Sorry it took me so long to get back to this.

I was able to build and run your example implementation! Nix made this pretty easy. However, the long initial build time led to a couple of false starts where I ran out of time waiting for it to build and switched back to other tasks. For people trying to run the examples, it may be easier if the final binaries were made available for direct download, or if there were a downloadable Docker container with the Nix store already populated.

My interpretation of the design:

To clarify my own understanding: this example illustrates seL4 running in S-EL1 and managing secure-world system resources, while also being responsible for managing all non-secure system resources, right? This means that the NS exception levels are all unused, and so there is no need to add a mechanism for seL4 to receive and respond to requests from a non-secure-world software stack. Additionally, an seL4 secure-world configuration would be mutually exclusive with an seL4 hypervisor configuration.

A conservative threat model would assume that devices in the NS world could potentially access any non-secure memory and so kernel objects need to be restricted to memory in the secure world address space. In your example, untypeds are created for both S and NS address spaces, but kernel untypeds are only created from secure memory regions and otherwise device untypeds are created.

Page tables are created out of secure memory (as there aren’t any NS kernel objects) and end up with the NS bit set to 0. Kernel frames are then mapped to S or NS memory based on the source untyped that they were created from, and the paddr is used to specify the value of the NS bit in the PTE. The VSpace API doesn’t have to change because of this, but user mode still needs to know how to divide the paddrs from the initial untypeds into the secure and non-secure address spaces.
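A minimal sketch of that user-mode step, assuming each initial untyped is described by a physical address, a size in bits and a device flag (mirroring the fields of seL4_UntypedDesc), and assuming the NS bit is reported in bit 48 of the paddr as proposed earlier in the thread. The Python names are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple


class UntypedDesc(NamedTuple):
    paddr: int        # augmented physical address (NS assumed in bit 48)
    size_bits: int    # log2 of the untyped's size in bytes
    is_device: bool   # True for device untyped


def partition_by_world(
    untypeds: Sequence[UntypedDesc],
) -> Tuple[List[UntypedDesc], List[UntypedDesc]]:
    """Split initial untypeds into (secure, non_secure) lists."""
    secure: List[UntypedDesc] = []
    non_secure: List[UntypedDesc] = []
    for ut in untypeds:
        if (ut.paddr >> 48) & 1:
            non_secure.append(ut)
        else:
            secure.append(ut)
    return secure, non_secure
```

An initial task could use such a split to hand only non-secure untypeds to components that should never touch secure memory.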

I have some questions:

  • Does it matter what timer the kernel uses? There is a separate physical timer for the secure world, but there isn’t a separate virtual timer?
  • Should the TLB maintenance performed by the kernel change at all? It seems to me that it doesn’t as long as the NSTable field is always 0.
  • Is Normal memory allowed to be accessible via both secure and non-secure modes at the same time?
  • Are all IRQs set to Group 0 and routed as IRQs to S-EL1?
  • Looking at the way the system initializes, ATF BL1 and BL2 are started, and then the elfloader is used as the BL31 firmware and switches to S-EL1 before starting the kernel. Am I right in assuming that this means there isn’t any EL3 firmware installed to provide SMC interfaces such as PSCI? Would it be possible to instead load a real BL31 monitor that implements PSCI and have the elfloader loaded as BL32 firmware? It seems to me this would be required to be able to bring up other cores and enable SMP via PSCI.
  • Thinking about how to set up hardware CI for this, do you know if it’s possible to load U-Boot in S-EL1?
  • In the example output, “Hello, non-secure world!” is printed from a process that’s using non-secure memory resources but still running in the secure state, right?

To summarize

I support your proposed idea for initial TrustZone support for the AArch64 kernel provided that my assumptions above are mostly correct.
I think next steps would be to create a GitHub PR and create an actual RFC on the RFC tracker.
Likely these changes will need to go behind a CONFIG macro for now.
What do you think should happen with discussions for S-EL2 support, or running a non-secure OS alongside seL4 and using seL4 as a trusted secure world OS?