Pre-RFC: Arm: Add TCB policy support for controlling user level access to certain system registers

Summary:
There are system registers that are useful to be able to access from user level. Two examples are system registers related to the Arm Generic Timer and registers related to the Performance Monitor Unit. There are already kernel configuration options that enable user access to these registers, however these configurations are all or nothing and aren’t allowed under verified kernel configurations. Instead, a better approach would be to add support for enabling/disabling access to these registers on a per-TCB and per-register basis that would allow the kernel to enable/disable access as part of a regular context switch. Authority to access these registers would need to be held by some hardware control capability.

Motivation:

Certain hardware features are unavailable to user level without explicit user access being enabled by the kernel. The generic timer provides access to 64 bit timestamp counters and per-core timer compare registers that can generate timeout interrupts. User level access allows application software to use these hardware features without interfering with any access requirements the kernel has. An implementation that allows user level access to be restricted to certain TCBs would allow systems to be built where efficient timer drivers can be implemented using the core-local timers and counters.

The performance monitor unit provides a 64bit cycle counter and 32-bit event counters that provide detailed information the performance of executing software. User access allows application software to monitor its own performance with fine-grained control. Being able to restrict access to these registers to specifically authorized TCBs would allow trusted components to use performance monitor counters to monitor system behavior without also allowing access to untrusted components.

Currently, although the kernel supports enabling access to these registers at user level, this cannot be done in a verified configuration as there’s no access-control associated with the registers. Adding the necessary access control would hopefully allow these hardware mechanisms to be used in verified configurations which would allow them to be used in actual production applications.

Guide-level Explanation:

There is currently the following kernel compile-time config options:

  • KernelArmExportPCNTUser: Allow user access to the register containing the Generic Timer physical counter value.
  • KernelArmExportVCNTUser: Allow user access to the register containing the Generic Timer virtual counter value.
  • KernelArmExportPTMRUser: Allow user access to the Generic Timer physical timer control and compare registers that control generation of physical timer interrupts.
  • KernelArmExportVTMRUser: Allow user access to the Generic Timer virtual timer control and compare registers that control generation of virtual timer interrupts.
  • KernelArmExportPMUUser: Allow user access to the PMU registers which control the performance monitor unit and read and write performance counter values.

When enabled, access to the above registers is currently allowed for any thread at user level. This creates communication channels and there’s no way to prevent unauthorized TCBs from tampering with intended register values.

Instead of these register accesses being all or nothing, we would create a new capability type, a hardware control capability, which authorizes its holder to configure TCBs to enable or disable access to these registers. If a TCB has been configured to have access to these registers then the kernel enables user level access when context switching to that TCB. When a TCB doesn’t have access to a register, user access is disabled when switching to that TCB.

As the kernel uses one of the timers in the Generic Timer, user level access to the same timer registers that the kernel uses would never be allowed.

Delegation of authority to enable or disable these system registers would be supported by deriving copies of the hardware control cap. As part of a capability derivation, the scope of authority could be reduced. This means it would be possible to give a “process” the authority to enable time stamp counter access to all threads it creates but restricting it from being able to also give those threads access to the PMU registers.

Revoking access to these hardware capabilities would result in the TCBs losing any access to the system registers that was enabled by the now-revoked capability.

There would be no additional storage added to a TCB object for saving or restoring state during a context switch that is related to these system registers. While there could be some value in being able to save and restore PMU counter state during a context switch, this is out of scope of this proposal.

On hypervisor configurations, the Generic Timer timer registers are already context switched when switching VCPUs. To keep this proposal simpler, these registers would not be available for TCB control as they are already managed by the VCPU object type.

The initial copy of the new hardware control capability would be provided to the root task in the initial CNode and described in the seL4_Bootinfo_t struct.

This mechanism is fairly generic, can be copied over to other architectures and can easily extended to other types of CPU enforced user level access control of hardware mechanisms. (E.G. cache maintenance instructions)

Reference-level explanation:

This section can be added if this gets as afar as an implementation

Drawbacks:

This would increase the amount of work the kernel would need to do during a context switch due to the added TCB lookup for system register access plus the modification of the control registers for changing the access control. It is expected that this additional work is << 5% of a context switch cost and can also be done inside the fastpath. This would need to be checked once there is an implementation.

Rationale and alternatives:

The main alternative is to do nothing. This would mean that the hardware features would continue to be unusable from user level in verified configurations. The mechanisms could still be used in unverified configurations, but this would prevent any production applications that could claim to use a verified configuration.

Prior art:

Linux gettimeofday() uses a user level implementation provided by the vdso that leverages the Generic Timer hardware mechanisms on supported platforms to access the time counters without requiring entering the kernel.

Unresolved questions:

Would this have any significant impacts on the security proofs?

From the verification side, I don’t see any immediate blockers for this.

As you point out, these registers provide communication channels, so for information flow guarantees we might have to forbid their use, but that doesn’t mean you couldn’t use them for other kinds of systems that run verified seL4.

It’s not quite clear to me yet wether having direct access to very precise time would allow a thread to find out information about other threads who do not have access to these registers. If yes, we might have to remove access completely for information flow, but if not, we could potentially model these registers as a common information source+sink, i.e. we would assume as an over-approximation that everyone who has access to them can communicate with everyone else who does. Some of this may also be mitigated by the time protection mechanisms that are under development.

For most real systems that care about performance, having direct access to low overhead, precise timing is crucial. But if such systems also have untrusted components which should get as little as possible timing info, this feature would make it possible to disable it for those processes.

[RFC-16] - Jira lists a use case for PMU.

It’s not quite clear to me yet wether having direct access to very precise time would allow a thread to find out information about other threads who do not have access to these registers. If yes, we might have to remove access completely for information flow

Yes, but the same is true for a shared cache and other shared hardware resources. And usually there are alternative ways of getting precise timing info, like busy looping. So access to timer registers makes some timing attacks easier, but it’s usually not crucial.

It’s also a question of what is visible in the formal model and what is not. If it’s visible, it’ll simply not be possible to prove absence of information flow, because there is visible information flow. If we want to ignore timing channels, we need to make sure they are not visible in that model.

From the thinking in [RFC-16] - Jira it would be easy to just forbid the existence of this kind of cap in the entire capability state when you want to use the information flow theorem, though. I.e. if you can’t use the cap, because it’s been deleted, then there shouldn’t be a problem, and if you do want to use the cap, because the area is safety, not protecting secrets, then you’ll not be using the infoflow theorem and don’t need to satisfy that capability state assumption.

That would just be an assumption on the capDL state of a system, not a general kernel assumption for all systems, so that would be entirely possible.

Does this mean that if you had a scenario with 3 components and 3 labels A, B and C and a policy describing that A cannot read any information from C and that B was completely isolated from A or C, that if you added the capability allowing B to read the system timer, then you wouldn’t be able to prove that A cannot read any information from C anymore?

It depends a little on how exactly the flow becomes visible in the model, but likely yes. The proof doesn’t go by component of a concrete system, it abstractly goes “for any component in the system and any wellformed policy, if this component does an operation, does any other component see something they shouldn’t be seeing according to the policy”. If that fails, it doesn’t really matter if there might be other things in the system for which the theorem still holds, we still can’t get to the end of the infoflow theorem itself.