Pre-RFC: Exposing Cross-Component TCBs via CAmkES Template

Summary
We recently developed a userland scheduled for MCS within the CAmKES service framework. In the process, we encountered an obstacle: The scheduling component itself must have access to the TCBs and SCs of the other application components it is designated to manage. The CAmkES infrastructure, however, in its current implementation does not have a mechanism to give the scheduling component such access. Specifically, the CapDL template instantiation support lacks a mechanism that exposes other TCBs and SCs to a component. We extended the CapDL template support with a function that satisfied this need. We present the modification here for further discussion and possible consideration to be included in the CAmkES baseline.

Motivation
Several userland schedulers have been developed for MCS. As far as we are aware, all have relied upon the native seL4 system calls, instead of using the CAmkES framework. Our use case specifically targets CAmkES, so we extended the CapDL template support. The extension itself is restricted to a single new function in Python. It does not alter any prior existing functions or source code.

Guide-level explanation
The new function is a variation of the “alloc” function used in CapDL templates, called “alloc_ent.” It allocates an object in the CapDL database, and it also exposes the TCBs and SCs of the object via the template.

Reference-level explanation
The specific modification we made is captured in the following patch file:

diff --git a/camkes/runner/Context.py b/camkes/runner/Context.py
index a483e25..5a4724d 100644
--- a/camkes/runner/Context.py
+++ b/camkes/runner/Context.py
@@ -87,6 +87,14 @@ def new_context(entity, assembly, render_state, state_key, outfile_name,
                                      (entity.label(), name), type, label, **kwargs),
                            **kwargs)),
 
+                # [2021/09/21:JCC] Varient of alloc to alloc for a different entity.
+                'alloc_ent': None if cap_space is None else \
+                (lambda name, type, label=entity.label(), **kwargs:
+                 alloc_cap((label, cap_space), cap_space, name,
+                           alloc_obj((label, obj_space), obj_space, '%s_%s' %
+                                     (label, name), type, label, **kwargs),
+                           **kwargs)),
+                           

We called the python function “alloc_ent.” We are not stuck on the name, feel free to rename.

Drawbacks
A poorly designed system could misuse this proposed capability in CapDL, which loosens security somewhat within CAmkES. However, the same basic capability already exists in MCS with the native system calls (by creating and assigning new TCBs and SCs), so the argument can be made that this new function in a CapDL template does not really open a new security concern.

Rationale and alternatives
CAmkES is designed so that special access to kernel data structures is defined through the CapDL and its template framework. Given that, this modest extension to CapDL is likely the best (or at least simplest) approach, assuming it is desireable for CAmkES to support a userland scheduler application.

Hi @jshackleton,

It allocates an object in the CapDL database, and it also exposes the TCBs and SCs of the object via the template.

This explanation doesn’t seem to line up with the code sample which doesn’t do anything specific for TCBs and SCs from what I can tell.

Is it that it allows you to write a template for the scheduler component that knows how to re-allocate SC and TCB caps to threads in other components based on knowing the component instance names and the protocol for how TCBs and SCs are created for each component?

However, the same basic capability already exists in MCS with the native system calls (by creating and assigning new TCBs and SCs), so the argument can be made that this new function in a CapDL template does not really open a new security concern.

I don’t follow here sorry. Can you elaborate more? Are you saying that because it’s possible for a policy to be set up using raw seL4 system calls that it shouldn’t be an issue making it possible to describe the same policy for a CAmkES and CapDL project?

I don’t foresee an issue with this addition, any template context can already achieve the following via direct access to the object and cap databases by directly manipulating the cap_space and obj_space objects. The security model here is that the final CapDL spec can be validated to confirm that the camkes templates didn’t create capabilities that contradict the design of the Camkes ADL file and this can even be done formally using CDL refinement proofs (camkes-tool/cdl-refine-tests at master · seL4/camkes-tool · GitHub).

Hi Kent,

Apologies for the poor explanations. Some clarification:

This explanation doesn’t seem to line up with the code sample which doesn’t do anything specific for TCBs and SCs from what I can tell.

Is it that it allows you to write a template for the scheduler component that knows how to re-allocate SC and TCB caps to threads in other components based on knowing the component instance names and the protocol for how TCBs and SCs are created for each component?

Your description is correct. The function is not creating a new object in the CapDL database, but rather gives access to the existing TCBs/SCs to the scheduler component via the template.

I don’t follow here sorry. Can you elaborate more?

In the current baseline CAmkES, the alloc function in the template language does not allow a component to access the TCBs/SCs of other components. An application created in native seL4, without CAmkES, is under no such restriction. That is, a native seL4 userland scheduler can create new TCBs/SCs and overwrite old TCBs/SCs as much as necessary. My original comment is trying to ask the question, is the restriction in CAmkES (or more specifically, the alloc function) a deliberate security measure? Your comments suggest that it is not.

I don’t foresee an issue with this addition, any template context can already achieve the following via direct access to the object and cap databases by directly manipulating the cap_space and obj_space objects.

We experimented with manipulating the cap_space object directly, and we couldn’t get it to work, mostly because the cap_space object is large and requires a thorough understanding of CAmkES and CapDL internals to traverse (a thorough understanding that I don’t seem to possess).

Thanks for the added explanation.

CAmkES probably has a missing concept for defining “monitor” or “supervisor” relationships between component instances to deal with concerns such as fault handling or scheduling. These relationships are a different class to existing connection relationships because they imply access to an entity’s objects that are also part of it’s standard connections, whereas standard connections are somewhat isolated from each other. Without this concept, you are left with having to implement something internally like what’s proposed here which I think is the best way you could achieve the functionality under the present constraints.

We called the python function “alloc_ent.” We are not stuck on the name, feel free to rename.

What do you think about alloc_unsafe for the name? This does open the door to hidden connections between components that would otherwise appear isolated from looking at the composition part of the CAmkES ADL spec.

Feel free to make a PR on the camkes-tool repo also.

What do you think about alloc_unsafe for the name?

Other names we have brainstormed:

  • alloc_transboundry
  • alloc_extern
  • alloc_other
  • alloc_unchecked

Any preferences?

Seems alloc_unsafe is inspired by Rust? Then the name should do.

Do you have a minimum working example of a CAmkES system somewhere that uses this function? I’d be interested to see this in a full context.

Yes, you can find a basic example here, which includes the above CAmkES modification as a patch:

Hi @jshackleton,
Would you like me to convert this change into a PR against sel4/camkes-tool?