Pre-RFC: Effortlessly developing seL4 systems in Rust

I’ve had a closer look at ferros and selfe-sys, and they achieve most of what we are discussing. The main differences I can see:

  • we probably want less tight cargo integration (i.e. not xcargo-based), which I think is now more easily achievable than when ferros was written (see GitHub - matklad/cargo-xtask, for example, for the post-build step of generating a final image for simulation or hardware)
  • Ahmed is right that ferros goes a bit far with static allocation, i.e. too many things are fixed via types at compile time. Getting that balance right will be the main challenge. I think a typed, but more dynamic, library layer first would be good. Then, based on what kind of system people want to build, something like ferros could be put on top.

E.g. I would propose:

  • build integration similar to how ferros does it, but with some tweaks
  • using mostly the existing seL4 build tools for now (maybe later a rust-based elfloader/boot image builder etc could be done)
  • a generated libsel4-sys untyped library
  • a manually written, typed libsel4 library on top of that, which is still relatively policy-free
  • (then later other layers and libraries)

I’m not quite clear yet on how the build info (config options) should be communicated. ferros uses a sel4.toml file that provides everything and is input to the cmake step. This makes sense to me at the moment, but it’s one of the bits I’m least sure about.

I could see this for a manually written libsel4, but I’m not sure how that would work for the generated untyped libsel4-sys. I’m not even sure it makes sense to publish that layer on crates.io because it’d have to include the kernel.

Edit: to clarify: I’m really not sure. It might well make sense to have the bottom layer in crates.io, but I’m not clear on how. If I understand crates.io correctly, it’d be perfectly fine for that library to be generated, so maybe that all works out.

Currently not.

Does it really make a difference whether the kernel provides those values or a bottom-level generated library? If we really wanted to have them dynamically reported, that could happen entirely in the bottom-level library layer, which is already tightly coupled with the kernel anyway.

Crates have a build.rs file that is built and run before the crate itself is compiled. This allows for doing anything possible in Rust, including accessing the network, downloading files, running cmake, generating files that are later compiled by rustc, etc. People tend to avoid using this in extravagant ways, and some would prefer an extensibility mechanism that doesn’t allow arbitrary code execution; they avoid projects that use it, or at least want to.

Either way, when a crate is downloaded from crates.io, it contains this file and it is built/executed by cargo on your machine as part of the build process. Most of the seL4/Rust integration crates use this in at least some way for some functionality. Hopefully that explains the mechanism.
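As a minimal sketch of the mechanism (file names and contents are hypothetical, not what a real libsel4-sys build would emit): a build.rs writes a generated source file into OUT_DIR, and the crate pulls it in with include!.

// build.rs (hypothetical sketch)
use std::{env, fs, path::PathBuf};

fn main() {
    let out_dir = PathBuf::from(env::var("OUT_DIR").unwrap());
    // A real libsel4-sys would run cmake/bindgen here; this just emits one constant.
    fs::write(
        out_dir.join("generated.rs"),
        "pub const SEL4_WORD_BITS: usize = 64;\n",
    )
    .unwrap();
    println!("cargo:rerun-if-changed=build.rs");
}

// src/lib.rs
// include!(concat!(env!("OUT_DIR"), "/generated.rs"));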

Personally, I’m supportive of using the code generation support in libsel4 to generate a Rust library for a particular arch and using that to build a crate that supports the intersection of Rust and seL4 architectures. This could then be uploaded to crates.io without needing a build.rs file (assuming the ABI issues I noted above are resolved). I don’t support generation based on all configuration parameters at build time, for the following reason.

In Rust, suppose I have four crates: the generated libsel4 (S), two helper crates (H1, H2), and a root task (R), with S, H1, and H2 all published to crates.io. Assuming that libsel4 uses build.rs to generate code, then when building R, all configuration options must be correctly provided and communicated to S. The standard way of doing that for published crates is cargo features. The most important property of features is that they should only add functionality in a backwards-compatible way, because any transitive dependency can turn a feature on, which forces it on for the entire build. So crate H1 can enable a feature that sets the TCB size to one value, while H2 enables a different feature that sets the TCB size to another value. Each might then assume it will get a specific TCB size, but when they are used together, both fail. Hence, cargo features are a bad fit for seL4’s configuration options.
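To make the failure mode concrete, here is a hedged sketch (crate and feature names are hypothetical): cargo takes the union of features across the dependency graph, so S is built exactly once with both size features enabled.

# H1/Cargo.toml
[dependencies]
sel4-sys = { version = "0.1", features = ["tcb_size_small"] }

# H2/Cargo.toml
[dependencies]
sel4-sys = { version = "0.1", features = ["tcb_size_large"] }

# R depends on both H1 and H2, so cargo builds sel4-sys once with
# features = ["tcb_size_small", "tcb_size_large"]; at least one of the
# helpers then gets a TCB size it did not expect.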

I’m actually not sure how the existing crates resolve this entirely, though hopefully that explains part of what makes Rust support (at least in the crates.io sense) complicated.

I agree that Rust features don’t map nicely onto all seL4 configs (e.g. MCS and non-MCS are mutually breaking; they are basically different kernels), so they are probably not going to be the entire solution, but the specific scenario above sounds solvable to me. The resolved feature list of the entire build is the one that is passed to the libsel4-sys crate (generated or not). It is the responsibility of that crate to export the correct values for sizes etc. based on that feature combination, and those exported values/constants are what other crates should use.

Crates that can’t deal with certain sizes or combinations should fail compilation when they detect that (I assume there are compile-time assertions that can check constants against expectations and the like). Well-designed crates would be agnostic to those values and just use them. Crates that target specific embedded platforms with specific requirements might only support one specific combination of features and would be incompatible with others. I don’t think that is very specific to seL4.

For example, the size of a TCB would be a constant defined inside libsel4-sys (as it is now in the C libsel4). The value of that constant would depend on the specific configuration, and its type might depend on the target architecture. Depending on what kind of crate you are implementing, you would either just use that constant, or, if you have specific requirements for its size, you would check against it.
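As a hedged sketch of that pattern (constant name and value are illustrative): libsel4-sys exports the configured value, agnostic crates just use it, and a crate with a hard requirement checks it at compile time.

// In libsel4-sys, generated for the chosen configuration (name and value illustrative):
pub const SEL4_TCB_SIZE_BITS: usize = 11;

// In an agnostic crate: just use the value.
// let tcb_bytes = 1usize << sel4_sys::SEL4_TCB_SIZE_BITS;

// In a crate with a hard requirement: fail the build if the configuration
// doesn't match (const_assert! from the static_assertions crate is equivalent).
const _: () = assert!(sel4_sys::SEL4_TCB_SIZE_BITS == 11);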

Clearly the current approach works, given that there are multiple existing crates that use the kernel as is. My motivation for proposing an alternative is that the current solutions seem difficult to deal with. I personally believe that some small, optional additions to the API would allow implementing a libsel4 variant that is agnostic to the more nuanced configuration options. That would optionally allow for a solution that was easier to deal with.

It’s possible that this solution and its benefits aren’t worth it, though I hope that at least it makes sense why some may find it appealing.

From what I’ve seen in the microkernel repo, the library-level C code only differentiates between the two with an #ifdef KERNEL_MCS or something to that effect. This kind of conditional compilation can be quite ergonomic using Cargo features:

#[cfg(feature = "kernel_mcs")]
// mcs-exclusive interface

#[cfg(not(feature = "kernel_mcs"))]
// non-mcs exclusive interface

A library would add this to its Cargo.toml:

[features]
kernel_mcs = []

# E.g. allows you to put the debug printing API behind a #[cfg(feature = "kernel_debug")]
kernel_debug = []

This would mean that you would have the following in your Rust project:

// Cargo.toml

[dependencies]
sel4 = { version = "0.0.1", features = ["kernel_mcs"] }

This is also where you have some scope for integrating opinions into the codebase: you can set different features as defaults. You could make it so that MCS is active by default, and users can opt out:

sel4 = { version = "0.0.1", default-features = false, features = ["kernel_debug"] }

As I mentioned, cargo features are purely additive, not exclusive. Using them to select between mutually exclusive functionality breaks the ability to reuse any code that opts into such a feature, especially when it gets enabled unknowingly.

We (my employer) have been developing a Rust-based system on seL4 for the last 3 years and we would be very excited to have better Rust/seL4 integration. Sadly, we are currently unable to release the source code (although we remain hopeful for the future). For reference, our system is essentially an OS consisting of a root task that becomes a program loader and a number (~12) of services written as independent executables.

Since we’ve been working with seL4 and Rust for several years, I do have some recommendations for how to integrate the two. I also have anti-recommendations for things that seem like they would be beneficial but are not.

Most importantly, I strongly recommend avoiding using cargo to drive the seL4 build system to build libsel4 and the kernel. This was actually the original approach used by our system (based off of fel4’s libsel4-sys). However, for seL4 systems that consist of multiple Rust executables this means: 1) building seL4 for each executable (or at least each workspace) and 2) doing a separate final seL4 build of the kernel and elfloader (with the same configuration!). Instead, the approach that I think would work well is to have the seL4 build system generate a libsel4-sys crate, or at least key source files for such a crate. This crate can be reused in all Rust executables AND the same seL4 build can be used to generate the kernel and elfloader.

Another major challenge is that Rust code probably needs to know about the seL4 configuration. For example, you may want to optionally enable seL4 debug or benchmarking calls if those APIs exist. Unfortunately, Rust isn’t designed to let higher levels query features of dependencies. So either you use Cargo features and propagate those to all your executable/library crates, or you need to expose the seL4 configuration from a libsel4-sys crate. Our design provides KERNEL_HAS_DEBUG and similar constants in libsel4-sys, and then empty definitions of the debug functions if they are not enabled.
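A hedged sketch of that pattern (the constant is as described above; the function name and signature are illustrative, not an existing API):

// In the generated libsel4-sys, with the value baked in by the seL4 build:
pub const KERNEL_HAS_DEBUG: bool = false;

// When the kernel is configured without debug support, the generator emits
// empty stubs so callers need no conditional compilation of their own:
pub fn debug_put_char(_c: u8) {
    // no-op: the kernel has no debug syscalls in this configuration
}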


Definitely agree about features not mapping appropriately. I think the right way to do this is probably using #[cfg(foo)], set via the environment variable RUSTFLAGS="--cfg foo", instead of features.
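A small sketch of that approach (the cfg name sel4_mcs is illustrative): the flag is set once for the whole build, and code selects on it with #[cfg].

// Set for the whole build, e.g. RUSTFLAGS="--cfg sel4_mcs" cargo build ...

#[cfg(sel4_mcs)]
pub mod mcs {
    // MCS-only interface
}

#[cfg(not(sel4_mcs))]
pub mod legacy {
    // non-MCS interface
}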

As mentioned by others, I too would avoid using cargo; it is great when it works and you are on its happy path. But I feel seL4 is really far off of cargo’s happy path, as cargo doesn’t have any mechanism for running anything after linking, and simple stuff like getting the paths to binary outputs isn’t straightforward.

When I read the first paragraph, I was going to reply that I wasn’t sure cargo supported passing flags like that, but then I realized that you were recommending avoiding cargo, and also that RUSTFLAGS bypasses cargo entirely through the environment, so every crate that is built gets those configs specified (i.e. namespace pollution).

You’re right, some wires got crossed. My bad.

It does, but only via .cargo/config. Perhaps I came off too strongly with regard to avoiding cargo; I was just generally agreeing with the sentiment of others who were advocating against invoking cargo at the top level and running cmake/seL4’s build from build.rs, i.e. against letting cargo drive the top-level build.
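For reference, the .cargo/config variant would look roughly like this (a sketch; the cfg name is illustrative, and newer cargo versions spell the file .cargo/config.toml):

# .cargo/config.toml at the workspace root
[build]
rustflags = ["--cfg", "sel4_mcs"]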

I just consider seL4 as having more targets than cargo, e.g. seL4 has root processes and processes, while cargo just has bin targets. I think it is plausible we use something like corrosion to drive cargo from within cmake, as others have mentioned.

Anyhow, this probably came across and was interpreted more strongly than I intended.

Seems like a valid criticism, but I’m not sure what else could be done, besides having a different target_os for each combination of preprocessor variables. It is the way Rust provides conditional compilation, and these configs would be needed across crates, since they affect system calls and the userspace ABI; it seems like these really need to be globally consistent.

We can hopefully pick nice names, e.g. prefixed with seL4, which will perhaps reduce the impact of namespace pollution. AFAICT there is no specific cfg layout for this, but perhaps we could ask the Rust folks about something like a target_os_features cfg.

I could be wrong, but my interpretation of why it is considered justifiable for so many of these configuration options to impact the ABI is precisely that they should only impact a single crate instead of all of them, and that all other crates should simply use that crate in a way that largely ignores the configuration options (other than arch/MCS, of course).

Perhaps; I need to look at the other options. I was primarily focusing on MCS for the need for global consistency. It seems likely, though, that some of these cfgs would be local to a single crate.

I suppose I view MCS as a temporary option while more verification work goes into it and users are able to migrate, since, as far as I can tell, the only downside to using it is the work required to migrate. In that sense, I’d personally be fine if whatever Rust library is written today simply doesn’t support non-MCS.

I don’t think “MCS only” support is the right way to go. One should not be forced out of using Rust in order to use seL4 in a non-deprecated manner.

If non-mcs deprecation is something that’s planned to happen, then sure.

That is indeed the plan, but it’s still going to take a while (1-2 years) until verification of MCS is done to a degree that we can make it the default.

It sounds like we should properly review the config options we have and decide which ones would need conditional compilation in Rust (such as MCS), and which ones are just values that are provided (such as sizes of objects) through the bottom-layer libsel4-sys crate once configured. (This still leaves the question of consistency between the config options this crate has seen and the kernel build itself.)

If it turns out that it is only MCS that really needs conditional compilation, we could make that part of the target architecture instead of a feature (e.g. seL4 and seL4-mcs). If there are more, then we need to think of something else.

Thanks @ratmice and @samueljero for that input from experience. Let me try to summarise the problems with cargo that have been brought up (and please correct me if I get them wrong).

  1. a seL4 system has more targets than what cargo is built for. In particular, for a typical seL4 system you would want to build: one or more libraries including libsel4-sys, one or more binaries (processes/components), one root task, a bootloader (like elfloader), the kernel, and an overall image with all of these.
  2. finding the build outputs for all of the above from cargo is hard
  3. libsel4-sys, elfloader, and kernel build need to see the same config, and all crates need to see the same libsel4-sys config if that is how config options are propagated.
  4. the seL4 config options don’t map nicely to cargo features
  5. we should avoid rebuilding libsel4-sys, libsel4 or the kernel multiple times for multiple binaries/libraries if driven by cargo

It looks to me like we should still try to make cargo the user-facing interface if we want to encourage the Rust community to use seL4, including using crates.io etc. to publish libraries or components for seL4. Whether cargo then actually drives the build or not could be an independent question.

Is that a reasonable summary?

I think (2) used to be the case but is now achievable via cargo metadata. AFAICS (1) means plain cargo is insufficient, but something like xtask could provide further targets like kernel, sys-image and simulate (kernel could potentially be just a crate). It’d be work to write, but it doesn’t look impossible.
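As a hedged sketch of the cargo metadata route (using the cargo_metadata crate inside an xtask; one could equally shell out to cargo metadata --format-version 1 and parse the JSON):

// xtask helper: locate the workspace target directory instead of hard-coding it.
fn target_dir() -> cargo_metadata::camino::Utf8PathBuf {
    cargo_metadata::MetadataCommand::new()
        .exec()
        .expect("cargo metadata failed")
        .target_directory
}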

A possible cargo setup for a typical seL4 project could be a workspace that includes all libraries and binaries that need to go into the final image. This includes the tasks that would build the kernel and the system image, and one dedicated location for the kernel config (it could be a CMakeLists.txt file, for instance, but could also be something else; the main requirement is that libsel4-sys, the kernel build, and other C builds like elfloader see the same file). Those together should make sure we don’t have unnecessary rebuilds and that everything sees a consistent config (Rust through libsel4-sys, and C/kernel through whatever config file was used to configure/generate libsel4-sys).

With that setup

  • cargo build would build only libsel4-sys and the Rust libraries + binaries
  • cargo image would invoke cargo build and then cmake for kernel + elfloader + image linking. It’d need to figure out where the cargo binaries were left and provide them to cmake. Potentially that step could also just invoke cmake and let cmake invoke cargo build via corrosion to get an overall build through cmake.
  • cargo simulate would invoke qemu with the right parameters (either provided in a script from a cmake build or computed from the values it gets from libsel4-sys + target architecture etc). A sketch of how these subcommands could be wired up follows below.
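A hedged sketch of how those subcommands could be wired up with the xtask pattern (names and layout are hypothetical): cargo aliases forward cargo image and cargo simulate to a small binary crate in the workspace.

# .cargo/config.toml
[alias]
image = "run --package xtask -- image"
simulate = "run --package xtask -- simulate"

# Cargo.toml at the workspace root
[workspace]
members = ["libsel4-sys", "root-task", "xtask"]

# xtask/src/main.rs then dispatches on its first argument and shells out to
# cargo build, cmake, and qemu as described above.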

Would that work?

One thing I think might be good is to add a section on stakeholders; here are the ones I could think of:

  1. kernel developers who primarily interact with the C code, but may want to occasionally look at or interact with Rust code.
  2. end-user developers who currently interact with C-based seL4 code, and may want to migrate or add some Rust-based components.
  3. end-user Rust-focused developers, primarily coming to seL4 from Rust for its Rust support, who want it to reflect normal Rust development practices.
  4. proof engineers who work primarily with the proofs and the C implementation, with the expectation that Rust support is not impactful.
  5. proof engineers focusing on Rust-based verification tools, who don’t currently exist or have anything to prove.

That makes sense. Of the stakeholders above, I’d say my focus is primarily on 1 and 2: being able to swap code out from C to Rust, and providing a source organization that will at least be roughly familiar to existing kernel developers. I certainly feel there is some amount of tension between the first two and the third.

This sounds like a good plan to me if we can get away with it, in that it seems least impactful on stakeholder 3.

I agree, it isn’t impossible. It does likely mean duplicating some stuff between cmake and xtask; perhaps these are (already, in some cases) or could be pulled out into commonly shared shell scripts.

Stakeholder 2 likely has existing cmake C code they want to continue building, while 3 likely wants to just use cargo. If I understand your setup, in summary cargo invokes cmake, which can invoke cargo. It does not seem unfathomable that it could work in a way in which stakeholders 2 and 3 can both coexist.

I think it is a good plan with a reasonable probability of success, with a pathway forward for all the end-user stakeholders…

One thing which might be worth considering is that we could theoretically do a repo checkout from build.rs. I already have a manifest parser, so we’d just need to do the git checkout/network parts, if we want to try that.

Sorry if I’ve rambled a bit and exceeded my word budget.

Yes. The libsel4-sys crate should definitely be buildable via cargo and integrate as a normal cargo dependency into other crates. Otherwise, there will be severe friction with the rest of the Rust community. In my mind, the question is whether we can get a general libsel4-sys working or whether libsel4-sys should be generated from the seL4 build process, essentially only containing the config for the relevant build.

I think this is on the right path. I have a minor concern with the cargo image step. I think this step is easy when the application consists of a single root-task. With multiple processes (aka address spaces), this is harder because, AFAIK, there’s no way to get Rust/LLVM to create a single executable that promises that certain parts won’t share code. Hence, you pretty much always need to combine multiple executables into a root task. In our system, that’s done by tarballing up the elf files and supplying a real loader in the root-task. Other systems might take a more CAMKES like approach.

In either case, this assembly of the root task needs to happen after the executables are created and before the elfloader is linked, in other words right before your cargo image invocation. Hence, cargo image needs to not interfere with easy methods for doing that (build.rs) and needs to be flexible enough to give the user the needed control over what the root task is.
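A hedged sketch of what that assembly could look like (paths and file names are hypothetical, not our actual tooling): the root task’s build.rs archives the service ELFs, and the root task embeds the archive for its loader to unpack.

// root-task/build.rs (hypothetical sketch)
use std::{env, path::PathBuf, process::Command};

fn main() {
    let out = PathBuf::from(env::var("OUT_DIR").unwrap());
    let status = Command::new("tar")
        .arg("-cf")
        .arg(out.join("services.tar"))
        .args(["-C", "../build/services", "service_a.elf", "service_b.elf"])
        .status()
        .expect("failed to run tar");
    assert!(status.success());
    println!("cargo:rerun-if-changed=../build/services");
}

// root-task/src/main.rs
// static SERVICES: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/services.tar"));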

I may be unique in this, but build systems with “too much” magic are the absolute worst to use. Inevitably, I’m trying to modify a build system I didn’t write and don’t fully understand, and the “magic” ends up getting in my way. A simple list of commands (à la Make or shell scripts) can quickly become easier to work with. To be clear, some magic is good, but we should keep in mind that users who don’t fully understand the magic will need to modify and integrate it, especially at key points.