Pre-RFC: To effortlessly develop seL4 systems in Rust

We might want to clarify the jargon of appending -sys to a crate name, as in $CRATE_NAME-sys.

One common use is for a -sys crate to provide raw, often unsafe, bindings to some other interface. For example, uplink-sys is raw bindings to the C library of the same name. You would not typically use uplink-sys directly; rather, you would use uplink, which uses its -sys crate as a compatibility shim.

For seL4, what we ultimately want to provide is a libsel4 as an “Idiomatic, safe, and ergonomic Rust interface to the seL4 microkernel”. This doesn’t mean there won’t need to be a libsel4-sys crate somewhere in the ecosystem.

The -sys in the name exists because most of these crates (at least originally) have logic in their build.rs to detect existing system libraries/headers and dynamically link against those instead of building a local copy. In that sense, the sys means system, i.e. a system crate linking against a system library.

Added to the markdown.

In that case, you won’t see any push-back from me if only supporting MCS makes this project more likely to see some real success.

I’d like to try to focus the discussion on a very concrete scenario that I hope is pragmatic. I’m currently working through the AOS project for fun. One of the goals of the project is to run executables from a remote NFS share. I haven’t gotten that far, but I just ran this simple program on an ODROID-C4:

#include <sel4/sel4.h>

static seL4_Word sos_syscall1(seL4_Word syscall_num, seL4_Word a1) {
    seL4_MessageInfo_t tag = seL4_MessageInfo_new(0, 0, 0, 2);
    seL4_SetMR(0, syscall_num);
    seL4_SetMR(1, a1);
    tag = seL4_Call(1, tag);
    if (seL4_MessageInfo_get_length(tag) != 1) { return -1; }
    return seL4_GetMR(0);
}

int main() {
    for (;;) {
        sos_syscall1(1, 'H');
        sos_syscall1(3, 60000000);
    }
    return 0;
}

I removed all the muslc stuff so it was just using libsel4 directly. This successfully prints H every minute.

The translation of this code to Rust is fairly trivial when targeting a no_std environment, assuming a suitable libsel4 to provide for the functions being called.
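As a sketch of what that translation might look like: no such libsel4 crate exists yet, so the names below are a hypothetical Rust mirror of the C API above, not working code.

```rust
#![no_std]
#![no_main]

// Hypothetical `libsel4` crate whose names mirror the C API used above.
use libsel4::{seL4_Call, seL4_GetMR, seL4_MessageInfo, seL4_SetMR, seL4_Word};

fn sos_syscall1(syscall_num: seL4_Word, a1: seL4_Word) -> seL4_Word {
    let tag = seL4_MessageInfo::new(0, 0, 0, 2);
    seL4_SetMR(0, syscall_num);
    seL4_SetMR(1, a1);
    let tag = seL4_Call(1, tag);
    if tag.get_length() != 1 {
        return seL4_Word::MAX; // the C version's `return -1` on an unsigned word
    }
    seL4_GetMR(0)
}

#[no_mangle]
extern "C" fn main() -> ! {
    loop {
        sos_syscall1(1, 'H' as seL4_Word);
        sos_syscall1(3, 60_000_000);
    }
}
```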

Given that I’m a Rust programmer coming to seL4, my preference would be that I should be able to put the Rust equivalent of that code in src/main.rs, add libsel4 = "*" to my Cargo.toml, type cargo build --release --target aarch64-unknown-none, copy the file to the NFS share, and run it. If we pretend we’re in an assumed-MCS world, I think this is a rather compelling scenario.


This (and this post in general) gets a big +1 from me.

A while ago I managed to get milestones 0 and 1 done, implemented in Rust, though with some pretty bad build hacks. Message me on Mattermost (@ben, sorry other Ben for pinging you before the edit) and I’d be happy to go through the code with you.

Since the last hangout, a lot of things have been brought up. I’m gonna try and distill this conversation as much as possible into key points, filtered down to the points most useful for such a hangout. I’ll also try to add some “discussion-provoking statements/questions/etc.”

The idea is to get the most out of this topic, without sucking the O2 out of the room for other discussions. I’ll add another comment when I’ve got something to show.


For the hangouts: HackMD - Collaborative Markdown Knowledge Base

Sorry it’s not much. Was hoping to put together something useful, but some things came up, and now I need to go to bed. Feel free to edit.

I am a contributor to the Robigalia project, which had libsel4-sys and rust-sel4 crates for old versions of seL4. We definitely struggled with the issues already identified in this thread:

  • cargo features monotonicity
  • build dependency on seL4 config
  • seL4 config affects ABI

In addition to these problems, we found maintaining our own libsel4-sys difficult, because we chose to duplicate all of the Python scripts in the seL4 build system and modify them to emit Rust, and keeping them in sync with the C versions was tedious.

I strongly believe a more complete seL4 IDL definition for procedurally generating a libsel4 would go a long way toward making a Rust libsel4 more maintainable, and might help solve these problems.

A solution to the above issues that we have been discussing, but haven’t implemented, is to generate a config-independent version of libsel4 that Rust projects can build against.

This libsel4 would contain all possible implementations/definitions for all seL4 configs, but would require relocations to be applied at runtime to work. Then, when building the seL4 kernel, an additional library with just the needed relocations could be built. Finally, at runtime there would need to be code that runs to apply the relocations.

[Edited to add]
Another possible benefit of having an IDL is that an idiomatic version of libsel4 could be generated directly, instead of having a separate -sys crate.


This also came out of the developer hangout call yesterday, and it is something we’d be interested in looking at more closely, especially if there are volunteers who would be able to spend some time on it.

A first step would be to collect what exactly is missing from the current XML definitions.

I think there would still be value in having a bottom-level -sys crate with functions that directly map to the relatively raw seL4 ABI, and a separate higher-level crate that has a more stable and idiomatic/typed interface based on that. Mostly because that idiomatic interface is likely not so easy to generate and will probably already need to make some minor policy decisions (e.g. to have or not have thread-local storage for the IPC buffer).

The -sys would in that case indicate bindings for the seL4 ABI, not bindings to a C FFI, but otherwise it’d be the same idea.

I have an idea for what I believe to be a strong approach for getting the seL4 kernel configuration into the Rust build and exposing it to Rust code. I’ve implemented this idea in IceCap as a demonstration. I hope that the following diff makes sense even without any familiarity with IceCap and its build system. Note that the diff is paginated by GitLab.

The idea is to embed the configuration as data into one low-level crate, which is imported by various other crates which use that data at build time. In this demo, that lowest-level configuration crate is called icecap-sel4-config-map. icecap-sel4-config-map/build.rs expects the path of sel4-config.json to be available as the value of an environment variable (SEL4_CONFIG). The icecap-sel4-config-static crate contains procedural macros which make use of icecap-sel4-config-map to evaluate #![cfg_sel4(...)] expressions at compile-time. The icecap-sel4-config-dynamic crate contains consts corresponding to the configuration, which are generated by icecap-sel4-config-dynamic/build.rs using icecap-sel4-config-map. icecap-sel4-config-{static,dynamic} are used by and re-exported by the main icecap-sel4 crate.

Here are some strengths of this approach:

  • The configuration is imported into the Rust build exactly once, so it is easy to account for changes using a single "cargo:rerun-if-changed={}".
  • The configuration flows throughout the build as code, which means we are aligned with the spirit of Cargo, rather than fighting it. This alignment is demonstrated by the fact that the implementation of this approach doesn’t require any Cargo subversion or other mischief.
  • Having a proc_macro crate which can refer to the configuration at macro execution phase is empowering. My idea for how to leverage this is the cute #![cfg_sel4(...)] syntax demonstrated below.

The new crates are in the following directories:

I haven’t fully integrated these new features throughout all of IceCap yet, but the following example of a root task demonstrates how they are used:

#![no_std]
#![no_main]
#![feature(proc_macro_hygiene)]

extern crate alloc;

use cfg_if_generic::cfg_if_generic;
use icecap_std::prelude::*;
use icecap_std::sel4::{cfg_sel4, config, BootInfo};

declare_root_main!(main);

#[cfg_sel4(not(KernelStackBits = "0"))]
fn main(_: BootInfo) -> Fallible<()> {
    debug_println!(
        "KernelRetypeFanOutLimit: {}",
        config::KernelRetypeFanOutLimit
    );
    cfg_if_generic! {
        cfg_sel4 if #[cfg(KernelArmDisableWFIWFETraps)] {
            debug_println!("KernelArmDisableWFIWFETraps");
        } else if #[cfg(any(not(KernelArmDisableWFIWFETraps), KernelNumPriorities = "0"))] {
            compile_error!("uh oh");
        }
    }
    Ok(())
}

I’ve done some experiments with replacing the Python scripts with Rust tools.
The gist of it was:

  • Have a library crate which does the parsing
  • The tools then link to that crate, for code generation
  • Both the tools and libraries get compiled for, and run on, the host system as part of the kernel build.

The hope was that by separating out xml parsing from code-gen, the code generation would be easier to maintain…

This was described in the thread I posted here, but I probably did a terrible job of explaining how it works in the thread I created for this (so I guess I should try again):

There is a small patch to the kernel build scripts that adds an empty kernel/external directory and calls add_subdirectory for each subdirectory of kernel/external (basically, each subdirectory is assumed to have a CMakeLists.txt).

The manifest links in the libraries/tools and a CMake script which depends on and runs them; this triggers everything to get built, then run.

With this setup it is entirely possible to link in the parsing library, swap in your own tools that depend on it, and replace the CMake script to run your own tools.

I wouldn’t say it is exactly straightforward, with the code generation, the CMake scripts calling the generators, and the manifests linking them spread out across repositories, but it does at least give a high-level library interface to the contents of the XML API which can be reused for various purposes.

It contains an existing port of the Python C code-gen, which could eventually be used with e.g. CI to compare the output against the Python-generated code.

The idea is to embed the configuration into one crate (which, in this demo, is called icecap-sel4-config-map). This is done by passing sel4-config.json to icecap-sel4-config-map/build.rs via the environment variable SEL4_CONFIG. The build.rs of icecap-sel4-config-dynamic uses this data to generate a module of consts corresponding to the configuration, and icecap-sel4-config-static contains attribute macros which use this to evaluate #![cfg_sel4(...)] expressions. The substance of these crates is used by and re-exported by the main icecap-sel4 crate.

I agree, this approach looks very promising. We’ve done something similar where the build.rs reads the config information and embeds the important parts as consts. I’m not literate enough in proc macros to have made it quite as nice as this though.

Is this Pre-RFC an appropriate place to discuss/work on this or should it be a separate thing?

I started looking into this a long time ago but haven’t spent time on it lately and didn’t make much progress, but I’d love to continue working on it.

I started prototyping what a complete IDL might look like by starting from the information that can most easily be pulled from the existing libsel4 C header files and XML files.

I made a tool that reads these files and generates a single XML IDL file: robigalia / sel4-idlgen · GitLab.

The resulting XML IDL contains definitions for all possible configs. To generate an interface for a specific seL4 config, the IDL must be evaluated against the config. I have an example program in that repo that does this evaluation and just prints out the result in human readable form.

I think it would be interesting to pull this information into a database and make a website or utility to do queries/analysis on what is in libsel4 and what the different configuration settings affect. I think this would be able to identify all configuration settings that affect the ABI. I haven’t done this yet.

We have also started working on prototyping a libsel4 generated from this IDL, although it is very early: it does not work and probably doesn’t compile right now, and there are major design changes to it that we’ve discussed and want to implement but haven’t done yet (or documented): robigalia / devbox · GitLab.

You might be right, but we’re curious if a single idiomatic, maintainable, and policy-free crate is possible, so we’re giving it a try 🙂

Thanks for posting the diff @nspin. Sorry if you explained this already, but is IceCap’s current approach to use bindgen to generate libsel4 bindings for a particular configuration? Am I correct in interpreting that this diff only adds a config library for accessing seL4 config options at build time using proc macros?

Does this approach essentially set up a path for a Rust seL4 library that mirrors the libsel4 approach, where there’s one source version of the library that supports a given seL4 source version, but it can be built for each distinct seL4 configuration, and provides an interface for dependent libraries or applications to do the same?
If you wanted to write a program that had a single binary version that could support multiple configurations for a single source version, would you still use this approach or would you use a different one?

That’s correct. IceCap’s icecap-sel4-sys crate currently uses bindgen to generate bindings to libsel4, and links against it using LibSel4FunctionAttributes = "public". This diff does not change that, and just focuses on introducing information about the seL4 kernel configuration into the Rust build.

At cargo build-time, the build.rs scripts generate code specific to the seL4 kernel configuration at hand, but that may not be what you’re asking about. Does the following fact answer your question? Unlike libsel4, the crate source is static, just like the crate source of openssl-sys is static. You could publish it to a registry. Any configuration-specific code is generated at cargo build-time in $OUT_DIR, just like in openssl-sys.

If I understand your question correctly, I think that the answer is no.

What do you mean by a single binary version that could support multiple configurations for a single source version?

Right, sorry, I should have been clearer. Currently, libsel4 essentially needs to be built in two phases: it is first “compiled” into a C source form and then it is “compiled” into a binary form. Because both stages depend on kernel config options, we currently need to regenerate the sources and also recompile the binaries for each config change to the kernel.

My understanding at the moment is that one of the advantages of the approach you are presenting is that, because Rust has better support for compile-time computation (proc macros), you don’t need the same first compilation stage as libsel4 and can have a single implementation of the library source that can be distributed as a source package. So my question was about whether the binary form of the library would still need to be recompiled for each seL4 configuration, which I think the answer to is yes.

What do you mean by a single binary version that could support multiple configurations for a single source version?

I mean: if you have a Rust binary that’s been built against an seL4 target, would you still need to recompile it if you changed the seL4 kernel binary to one with a different configuration? I assume the answer to this is yes.

Edit: I made some edits to try and be clearer.

Ah, I misunderstood your question! Apologies.

That is correct.

This is an interesting question. Depending on the range of configurations you’d want a single userspace binary to support, this could be difficult. If the range of variable configuration points was limited to those like KernelRetypeFanOutLimit, this wouldn’t be too hard, but if that range included points like KernelIsMCS which have significant effect on the ABI, that would be a different story.

It might be easier to address a more specific version of this idea. What kinds of ranges of variable configuration points do you have in mind?

I think it should be possible to get away with using cargo:rustc-cfg=KEY[="VALUE"] as output from build.rs as described in:

https://doc.rust-lang.org/cargo/reference/build-scripts.html#rustc-cfg

Is there some reason why that would be insufficient? I’m generally against using proc_macro crates when they aren’t required, since they generally increase build times and make things harder to understand, overall.

One reason is that cargo:rustc-cfg=KEY[="VALUE"] does not propagate to dependencies.