Performance comparison with non-L4 kernels

Hello everyone! I am a member of the software team at a commercial vehicle OEM. My boss is very interested in seL4’s performance compared to other industry-deployed kernels like QNX, various RTOSes, and even Linux, in a fair setup.

Are there existing evaluation results? I found a reasonably fair comparison between sDDF and Linux in “Design, Implementation and Evaluation of the seL4 Device Driver Framework”, but I couldn’t find anything richer.

That depends on what you consider fair. sDDF uses lwIP, so the benchmark you linked to is more a comparison between lwIP and Linux’s IP stack than between seL4 and Linux directly (assuming the main CPU usage happens in the IP stack and not in context switching).

That in itself is not a fair comparison: Linux is optimised for functionality and scalability across the number of sockets, clients, and CPU cores, while lwIP is single-threaded and does not scale in the number of interfaces, sockets, and clients. sDDF works around that by running multiple lwIP instances, each with its own MAC address.

The benchmark also only tests throughput for MTU-sized packets, which is usually not the interesting load for your kind of system; systems like yours usually care more about small-packet performance. To be fair, when measuring small-packets-per-second performance, the same setups would probably give an even greater advantage to seL4.
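To put rough numbers on why small packets are the hard case (this is standard Ethernet framing arithmetic, not from the benchmark): a 1 Gbit/s link carries about 81k MTU-sized packets per second but almost 1.5M minimum-sized ones, so per-packet costs dominate in the small-packet case. A quick sketch in C:

```c
#include <stdio.h>

/* Packets/s at 1 Gbit/s line rate. Each frame carries 24 bytes of fixed
 * on-wire overhead: 7 B preamble + 1 B SFD + 12 B inter-frame gap + 4 B FCS. */
int main(void)
{
    const double link_bps  = 1e9;
    const double overhead  = 24.0;               /* bytes per frame        */
    const double mtu_frame = 1514.0 + overhead;  /* 1500 B payload frame   */
    const double min_frame = 60.0 + overhead;    /* 64 B frame incl. FCS   */

    printf("MTU-sized: %.0f pps\n", link_bps / 8.0 / mtu_frame); /* ~81k   */
    printf("min-sized: %.0f pps\n", link_bps / 8.0 / min_frame); /* ~1.49M */
    return 0;
}
```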

So the question is how much effort you are willing to spend optimising a system, and what your system requirements are. seL4 will, generally speaking, require much more effort to get things working: it is closer to embedded software development than to normal Linux user-space development, and seL4 adds its own complexity. If you can use Microkit/LionsOS, then I highly recommend trying that first, as it hides most of the complexity from the user, at the price of limiting the possible system designs.

At my previous job, where we built critical voice communication systems, I wrote our own IP stack for seL4 to get the scalability and performance that was required (sDDF and lwIP have unnecessary overhead and don’t scale, but they didn’t exist yet anyway). It was tailored to the system requirements we had: it only did IPv4 and UDP, for instance, though offloading TCP to lwIP would be trivial. The goal was handling hundreds of RTP streams for hundreds of processes on a relatively slow 1 GHz, 4-core SoC (the target was 500 duplex streams, but how many are actually possible depends on what else needs to be done besides the RTP part).
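To put that target in perspective, here is a back-of-the-envelope cycle budget; the 20 ms packetisation interval is my assumption (typical for voice RTP), not something stated above:

```c
#include <stdio.h>

int main(void)
{
    const int streams = 500;    /* duplex RTP streams (the stated goal)   */
    const int pps_dir = 50;     /* packets/s per direction at 20 ms ptime */
    const int total_pps = streams * pps_dir * 2;   /* RX + TX combined    */
    const double cycles_per_s = 4 * 1.0e9;         /* 4 cores at 1 GHz    */

    printf("%d pps -> ~%.0f CPU cycles per packet\n",
           total_pps, cycles_per_s / total_pps);   /* ~80k cycles/packet  */
    return 0;
}
```

80k cycles per packet sounds comfortable, but that budget also has to cover the actual application work, which is why per-packet stack overhead matters so much here.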

But Linux was not an option, because it doesn’t have the assurance level we needed. QNX and VxWorks were, let’s say, somewhat expensive for a small company, but the main problem was that they didn’t have an IP stack certified to the assurance level we needed, and IP is a single point of failure, so it would have been pouring money down the drain for nothing. That was a big motivation to write my own stack. Microkit didn’t exist back then, but its limit of ~60 tasks would have been a showstopper.

I wrote a thin abstraction library that is ported to both Linux and seL4, so exactly the same software can run on either kernel on the same hardware. I would consider this a fair comparison. Running the same load on Linux that the optimised seL4 software handles brings the system to its knees. I didn’t measure exactly, but it’s probably around a factor of 4 between what Linux can handle and what the seL4 system can.
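For illustration, a portability shim like that could look something like the following; this is my own hypothetical sketch of the idea, not the poster’s actual library:

```c
/* os_chan.h - the small API the application is written against. */
#include <stddef.h>

typedef struct os_chan os_chan_t;          /* opaque, backend-specific    */

os_chan_t *os_chan_open(const char *name); /* connect to a named peer     */
int  os_chan_send(os_chan_t *c, const void *buf, size_t len);
int  os_chan_recv(os_chan_t *c, void *buf, size_t len);  /* blocking      */
void os_chan_close(os_chan_t *c);

/* A Linux backend might implement these over UNIX datagram sockets or
 * shared memory plus eventfd; an seL4 backend would use shared-memory
 * rings plus notifications (seL4_Signal/seL4_Wait). The application
 * links against one backend or the other unchanged, which is what makes
 * the measurement apples-to-apples. */
```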

If you have the right system design, then performance on seL4 will always be better than on Linux. I think Linux can maybe only be faster for highly multithreaded workloads with fancy NICs. But if you don’t care about memory separation, why bother using seL4? Just use a simple RTOS with even lower overhead. Compared to other RTOSes like QNX etc., I have no idea; it will depend on your workload, on which IP stacks you’re comparing, and on how optimised the software is.

One reason to use seL4 back then was that it supports SMP. But keep in mind that the SMP configuration is not (mathematically) verified, so currently you do not get the same assurance level as the single-core version. However, that will change somewhat soon with multikernel, which is basically running one instance of the non-SMP kernel on each CPU core (this requires user space to handle cross-core communication via shared memory and IPIs). Once seL4 and Microkit have support for that, it should be transparent to the user and easy to use. If you’re not using Microkit, it should still be relatively easy to port your SMP code to multikernel without too much hassle.
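To make the shared-memory-plus-IPI idea concrete, cross-core communication in user space could look roughly like the following single-producer/single-consumer ring. This is a generic sketch of the concept, not an actual seL4 or Microkit API:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define SLOTS   64          /* power of two so index wraparound is cheap */
#define MSG_SZ  64

/* Lives in memory mapped into tasks on both kernel instances. */
struct xcore_ring {
    _Atomic uint32_t head;  /* advanced by the producer core */
    _Atomic uint32_t tail;  /* advanced by the consumer core */
    uint8_t msg[SLOTS][MSG_SZ];
};

/* Producer side. 'doorbell' stands in for whatever IPI/notification
 * primitive the multikernel design ends up exposing to user space. */
static int xcore_send(struct xcore_ring *r, const void *m,
                      void (*doorbell)(void))
{
    uint32_t h = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t t = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (h - t == SLOTS)
        return -1;                                  /* ring full           */
    memcpy(r->msg[h % SLOTS], m, MSG_SZ);
    atomic_store_explicit(&r->head, h + 1, memory_order_release);
    doorbell();                                     /* kick the other core */
    return 0;
}
```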

How fair/apples-to-apples of a comparison is this LionsOS benchmark?

That was a big motivation to write my own stack. Microkit didn’t exist back then, but its limit of ~60 tasks would have been a showstopper.

Does this limit represent the state of Microkit’s development or is this inherent to the design?

One reason to use seL4 back then was that it supports SMP. But keep in mind that the SMP configuration is not (mathematically) verified, so currently you do not get the same assurance level as the single-core version.

Isn’t that kinda selling it short? There are no formally verified SMP kernels, but it’s still considered a high-assurance codebase, correct? What level of assurance would the current SMP codebase qualify for in EAL land?

LionsOS is built on top of Microkit + sDDF, so everything I said above applies equally to Microkit and LionsOS.

Does this limit represent the state of Microkit’s development or is this inherent to the design?

To clarify, strictly speaking the limit is not on the number of tasks the system can have, but on the number of channels per task. As the documentation says:

“The limitation on the number of channels for a protection domain is based on the size of the notification word in seL4. Changing this to be larger than 64 would most likely require changes to seL4.”

Theoretically you can work around the limitation by splitting servers into multiple co-operating ones, so in that sense it’s not an inherent limitation, only a limitation on the system designs that are possible.

If they added badgeless IPC calls, this would be easier to work around: the limit comes from the badge value, but servers don’t always need to know which client is calling them.
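To see why 64 is the magic number: each channel of a protection domain is assigned one bit of the badge, and seL4 ORs the badges of all pending signals into a single 64-bit notification word that the receiver fans back out. A sketch of the decoding idea (my illustration, not Microkit’s actual source):

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the per-channel handler a Microkit protection domain
 * implements (the real signature is void notified(microkit_channel ch)). */
static void notified(unsigned ch) { printf("channel %u signalled\n", ch); }

/* Several signals may coalesce into one word; one bit per channel means
 * a PD can distinguish at most 64 channels this way. */
static void handle_notification(uint64_t word)
{
    while (word) {
        unsigned ch = (unsigned)__builtin_ctzll(word); /* lowest set bit */
        word &= word - 1;                              /* clear that bit */
        notified(ch);
    }
}

int main(void)
{
    handle_notification((1ULL << 3) | (1ULL << 42));   /* channels 3 and 42 */
    return 0;
}
```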

Isn’t that kinda selling it short? There are no formally verified SMP kernels, but it’s still considered a high-assurance codebase, correct? What level of assurance would the current SMP codebase qualify for in EAL land?

The normal, verified kernel could easily be EAL7, if someone bothered to get it certified. The SMP codebase breaks assumptions that the proofs rely on, so it would rate much lower: EAL6 at most, and even that would still be difficult to achieve.

The normal, verified kernel could easily be EAL7, if someone bothered to get it certified. The SMP codebase breaks assumptions that the proofs rely on, so it would rate much lower: EAL6 at most, and even that would still be difficult to achieve.

Considering that the most any traditional operating system can hope to achieve is EAL4, you are DEFINITELY selling unverified seL4 short! AFAICT it should qualify for EAL5 no problem, and EAL6 would be achievable if it weren’t cheaper to skip it and go for EAL7 directly.

I find this to be a VERY frustrating part of seL4’s communications: y’all hold yourselves to a higher bar than anyone else on the planet. Sure, there are high-level proofs for some crypto libraries, but is there any other production system that extends the proofs to the binary level? AFAIK there are no API changes anticipated from verification, and thus the unverified stack would qualify as a high-assurance 1.0 by any other measure.

That’s not how Common Criteria works.

First of all, no new kernel can hope to actually be certified under Common Criteria as a kernel at EAL > 4, because the separation kernel protection profile that would have applied has been discontinued. You could probably still certify a kernel as part of a product, but the certification would apply to the product as a whole.

We really need to make a distinction between the assurance a certification scheme is trying to achieve and actually certifying according to the scheme. E.g., EAL5 would require semi-formal models and testing that do not exist for the SMP config of seL4. Could it be done? Sure. Does anybody want to pay for it? Let me know, we’ll be happy to help. The certification cost is dominated by the documentation and procedural side, not by the proofs (at least for seL4). EAL7 would require formal models and a formal design proof, which do not exist for the SMP version and would require significant new research to achieve. This is one reason the multikernel configuration of seL4 is on the verification roadmap, and not SMP: much more of the unicore proof can be made to apply.

Even EAL4, which is entirely informal and is achieved by various mainstream kernels, is not a cheap exercise. It also doesn’t get you much assurance.

Does seL4/SMP have more assurance than a traditional EAL5 certification would provide? Maybe, maybe not. The truth is that there is no basis for comparison. The formal proofs do not apply to the SMP version, and there have definitely been serious bugs in the SMP version in the past. I don’t think there are currently any correctness-critical ones that we know about, but that does not mean much. Actual EAL5 certification would require a more rigorous testing regime than we currently have for seL4, and since the proofs do not apply, you cannot use them to replace testing.

You could argue that the defect density in seL4/SMP should be lower than in mainstream kernels, because there should be no defects in sequential code that is not relevant to concurrent execution. That is very likely true. But that’s pretty much all you can say. It most definitely does not qualify for EAL7 or achieve assurance similar to EAL7. I would not currently call seL4/SMP a high-assurance code base (i.e. something you would run a nuclear reactor on). High-quality, yes. High-assurance, no.

The story is different for the configurations of seL4 that the verification does apply to. They surpass the actual assurance that would be gained by going through an EAL7 process. When the seL4 multikernel configuration is verified, it will as well.

Of course we do. One of the things we hold ourselves to is not making wrong promises.

We are not competing with kernels like Linux or Windows in terms of assurance. We are competing with the kinds of kernels that run planes, nuclear reactors, etc. Some commercial separation kernels achieved EAL6 or EAL6+ (basically EAL7 in terms of formal verification, but not all parts of EAL7 in other aspects of the scheme) when the separation kernel protection profile was still in use. As far as I am aware, these are all unicore kernels, and all simpler and smaller than seL4. They do have formal models and formal model-level proofs. AFAIK, they do not have code-level proofs. They also have other certifications and have gone through fairly deep scrutiny.

I don’t know to what level the SMP/multicore versions of these kernels, if they exist, are certified. Not EAL6+, but maybe at reasonable levels of safety schemes like DO-178C. That would be interesting to find out, but it tends not to be easy, because their marketing tends to obscure such details. That is something I do not want to happen with seL4.

There used to be a research kernel with verified multicore support, somewhat similar to what seL4/multikernel does. It was not suitable for serious application, but it is not true that no other kernel exists that has a code-level multicore proof.

In any case, seL4/SMP is not the future for multicore. The verification future for multicore is the multikernel config, which also scales much better to larger numbers of cores.