l4v CRefine Failures Caused by Kernel Code Modifications

To make it comply with certain industry software standards/guidelines, the seL4 kernel C code has to be slightly modified. In some cases, these modifications cause l4v failures at the CRefine stage.

Example:

  • The original invokeTCB_ReadRegisters has two return statements, one inside an if body and the other at the end of the function, which violates the “single point of exit at the end” guideline.
    ...
    if (e != EXCEPTION_NONE) {
        return e;
    }
    if (call) {
        ...
    }
    ...
    return EXCEPTION_NONE;

  • Modification: introduce a ret variable
    ...
    if (e != EXCEPTION_NONE) {
        ret = e;
    }
    if (call && e == EXCEPTION_NONE) {
        ...
    }
    if (e == EXCEPTION_NONE) {
        ...
        ret = EXCEPTION_NONE;
    }
    return ret;

This causes CRefine to fail at line 2024 of Tcb_C.thy, while proving lemma invokeTCB_ReadRegisters_ccorres. I suspect the new assignments to ret introduce new intermediate states that the original proof does not account for.
BTW, both versions generate identical assembly code (compared via objdump on kernel.elf).

Questions:

  1. Is there a different way to modify the function that avoids the l4v failure (if that is possible at all)?
  2. Is there a “best practice” for modifying kernel code (e.g., to conform to some industry standard) without breaking the l4v proofs?
  3. If the ccorres proof has to be updated, how? (Study proof engineering from scratch…?)
  4. Would the seL4 team be interested in developing standards-conforming, verified versions of the kernel?

It feels like you are raising a general question here: which of the “certain industry software standards/guidelines” should/must the source code actually comply with? It might make sense to work this out explicitly via the RFC process. I can see a certain demand here, because we have been running into this too, but we decided to exclude the kernel and apply all the rules to userland code only. The rationale is that the kernel source is out of scope, because a formally verified kernel binary is used in the end.

Hi Jack,

Since the objective is to change the control flow of the function, it is unlikely that there is any such change that will not break the proof. The proof needs to argue precisely why the current control flow is correct, so if you change that control flow, the argument will need to change with it.

There is no way to make non-trivial/structural changes to verified code without changing the proof. The proof argument usually follows the code structure very closely.

Your main options for making changes to verified seL4 are to learn the proof engineering yourself (as in your question 3), to hire someone who is already fluent in seL4 verification, or to contract the work out (for example to us at Proofcraft).

In all cases, the change still has to pass pull request review for the seL4 repository. Maintaining a separate, diverging version of seL4 would mean you could no longer call it seL4, according to the trademark rules of the Foundation.

I think I can say that the TSC will not want to maintain multiple different versions of the kernel for multiple different standards, and certainly not multiple different verifications for those versions. However, it might make sense to bring the main code base into compliance with one or two main standards if that is funded and brings a benefit. As Axel suggests, that is worthy of a wider discussion. If anyone has opinions on that, please do speak up.

In this specific case it looks like you are aiming to apply MISRA C. Are you able to give details on what the benefits are of applying that standard?

The usual argument is improved safety and code quality, but that argument does not carry much weight for a formally verified code base. The verification already provides a much higher degree of safety and direct evidence of correctness. In fact, I would argue the specific MISRA C rule above often (not always) degrades code quality, and does so here by obscuring control flow using state. I think compliance could be achieved without this specific pattern, still using a return variable; it is an example of how just following the standard does not always improve quality. I do realise that there is not always a choice, and some certification schemes simply require compliance with specific standards, but even in that case there is usually a question of whether the kernel code is in scope for that or not (as in Axel’s use case).

That said, if the changes to make the seL4 code base MISRA C compliant are relatively small and harmless (i.e. can be made without a reduction in code quality), and if there is funding available for the corresponding verification work, there might be no strong objection to doing that.

MISRA C 2012
Gerwin got it right.
The scope of the screening is a good point worth looking into.

A wider discussion would be great.

According to a Coverity MISRA check, the kernel currently has ~3000 “advisory” errors, ~4000 “required”, and 0 “mandatory”. I understand the kernel is already high-assurance, but for now (without much knowledge of the certification scheme) I’ll focus on the “required” ones. This already narrows the check down: the example above would be excluded, since it only violates an “advisory” rule.

Also, my example above generates identical kernel.elf disassembly before and after modification. Doesn’t that mean the two programs are the same? At the end of the day the machine executes the binary, not the C. Does this imply ccorres can sometimes be a bit too strict on the C implementation, and could l4v perhaps add a pre-check of binary diffs (just for the sake of showing equivalence)?

It does mean the behaviour of the two programs is the same, yes, so it is a good indicator that it should be possible to fix up the proof (as long as the C code is still in the supported subset; there might be situations where that subset conflicts with MISRA C, which would definitely be interesting to find out).

Yes, the proofs are stricter in that sense. The proofs relate the C code to the specification, and the binary to the C. So changing the C but not the binary can still break them, simply by invalidating the previous argument (not because the code is wrong, just because the argument no longer applies to the new code).

For a period you could sustain the reasoning that you have a proof for binary X, and that your changed C code still produces the same binary X, so everything is fine. But as soon as the original code and proof change, you have a binary Y, and with the new code plus your changes the compiler might no longer produce that same binary Y; then you are stuck (there is no guarantee that the compiler will always produce exactly the same code, e.g. it might reorder branches).

If that happens, there is no good way to repair that argument. If you instead change the proof, you will be able to find a new proof for the changes as long as they have the same behaviour and security properties.

We have in fact been discussing a binary-equality check for pull requests, to show that a change definitely hasn’t changed behaviour, and therefore that a proof should at least still be possible, or that the change was fine for unverified code. The check would not run l4v or preserve the proofs, though; it would just be an assessment tool.