Which code is verified?

The verification sees the entire C code for one particular combination of configuration options. Currently this is the imx6 platform, Cortex A9 processor, ARMv7-a architecture, all other config options unset (in particular DEBUG, PROFILING, etc). Excluded from this C code is the machine interface and boot code, their behavior is an explicit assumption to the proof.

You can see the exact verification config options in l4v/spec/cspec/c/Makefile. The machine interface are the functions that correspond to the ones in the Haskell file Hardware.lhs.

You can further inspect the gory details by looking at the preprocessor output in the file kernel_all.c_pp in the proof build - this is what the prover, the proof engineer, and the compiler see after configuration is done. So a quick way of figuring out if something is in the proof input or not is checking if the contents of that file change if you make a change to the source you’re wondering about. You don’t need the prover for this, and only parts of the seL4 build environment setup.

The top-level proof makes statements about the behaviour of all of the kernel entry points, which we enumerate once manually in the proof. The prover reads in these entry points, and anything that they call must either have a proof or an assumption for it to complete its proof. If anything is missing, the proof fails.

That means all of the C code that is in this kernel_all.c_pp file either:

  • has a proof,
  • or has an explicit assumption about it,
  • or is not part of the kernel (i.e. is never called)

The functions with explicit assumptions are the machine interface functions mentioned above (they’re usually inline asm) and the functions that are only called by the boot process (usually marked with the BOOT_CODE macro in the source so they’re easy to spot).

As an example, the CPU and architecture options mean that everything under src/arch/ia32 is not covered by the proof, but that the files in src/kernel/object are.

1 Like

Since the time we wrote this, two further architectures are now supported for verification: Intel x64 (called X64 in the proofs) and the hypervisor extensions of ARMv7 (called ARM-HYP in the proofs). The original ARMv7 proof is called just “ARM” in the proofs, and there is a fourth architecture in progress, RISCV64.

The description Anna posted above for how to figure out which code is included in the verification works for all of these architectures the same way. Before you start the procedure and generate the kernel_all.c_pp file, you can set the variable L4V_ARCH to the architecture value {ARM, ARM_HYP, X64} to get the respective configuration.

Note that while the ARM configuration has the full proof stack including functional correctness, security theorems, system initialiser, and binary verification, the other architectures so far only include functional correctness. Functional correctness is the hardest and the core part of the proof, but having the other properties actually proved for the other architectures would of course be even nicer.

I might add a distinction.

Everything that goes into one of the verified builds (the kernel_all.c_pp files for the various verification targets) is processed by the verification toolchain and needs to be in the verification C subset. There are also some additional constraints for binary verification to work.

However, not all that code is verified. There’s a lot of parts of kernel init that are processed by the verification toolchain but nothing is proven about them. There’s also a more complicated pattern involving some of the platform-specific code, where the verification stops typically stops at some low-level function whose correctness is assumed. A lot of these assumptions are written out in, for instance, l4v/proof/crefine/ARM/Machine_C.thy .

Agreed – that is what was meant by "all of the C code that is in this kernel_all.c_pp file either:

  • has a proof,
  • or has an explicit assumption about it,
  • or is not part of the kernel (i.e. is never called)"

in Anna’s post. The file you mentioned (l4v/proof/crefine/ARM/Machine_C.thy) is probably the best place to get the list of machine assumptions. The other one is l4v/proof/crefine/ARM/Init_C.thy, which states the boot assumption.