I’m glad I did, it’s really quite fascinating. To set the stage a little, L4 is a microkernel architecture with various implementations. Last time I looked it was mostly used by academics and was not very mature, but it seems to have moved on – for example, L4Linux is a version of it with Linux hosted on top, providing benefits like driver isolation.
seL4 is a variant of L4 that is based around a capability architecture (I confess I don’t know enough about L4 to understand how one variant can be capability-based without them all being, pointers welcome), which would be interesting in itself. But far more interesting than that – seL4 has been formally verified.
The approach is somewhat convoluted. First of all, they write a prototype of the kernel in Haskell, including all of the low-level hardware control (note that because this is a microkernel, this does not include most of the device drivers). This prototype is then linked to an emulator derived from QEMU, which allows user code to run in a binary compatible simulated environment. When user code calls the kernel, the emulator instead branches into the Haskell code, runs it, and then returns to the user code. This in itself would be pretty cool, but there’s more.
Next, there’s the abstract specification of the kernel. This describes the binary layout of arguments to system calls, the effect of each system call and of interrupts and traps (e.g. page faults), without describing the implementation. From the Haskell, they generate what they call an executable specification. This is done by an automatic translation from the subset of Haskell they use into a theorem proving language. They then prove that the executable specification is a refinement of the abstract specification. Which means, informally, that everything that is true of the abstract specification is also true of the executable specification. In other words, the executable specification implements what the abstract specification specifies.
However, we’re not there yet. The real kernel is written in C, by hand. This is required if you want a kernel that is appropriately optimised – automatic generation of C code from the Haskell would be possible, but unlikely to run fast. The C implementation is then verified by automatically translating it into the theorem proving language once more, and showing that this is a refinement of the executable specification. This automatic translation is another remarkable achievement, though they do not claim to be able to translate arbitrary C: it must be written to conform with some rules – for example, no passing addresses of local variables to functions. Clearly these rules are not too onerous: they manged to write a whole kernel that conformed to them.
Note that this does not rely on the correctness of the Haskell code! Since that is only used to generate the executable specification, which stands between the abstract specification and the C implementation, all we really care about is the correctness of the executable specification, which is proved. Although this strongly implies that the Haskell code is correct, it is in no way relied upon.
So, the end result of all this is a number of fun things.
- Most importantly, a kernel written in C that has a formal proof of correctness.
- A Haskell prototyping environment for the kernel (I guess I’d like this part even more if I could figure out how to learn Haskell without sitting next to a Haskell guru).
- The tools needed to adopt this approach for other code.
- A nice write-up of the whole thing.
- Detailed costs for doing this kind of proof, and the costs of making changes. See the paper for more, but the big picture is the whole thing took 20 man-years, of which 9 were development of (presumably reusable) tools and 11 were the formal seL4 proof itself.