Hi seL4 friends, I'd like to share an experimental project on seL4 microbenchmarking, inspired by my experience with the sel4bench suite. Your thoughts on the concept and its practical use would be greatly appreciated.
Measuring the execution times of short code paths, such as those encountered in seL4 microbenchmarking, is a good way to immerse oneself in the domain of compiler and microarchitectural optimizations.
Implementing the “early processing” measurement methodology in the sel4bench suite (for the Notification delivery case, as a trial step; PR #26) was an amazing experience in that respect, so I gladly share the highlights of that work to explain my motivation.
First, the latencies of some microarchitectural events can be comparable to the execution time of the measured code path. So if a microarchitectural event not anticipated by the benchmark design occurs during the measurement window, the deviation of the observed value increases considerably.
Second, both kinds of optimizations, compiler and microarchitectural, applied to the instrumentation code cause the measured values to vary from one measurement iteration to the next.
Third, debugging the instrumentation code is very challenging: injecting diagnostic code into it changes the effects of both types of optimizations, so the initially witnessed symptoms effectively transform into different ones.
When you consider how to control these optimization effects, the situation appears uninspiring, given the need to account for a variety of microarchitectures and optimization features.
As a result, I was left with the question: how can a benchmark design be distanced from the need to control those optimization effects in the instrumentation code that impair repeatability? Eventually, this reduced to the foundational question: what do we measure, and why?
This experimental work is an attempt to find and try out a solution to that challenge: designing a more predictable measurement context.
The work is published on GitHub:
Project manifest repository
Tables of data collected on four TS platforms: files KBenchTable-xxxx.pdf in