I've been looking at the CLH big lock in the kernel. It uses
sel4_atomic_exchange to add threads to the queue of cores. On x86 this uses GCC’s
__atomic_exchange_n; on Arm (aarch32 and aarch64), however, it is hardcoded to use the load/store exclusive instructions.
I’m playing around with the lock on aarch64 and noticed that
__atomic_exchange_n is implemented on my toolchain (7.4.0). Even better, if I set
KernelArmMachFeatureModifiers to include
+lse, it generates code that uses the Armv8.1 Large System Extensions (LSE) instructions, which provide higher-performance atomics.
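
To make the idea concrete, here is a minimal sketch of a CLH-style enqueue built on __atomic_exchange_n. It is not the kernel's code: the names clh_node_t, clh_lock_t, clh_init, clh_acquire and clh_release are made up for illustration, and the memory orderings are my assumption of what a lock like this needs. The exchange on the tail pointer is the operation the compiler can lower to either a load/store exclusive loop or, with +lse, a single atomic swap instruction.

```c
#include <stddef.h>

/* Hypothetical types for illustration only; not the seL4 definitions. */
typedef struct clh_node {
    int locked;              /* 1 while the owner holds or waits for the lock */
} clh_node_t;

typedef struct clh_lock {
    clh_node_t *tail;        /* most recently enqueued node */
} clh_lock_t;

/* The queue starts with an unlocked dummy node as its tail. */
static inline void clh_init(clh_lock_t *lock, clh_node_t *dummy)
{
    dummy->locked = 0;
    lock->tail = dummy;
}

/* Enqueue `node`, then spin until the predecessor releases.
 * Returns the predecessor node so the caller can reuse it later. */
static inline clh_node_t *clh_acquire(clh_lock_t *lock, clh_node_t *node)
{
    __atomic_store_n(&node->locked, 1, __ATOMIC_RELAXED);

    /* Single atomic swap of the tail pointer: this is the enqueue step. */
    clh_node_t *prev = __atomic_exchange_n(&lock->tail, node, __ATOMIC_ACQ_REL);

    while (__atomic_load_n(&prev->locked, __ATOMIC_ACQUIRE)) {
        /* spin until the predecessor clears its flag */
    }
    return prev;
}

/* Release by clearing our own flag; the successor is spinning on it. */
static inline void clh_release(clh_node_t *node)
{
    __atomic_store_n(&node->locked, 0, __ATOMIC_RELEASE);
}
```

Each core would pass its own node to clh_acquire and, after clh_release, take over the returned predecessor node for its next acquisition, as in a textbook CLH lock. How the kernel actually manages its per-core nodes may differ.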
I’m happy to put up a PR, but I’m curious as to why the Arm port does not use
__atomic_exchange_n. Any ideas? cc @amirreza.zarrabi.