I have realised that our OpenMP benchmarks fail with high intra-node thread counts due to execution integrity issues.
To reproduce the issue, compile the kernels-omp/dgemm kernel and run the following:
inv run.pool kernels-omp dgemm --cmdline '4 2 1024 32'
It should fail most of the time, complaining about execution integrity issues.
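Since the failure is intermittent, it may help to run the command several times and count how often it fails. The following is only a minimal sketch: the helper name `count_failures` is hypothetical, and it assumes that a failed run exits with a non-zero status, which I have not verified for this task.

```python
import shlex
import subprocess

# Command copied verbatim from the reproduction step above.
REPRO_CMD = "inv run.pool kernels-omp dgemm --cmdline '4 2 1024 32'"


def count_failures(cmd, runs):
    """Run `cmd` (a list of arguments) `runs` times and return the number of failed runs."""
    failures = 0
    for _ in range(runs):
        # ASSUMPTION: a failed run is signalled by a non-zero exit status.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

For example, `count_failures(shlex.split(REPRO_CMD), 10)` would run the reproduction command ten times (an arbitrary number) and return how many of those runs failed.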