Spinning in the scheduler #2877
Description
In #2866 we found that if the worker threads spin while waiting for a new task rather than block, the runtime performance on the benchmark runner improves significantly. We should investigate this further, but should also test on systems other than the benchmark runner to make sure that the simulation performance doesn't become worse on other hardware platforms.
I've tried a few different spinning configurations:
1: Spin 100 times before blocking
Results: https://github.com/shadow/benchmark-results/tree/master/tgen/2023-04-13-T21-16-06
Shadow: dd651d8 (diff: 0fe3be4...dd651d8)
A Rust mutex spins 100 times when locking, so I wanted to try that here. But in a tgen simulation, there's no performance improvement.
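For reference, the "spin N times, then block" idea can be sketched roughly like this. This is a minimal standalone illustration, not Shadow's actual scheduler code: the `Signal` type, its fields, and the `max_spins` parameter are all hypothetical names, and a `Condvar` stands in for whatever blocking primitive the scheduler really uses.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Condvar, Mutex};

/// Hypothetical "a task is available" signal (an assumption for illustration,
/// not a type from Shadow).
struct Signal {
    ready: AtomicBool,
    lock: Mutex<()>,
    cond: Condvar,
}

impl Signal {
    fn new() -> Self {
        Self {
            ready: AtomicBool::new(false),
            lock: Mutex::new(()),
            cond: Condvar::new(),
        }
    }

    /// Spin up to `max_spins` times, then fall back to blocking.
    /// Returns true if the flag was observed during the spin phase.
    fn wait(&self, max_spins: u64) -> bool {
        for _ in 0..max_spins {
            if self.ready.load(Ordering::Acquire) {
                return true;
            }
        }
        // Spin budget exhausted: block until notified.
        let mut guard = self.lock.lock().unwrap();
        while !self.ready.load(Ordering::Acquire) {
            guard = self.cond.wait(guard).unwrap();
        }
        false
    }

    fn notify(&self) {
        // Hold the lock while setting the flag so a waiter cannot miss the
        // wakeup between its check and its call to `cond.wait`.
        let _guard = self.lock.lock().unwrap();
        self.ready.store(true, Ordering::Release);
        self.cond.notify_all();
    }
}
```

In this sketch, `wait(100)` would correspond to configuration 1, and `wait(u64::MAX)` to spinning (effectively) indefinitely.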
2: Spin u64::MAX times before blocking
Results: https://github.com/shadow/benchmark-results/tree/master/tgen/2023-04-14-T02-21-54
Shadow: d0c2f96 (diff: 0fe3be4...d0c2f96)
Using u64::MAX to spin indefinitely gives a large performance improvement (the same improvement we saw in #2866).
3: Spin u64::MAX times before blocking and using std::hint::spin_loop()
Results: https://github.com/shadow/benchmark-results/tree/master/tgen/2023-04-14-T15-10-23
Shadow: 2017ac1 (diff: 0fe3be4...2017ac1)
Like the previous version we spin u64::MAX times, but tell the CPU that it's a spin loop using std::hint::spin_loop(). We see the same performance improvement, but this may be better for energy efficiency.
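A spin loop with the CPU hint might look something like the following. Again this is only a sketch: `spin_until_set` and the `flag` argument are made-up names, and the spin counter is there just to make the behaviour observable.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};

/// Busy-wait until `flag` becomes true, returning how many times we spun.
/// (Hypothetical helper for illustration only.)
fn spin_until_set(flag: &AtomicBool) -> u64 {
    let mut spins = 0u64;
    while !flag.load(Ordering::Acquire) {
        // Tells the CPU we're in a busy-wait loop (e.g. a PAUSE instruction
        // on x86). The thread stays on the core; the OS is not involved.
        hint::spin_loop();
        spins = spins.wrapping_add(1);
    }
    spins
}
```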
4: Spin u64::MAX times before blocking and using std::thread::yield_now()
Results: https://github.com/shadow/benchmark-results/tree/master/tgen/2023-04-14-T17-39-40
Shadow: 1cc31a0 (diff: 0fe3be4...1cc31a0)
Like the previous version we spin u64::MAX times, but use std::thread::yield_now() instead of std::hint::spin_loop(). We see an even bigger performance improvement.
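The yielding variant differs only in what happens between checks. A rough sketch, with `yield_until_set` being a hypothetical name rather than anything in Shadow:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Poll `flag`, yielding the rest of the timeslice to the OS scheduler
/// between checks. (Hypothetical helper for illustration only.)
fn yield_until_set(flag: &AtomicBool) -> u64 {
    let mut yields = 0u64;
    while !flag.load(Ordering::Acquire) {
        // Unlike `std::hint::spin_loop()`, this goes through the OS
        // scheduler, so other runnable threads can use the core.
        thread::yield_now();
        yields = yields.wrapping_add(1);
    }
    yields
}
```

One possible reason this behaves differently from a pure spin is that the waiting thread gives up its core to other runnable threads, but the benchmarks above don't by themselves confirm that explanation.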
5: Don't spin, but use a futex_wait timeout of 1 us
Results: https://github.com/shadow/benchmark-results/tree/master/tgen/2023-04-16-T03-05-38
Shadow: 6414048 (diff: 0fe3be4...6414048)
This had a small performance improvement, but not nearly as much as the other approaches.
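To show the shape of the "block with a short timeout" approach without pulling in a raw futex call, here is a sketch that uses `std::thread::park_timeout` as a stand-in for `futex_wait` with a timeout. That substitution is an assumption made to keep the example std-only; the function name and the wakeup counter are hypothetical too.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

/// Block in short timed waits until `flag` becomes true, returning how many
/// timed waits were needed. `park_timeout` is a stand-in for a futex wait
/// with a timeout, not the call used in the experiment.
fn wait_with_timeout(flag: &AtomicBool, timeout: Duration) -> u64 {
    let mut waits = 0u64;
    while !flag.load(Ordering::Acquire) {
        // May return early (spuriously) or after roughly `timeout`;
        // either way we re-check the flag.
        thread::park_timeout(timeout);
        waits += 1;
    }
    waits
}
```

With `Duration::from_micros(1)` this would mirror the 1 us timeout of configuration 5, though the real wait is likely to take longer than 1 us because of syscall and timer overhead.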
On the benchmark runner, it seems that spinning indefinitely with std::thread::yield_now() has the best performance.




