Prevent idle workers in the work stealing policy #1047
robgjansen merged 3 commits into shadow:master from robgjansen:event-sched-bug
Conversation
I'm running a before/after Tor simulation on this to understand the performance implications, but the code could be reviewed in the meantime.
sporksmith left a comment
Approving since this looks correct AFAICT, though I have some suggestions for clarity/maintainability.
src/main/core/scheduler/scheduler.c
Outdated
```c
if (scheduler->policyType != SP_SERIAL_GLOBAL) {
    /* when everyone finishes starting the hosts, the main thread can prepare
     * round 1 */
    countdownlatch_countDownAwait(scheduler->finishBarrier);
```
I was a bit confused both about reusing finishBarrier here, and why wait on prepareRoundBarrier below doesn't already accomplish what you want.
I see now that prepareRoundBarrier is actually waited on at the end of a round in scheduler_continueNextRound. Would moving it to the start of a round solve the bug without having to (re)use another barrier here, as well as being more consistent with its name?
I had first added a new barrier, but then decided to re-use the finishBarrier since it wasn't needed at the start. But I can see how that could lead to issues later, so I will go back to using a separate one.
The prepareRoundBarrier needs to be at the end, because the state that is updated while the workers are waiting at the barrier (i.e., the clock) is what the workers use to break out of their running while loop. If I moved prepareRoundBarrier to the beginning, then we would need an extra check and break statement, which is certainly possible but adds more lines of code. (Perhaps transitionRoundBarrier is a better name.)
Ultimately, I'm not really trying to refactor this admittedly ugly code right now, just trying to minimize the changes needed to get to a correct state. That being said, I do want to rewrite the scheduler in rust (in the dev branch) at some point, because (i) I want to implement a continuous round scheduler to replace the current discrete round scheduler, and (ii) I think we can design a more efficient scheduler/scheduling policy now that we don't have to worry about migrating plugin thread-local storage anymore. But this is all for another month :)
```c
g_rw_lock_reader_unlock(&data->lock);

/* make sure the workload has been updated for this round */
while (!__atomic_load_n(&stolenTdata->isStealable, __ATOMIC_ACQUIRE)) {
```
I think this looks correct, though it could potentially spin more than we'd like, and in general using atomic operations outside of the implementation of higher-level sync primitives is a bit obscure. Maybe a barrier could work instead of a raw atomic?
(Or if spinning actually is what you want here, maybe a spin lock?)
I don't think a pthread_barrier will work here, because I want the bool functionality - I only want the threads waiting for it to turn true to wait, but not the thread that sets it to true. I think pthread_barrier requires all threads that decrement the count to wait until it reaches 0.
I was planning to use a CountDownLatch but then got excited when I realized I could do it with an atomic with hopefully less synchronization overhead. (The countdown latch uses both a mutex and a condition variable.) I'm using the atomic to prevent a race condition so I don't expect the wait will be very long, so I think spinning is OK.
pthread_spin_lock could work, but it's a bit kludgy. If we are using N workers then it's possible that N-1 of them reach this point. I just want the N-1 workers to check/wait for the boolean to become true. If I use a spin lock, I would need to use the unlocked state to represent true, meaning that the N-1 workers would acquire and immediately release the lock every time they want to check if they can steal from another worker. For N workers, this happens N^2 times per round. The atomic seems simpler.
Thoughts?
> I don't think a pthread_barrier will work here
> ...
> I was planning to use a CountDownLatch
Oops, yeah CountDownLatch was what I meant.
> but then got excited when I realized I could do it with an atomic with hopefully less synchronization overhead. (The countdown latch uses both a mutex and a condition variable.) I'm using the atomic to prevent a race condition so I don't expect the wait will be very long, so I think spinning is OK.
I think the fast path without contention would be the ~same? But yeah, the main practical benefit would be avoiding a potential infinite spin. (Re)reading your comment, it sounds like spinning is probably OK.
> pthread_spin_lock could work, but it's a bit kludgy
Hmm, I see what you mean.
Maybe a semaphore (or even Ryan's spinning semaphore) would be a more appropriate abstraction?
Anyway, I'm fine with leaving it as-is. It makes the code a little more opaque to directly use these low-level atomic functions here, but it's not a big deal.
OK, I'm leaving as-is (Ryan's spinning semaphore is a great idea but only available in dev branch atm).
This fixes two main issues:
Closes #1046