Insert yields into the interpreter to improve fairness #5820
This PR adds a counter to execution of the interpreter. Each time a function application occurs (on either the fast or the slow path), the counter is decremented. When it reaches 0, the interpreter calls `Control.Concurrent.yield` before continuing. The counter starts at 5,000.

The goal is to avoid certain thread starvation scenarios. GHC's thread preemption is based on memory allocation. If a thread is executing code that does not allocate any memory, the RTS has no way of preempting it. It's actually possible to write loops in Unison that cause very little memory to be allocated, meaning that such threads could cause other threads to starve.
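As a rough illustration of the counter mechanism described above (not the code in this PR), here is a minimal standalone sketch. The names `yieldBudget`, `tick`, and `applyN` are hypothetical, and the real interpreter threads its counter through its own evaluation loop rather than through a toy loop like this one. The only library call used is `yield` from `Control.Concurrent`.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Concurrent (yield)

-- Hypothetical name; the value matches the 5,000 starting point described above.
yieldBudget :: Int
yieldBudget = 5000

-- Called once per function application. Decrements the counter; when the
-- decrement would bring it to 0, yields to other threads and resets it.
tick :: Int -> IO Int
tick !n
  | n <= 1    = yield >> pure yieldBudget
  | otherwise = pure (n - 1)

-- Toy stand-in for an interpreter loop: applies f k times, ticking on each
-- application so that a long-running loop still yields periodically.
applyN :: Int -> (Int -> Int) -> Int -> IO Int
applyN k f = go yieldBudget k
  where
    go !_ 0 !acc = pure acc
    go !c i !acc = do
      c' <- tick c
      go c' (i - 1) (f acc)

main :: IO ()
main = applyN 12000 (+ 1) 0 >>= print  -- prints 12000
```

Threading the counter as a strict `Int` argument, rather than keeping it in a mutable reference, is one way to keep the bookkeeping cheap; whether the PR does it this way is an assumption of this sketch.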
The counter itself seems to be free (actually, the altered code seems to be optimized better, making numeric code faster), so the only performance cost is the overhead of the yield, which is determined by how often it happens. Starting the counter at 5,000 makes our benchmarks only a little slower, but considerably improves latency on a thread starvation benchmark (from seconds to milliseconds).
Benchmark comparisons vs. the 0.5.44 release. I know it looks like most things got worse, but the release build actually got better timings than my local build from before I made these changes, so maybe my local builds are just a little worse in general. The ones that actually seemed worse are the fibs and (for some reason) CAS on IORefs.