SelfContainedMutex: Strengthen compiler fence#2552
SelfContainedMutex: Strengthen compiler fence#2552sporksmith wants to merge 1 commit intoshadow:mainfrom
Conversation
Using just `Acquire` here still lets the compiler move writes before the fence to after the fence. We need both `Acquire` and `Release` semantics. Pointed out here: https://masto.ai/@rfc1149/109359308547583544 And handy visual aid here: https://masto.ai/@rfc1149/109359347916681723
| // Ensure that the `futex.store` above can't be moved after the | ||
| // `sleepers.load` below. | ||
| std::sync::atomic::compiler_fence(Ordering::Acquire); | ||
| // Acquire: Ensure that the `sleepers.load` below can't be moved to before | ||
| // this fence (and therefore before the `futex.store` above) | ||
| // Release: Ensure that the `futex.store` above can't be moved to after | ||
| // this fence (and therefore not after the `sleepers.load`) | ||
| std::sync::atomic::compiler_fence(Ordering::AcqRel); |
There was a problem hiding this comment.
I'm not clear on why this is a compiler fence and not a memory fence? We can tell the compiler not to reorder the memory accesses, but that doesn't mean the hardware won't reorder them.
There was a problem hiding this comment.
Yeah, I think you're right.
I spent some time last night seeing if I could get a test up and running with loom. Unfortunately I ran into a crash in loom. After creating a smaller repro of that crash, I think it only happens when the code under test has a deadlock bug. So this gives some information, but doesn't give me a lot of diagnostics to find the details of the deadlock until the loom bug is fixed.
If I change this part of the code to unconditionally wake, the tests pass. I also tried changing this to be an atomic::fence, and changing all the orderings to use AcqRel, but I still get the crash (and hence presumably a possible deadlock) in those cases.
Digging in some more. I had an alternate design idea where I keep the sleeper count inside the same atomic as the futex word. It'd add some fiddly shifting and masking, and I think perform worse under contention (since multiple threads might race each-other trying to update the sleeper count and then sleep on the futex), but is easier to reason about concurrency-wise. If I can't find the problem in this version of the design maybe I'll take another look at doing it that way...
Codecov ReportBase: 67.28% // Head: 67.25% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #2552 +/- ##
==========================================
- Coverage 67.28% 67.25% -0.03%
==========================================
Files 193 194 +1
Lines 28608 28621 +13
Branches 5615 5618 +3
==========================================
+ Hits 19248 19250 +2
- Misses 4923 4936 +13
+ Partials 4437 4435 -2
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
43c7e3f to
41030ed
Compare
|
Closing in favor of a pending more-complete fix. See #2557 |
Using just
Acquirehere still lets the compiler move writes before the fence to after the fence. We need bothAcquireandReleasesemantics.Pointed out here:
https://masto.ai/@rfc1149/109359308547583544
And handy visual aid here:
https://masto.ai/@rfc1149/109359347916681723