[core] Fix flaky ShutdownCoordinator test under tsan #56152
Merged
jjyao merged 3 commits into ray-project:master on Sep 3, 2025
Code Review
This pull request correctly addresses a data race in FakeShutdownExecutor by introducing a std::mutex to protect concurrent writes to shared member variables. This resolves the write-write race condition. However, a read-write data race still exists in the Concurrent_DoubleForce_ForceExecutesOnce test due to unprotected reads. I've added a comment with a suggestion on how to fully resolve this by using thread-safe getter methods.
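The remaining read-write race can be illustrated with a minimal sketch. It assumes a hypothetical fake, SketchExecutor, with a single field modeled on last_exit_type; the class name, the trailing-underscore member, and the getter name GetLastExitType are illustrative assumptions, not necessarily what the actual change uses. The getter takes the same mutex as the writer and returns a copy, so a concurrent reader never observes the string mid-assignment.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for the test fake; not the actual Ray class.
class SketchExecutor {
 public:
  void ExecuteForceShutdown(const std::string &exit_type) {
    std::lock_guard<std::mutex> lock(mu_);
    last_exit_type_ = exit_type;
  }

  // Returns a copy taken under the lock. Returning a reference would let
  // the caller read the string after the lock has been released.
  std::string GetLastExitType() const {
    std::lock_guard<std::mutex> lock(mu_);
    return last_exit_type_;
  }

 private:
  mutable std::mutex mu_;  // mutable so the const getter can lock it
  std::string last_exit_type_;
};

// Reads through the getter while another thread may still be writing.
std::string ReadWhileWriting() {
  SketchExecutor executor;
  std::thread writer([&executor] { executor.ExecuteForceShutdown("force"); });
  // The concurrent read sees either the old or the new value, never a mix.
  const std::string seen = executor.GetLastExitType();
  writer.join();
  assert(seen.empty() || seen == "force");
  return executor.GetLastExitType();
}
```

Reading the member directly in the assertion, instead of through the getter, would bypass the mutex and is the kind of access TSAN reports.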
jjyao reviewed on Sep 2, 2025
Signed-off-by: Sagar Sumit <sagarsumit09@gmail.com>
1a7882d to be1f2f7
jjyao approved these changes on Sep 3, 2025
sampan-s-nayak pushed a commit to sampan-s-nayak/ray that referenced this pull request on Sep 8, 2025
…6152) Signed-off-by: Sagar Sumit <sagarsumit09@gmail.com> Signed-off-by: sampan <sampan@anyscale.com>
jugalshah291 pushed a commit to jugalshah291/ray_fork that referenced this pull request on Sep 11, 2025
…6152) Signed-off-by: Sagar Sumit <sagarsumit09@gmail.com> Signed-off-by: jugalshah291 <shah.jugal291@gmail.com>
wyhong3103 pushed a commit to wyhong3103/ray that referenced this pull request on Sep 12, 2025
…6152) Signed-off-by: Sagar Sumit <sagarsumit09@gmail.com> Signed-off-by: yenhong.wong <yenhong.wong@grabtaxi.com>
dstrodtman pushed a commit that referenced this pull request on Oct 6, 2025
Signed-off-by: Sagar Sumit <sagarsumit09@gmail.com> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request on Nov 17, 2025
…6152) Signed-off-by: Sagar Sumit <sagarsumit09@gmail.com>
Why are these changes needed?
The TSAN failure is a data race only in the test's FakeShutdownExecutor, not in production code. The fake was writing shared std::string fields from two threads without synchronization: https://buildkite.com/ray-project/postmerge/builds/12666#019907c0-3bdd-4401-9aa8-6f13215ce819/176-796
Added a std::mutex to FakeShutdownExecutor and guarded assignments to last_exit_type and last_detail in all Execute* methods. No production code changes.
Added thread-safe getters to FakeShutdownExecutor and used them in the assertion to eliminate remaining unsynchronized reads in the TSAN test.
Related issue number
Closes #55801
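The write-side fix described above can be sketched as follows. This is a hypothetical illustration, not the code from the PR: only the field names last_exit_type and last_detail come from the description, while SketchShutdownExecutor, Record, Snapshot, and the two Execute* method names are assumptions. Both fields are assigned under one lock, so two threads calling different Execute* methods cannot interleave their writes.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical sketch of a test fake. Only last_exit_type and last_detail
// come from the PR description; every other name is an assumption.
class SketchShutdownExecutor {
 public:
  void ExecuteGracefulShutdown(const std::string &detail) {
    Record("graceful", detail);
  }

  void ExecuteForceShutdown(const std::string &detail) {
    Record("force", detail);
  }

  // Copies both fields under one lock so the pair is always consistent.
  std::pair<std::string, std::string> Snapshot() const {
    std::lock_guard<std::mutex> lock(mu_);
    return {last_exit_type, last_detail};
  }

 private:
  // Both assignments happen under the same lock, so concurrent callers
  // cannot interleave their writes.
  void Record(const std::string &exit_type, const std::string &detail) {
    std::lock_guard<std::mutex> lock(mu_);
    last_exit_type = exit_type;
    last_detail = detail;
  }

  mutable std::mutex mu_;
  std::string last_exit_type;
  std::string last_detail;
};

// Two threads call different Execute* methods at the same time. Whichever
// write lands last wins, but the two fields always belong to the same call.
bool ConcurrentWritesStayConsistent() {
  SketchShutdownExecutor executor;
  std::thread graceful(
      [&executor] { executor.ExecuteGracefulShutdown("first"); });
  std::thread force([&executor] { executor.ExecuteForceShutdown("second"); });
  graceful.join();
  force.join();

  const auto snapshot = executor.Snapshot();
  assert(!snapshot.first.empty());
  return (snapshot.first == "graceful" && snapshot.second == "first") ||
         (snapshot.first == "force" && snapshot.second == "second");
}
```

Without the lock in Record, the two std::string assignments from different threads would be a write-write race of the kind described above, and the final pair could mix values from both calls.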