Ensure distributed.comm.core.connect can always be cancelled#6064
Closed
Ensure distributed.comm.core.connect can always be cancelled#6064
Conversation
Contributor
Unit Test Results 16 files ± 0 16 suites ±0 7h 27m 30s ⏱️ - 11m 23s For more details on these failures, see this check. Results for commit 676bb18. ± Comparison against base commit fc5b460. ♻️ This comment has been updated with latest results. |
b544ac6 to
bda34ea
Compare
5d2fce9 to
73a4da2
Compare
fjetter
commented
Apr 10, 2022
Comment on lines
+1626
to
+1645
| task.cancel() | ||
| await watcher.wait() | ||
| raise |
Member
Author
There was a problem hiding this comment.
I'm not entirely sure if I need to await the task itself here. There are no warnings
cc @graingert
73a4da2 to
8d125a0
Compare
fjetter
commented
Apr 12, 2022
| ], | ||
| ) | ||
| @gen_test() | ||
| async def test_connection_pool_cancellation(monkeypatch, closing, backend): |
Member
Author
There was a problem hiding this comment.
These tests do not capture this race condition but the rewrite is much cleaner and covers more ground so I'd like to keep it
8d125a0 to
c4e1120
Compare
Member
Author
|
There is a CPython issue which looks like the root cause, see python/cpython#86296 |
c4e1120 to
676bb18
Compare
Closed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I noticed that tests like
test_worker_waits_for_schedulerare not properly running even though the test logic is extremely simple.The test
test_worker_waits_for_schedulerparticularly hangs in theasyncio.wait_forstatement waiting for the worker to come up.After some debugging, it appears the worker is stuck in the connect retry implementation trying to connect to the scheduler. This connect should've been cancelled after the
wait_forhits its timeout but it wasn't.It appears there is a race condition when cancelling a task. If the Task is cancelled right after it is finished and the exception is set, it will not raise a CancelledError but the original Exception. I don't know if this is a bug in CPython but it is not the behaviour I anticipated. I wrote the
ensure_cancellationwrapper to make sure this behaviour is deterministiccc @graingert
Closes #5861