-
-
Notifications
You must be signed in to change notification settings - Fork 757
Open
Description
Follow-up from #6371 and #6385.
CC @gjoseph92
test_deadlock_cancelled_after_inflight_before_gather_from_worker removes a worker while another worker has a task in flight from it.
The test calls
distributed/distributed/tests/test_worker.py
Lines 3282 to 3284 in 9bb999d
| await s.remove_worker( | |
| address=x.address, safe=True, close=close_worker, stimulus_id="test" | |
| ) |
which in turn calls
distributed/distributed/scheduler.py
Lines 4172 to 4174 in 9bb999d
| if close: | |
| with suppress(AttributeError, CommClosedError): | |
| self.stream_comms[address].send({"op": "close"}) |
which in turrn calls
distributed/distributed/worker.py
Line 1576 in 9bb999d
| await self.rpc.close() |
Expected result
The RPC channel is explicitly shut down. The waiting gather_dep call on the other worker raises OSError within milliseconds.
Actual result
The RPC channel is left dangling until the TCP timeout kicks in (5s by default; in the test it's been shortened to 0.5s).
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels