`decide_worker_rootish_queuing_disabled` assertion fails when retiring worker #7063

While stress-testing https://github.com/dask/distributed/pull/7062, `test_RetireWorker_stress` failed once in 162 runs. The test gracefully retires most of the cluster's workers while a very heavy computation is running:

https://github.com/crusaderky/distributed/actions/runs/3114670981/jobs/5050785452#step:18:1674

```
2022-09-23 18:56:03,193 - distributed.scheduler - ERROR - (<WorkerState 'tcp://127.0.0.1:63881', name: 6, status: closing_gracefully, memory: 21, processing: 27>, {<WorkerState 'tcp://127.0.0.1:63869', name: 0, status: running, memory: 61, processing: 6>, <WorkerState 'tcp://127.0.0.1:63879', name: 5, status: running, memory: 59, processing: 14>, <WorkerState 'tcp://127.0.0.1:63885', name: 8, status: running, memory: 59, processing: 17>, <WorkerState 'tcp://127.0.0.1:63877', name: 4, status: running, memory: 58, processing: 5>, <WorkerState 'tcp://127.0.0.1:63887', name: 9, status: running, memory: 59, processing: 6>})

Traceback (most recent call last):
  File "d:\a\distributed\distributed\distributed\scheduler.py", line 2040, in transition_waiting_processing
    if not (ws := self.decide_worker_rootish_queuing_disabled(ts)):
  File "d:\a\distributed\distributed\distributed\scheduler.py", line 1901, in decide_worker_rootish_queuing_disabled
    assert ws in self.running, (ws, self.running)
```
```
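In the log above, the worker that trips the assertion (name 6) has status `closing_gracefully`, while every member of the `self.running` set it is checked against has status `running`. The toy model below only illustrates that invariant; it is not the scheduler's code and makes no claim about the root cause. `ToyWorker`, `pick_worker`, and the "least busy" selection rule are hypothetical names and assumptions made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyWorker:
    """Hypothetical stand-in for a worker record; not distributed's WorkerState."""
    name: int
    status: str  # "running" or "closing_gracefully"
    processing: int


def pick_worker(candidates, running):
    """Pick the least busy candidate (an assumed rule), then check the invariant."""
    ws = min(candidates, key=lambda w: w.processing)
    # Same shape as the failing assertion: the chosen worker must be running.
    assert ws in running, (ws, running)
    return ws


workers = [
    ToyWorker(name=0, status="running", processing=6),
    ToyWorker(name=6, status="closing_gracefully", processing=2),
]
running = {w for w in workers if w.status == "running"}

# Choosing among running workers only: the invariant holds.
chosen = pick_worker(running, running)

# Choosing from a candidate list that still contains a retiring worker:
# the invariant is violated and the assertion fires.
try:
    pick_worker(workers, running)
    failed = False
except AssertionError:
    failed = True
```

In this sketch the assertion can only fire when the candidate pool and the running set disagree, which is the kind of mismatch the error message reports.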
