dask.order over-prioritizes root tasks in some situations #9995
Description
In test_anom_mean—an example of a basic climatology workload that kicked off a long discussion about memory overproduction—it seems that an underlying problem might actually be with how dask.order prioritizes the graph.
Compare how the scheduler traverses the graph with queuing off and co-assignment off (i.e. scheduler behavior when the issue was first reported, prior to dask/distributed#4967 and dask/distributed#6614), versus with queuing on as normal:
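For reference, the "queuing on/off" comparison above can be reproduced through dask's config system: `distributed.scheduler.worker-saturation` is the knob that controls queuing (a finite factor queues excess root tasks on the scheduler; `"inf"` restores the pre-queuing behavior of submitting everything to workers up front). Co-assignment behavior depends on the scheduler version rather than a public config key.

```python
import dask

# Queuing is controlled by worker-saturation: the default (1.1) holds excess
# root tasks back on the scheduler, while "inf" disables queuing entirely and
# submits all runnable tasks to workers immediately.
dask.config.set({"distributed.scheduler.worker-saturation": "inf"})  # queuing off

print(dask.config.get("distributed.scheduler.worker-saturation"))
```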
Videos of the graphs executing
Queuing on as normal:
anom-mean-queue-normal.mp4
Queuing off and co-assignment off:
anom-mean-no-queue-no-coassign.mp4
Basically, the graph is prioritized such that even the lowest-priority data-loading task still has a higher priority than nearly all of the data-consuming tasks.
Here's a mini version of the graph, with priorities:
Code to reproduce

```python
import dask.array as da
import numpy as np
import xarray as xr

from dask.utils import parse_bytes

data = da.random.random(
    (26, 1310720),
    chunks=(1, parse_bytes("10MiB") // 8),
)
ngroups = data.shape[0] // 4
arr = xr.DataArray(
    data,
    dims=["time", "x"],
    coords={"day": ("time", np.arange(data.shape[0]) % ngroups)},
)
clim = arr.groupby("day").mean(dim="time")
anom = arr.groupby("day") - clim
anom_mean = anom.mean(dim="time")
anom_mean.data.visualize(
    "anom-mean-order.png", color="order", optimize_graph=True, collapse_outputs=True
)
```

The last data-loading task is priority 106 (lower-left, second circle up). Looking at the data-reducing tasks, I only count 7 / 26 of them as being < 106. So if we follow graph order exactly, ~75% of the initial data will stay in memory.
So before queueing, workers had both root tasks and reduction tasks assigned. They could have run the reduction tasks—all the dependencies were in memory—they just chose not to, because they were lower priority.
To prove this, here are the tasks & priorities assigned to a worker with queuing off:
dashboard-priority.mp4
Queuing effectively added a new graph-prioritization mechanism saying "I don't care what your dask.order priority is; if you look like a root task, you always run last". It seems that for test_anom_mean, this re-prioritization of the graph might have been the main thing that fixed memory usage, more so than the scheduler<->worker race conditions that queuing also addressed.
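To make the "if you look like a root task" part concrete, here is a minimal sketch (not the actual distributed code—the real `Scheduler.is_rootish` operates on task-group state and its exact thresholds vary by version) of the shape of the heuristic: a task "looks rootish" when its task group is wide relative to the cluster and has few distinct dependencies.

```python
def looks_rootish(group_size: int, n_group_dependencies: int, total_nthreads: int) -> bool:
    """Sketch of an is_rootish-style check.

    A task group is treated as root-like when it is wide relative to the
    cluster's total thread count and depends on few other groups. These
    thresholds mirror the spirit of the heuristic, not its exact values.
    """
    return group_size >= 2 * total_nthreads and n_group_dependencies < 5


# A wide, dependency-free group (e.g. the data-loading layer) gets queued:
print(looks_rootish(group_size=1000, n_group_dependencies=0, total_nthreads=40))  # True

# A narrow reduction layer is scheduled by dask.order priority as usual:
print(looks_rootish(group_size=8, n_group_dependencies=4, total_nthreads=40))  # False
```

Tasks classified this way are held back regardless of their dask.order number, which is exactly the priority-breaking behavior described above.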
This is the same thing @fjetter said in dask/distributed#7526 (comment):

> this lets me think that a major contributing factor to the success of queuing is in fact that we are breaking priority ordering in a very specific way
The `dask.order` code is quite complicated, so I don't know how we'd address this there. Maybe the `is_rootish` heuristic is good enough and it's not worth trying? It'd be nice to understand the underlying cause, though. Maybe @eriknw has ideas?

