Add more ordering diagnostics to `dask.visualize` by eriknw · Pull Request #7992 · dask/dask

eriknw · 2021-08-04T04:26:05Z

I added these to help investigate #7929

The names could probably be better. This also needs to be documented and tested, but I thought I'd share, because, you know, pretty plots! I think all options here are potentially useful. Anything else we might want to show?

Speaking of pretty plots, I'll share some tomorrow. Good night!

Tests added / passed
Passes black dask / flake8 dask / isort dask

Also, update so that visualize `color="pressure"` includes memory usage when the task is run (shown on the function) and when the data is released (shown on the data).

eriknw · 2021-08-04T16:53:31Z

As promised, here are a few graphs. These show memory pressure--the number of dependencies that are held in memory when a task is run (shown in the function circle) and when the data of a task is released (shown in the data rectangle).
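To make "memory pressure when a task is run" concrete, here is a rough pure-Python sketch on a toy graph. It is not the code from this PR: the function name `pressure_when_run`, the toy graph, and the rule that an output nobody consumes is never counted as held are all assumptions made for illustration.

```python
def pressure_when_run(dependencies, order):
    """Count how many outputs are held in memory just before each task runs.

    dependencies: task -> list of tasks whose outputs it needs
    order: task -> position in the execution order
    """
    # Invert the graph: which tasks consume each output?
    dependents = {task: set() for task in dependencies}
    for task, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(task)

    # Number of consumers that still have to run before an output can go
    waiting = {task: len(users) for task, users in dependents.items()}

    held = 0
    pressure = {}
    for task in sorted(dependencies, key=order.__getitem__):
        pressure[task] = held
        # Assumption: an output with no consumers is dropped right away
        if dependents[task]:
            held += 1
        for dep in dependencies[task]:
            waiting[dep] -= 1
            if waiting[dep] == 0:
                held -= 1  # last consumer has run, release the output
    return pressure


# Hypothetical toy graph: c needs a and b, e needs c and d
dependencies = {"a": [], "b": [], "c": ["a", "b"], "d": [], "e": ["c", "d"]}
eager = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 4}  # run c as soon as possible
late = {"a": 0, "d": 1, "b": 2, "c": 3, "e": 4}   # load d before it is needed

print(max(pressure_when_run(dependencies, eager).values()))
print(max(pressure_when_run(dependencies, late).values()))
```

With the second ordering, `d` sits in memory while `b` and `c` run, so the peak is higher. That is the kind of difference the colors are meant to expose.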

Here's a simple example to get you started:

Note that all of these are created with color="order-pressure" and cmap="plasma". The label on each function is the order number and the memory pressure. To show only the memory pressure, use color="pressure".

Another one that is slightly more complicated:

And another:

And a more complicated one:

And the example from #7929 (main branch)

And the same one using PR 7929:

To compare the two graphs above more directly, it would be nice to have them on the same scale. So, we can now pass maxval= to visualize, which changes the previous graph to:
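Why a shared scale matters can be shown in a few lines. This is only a sketch of the idea: `color_fraction` is a hypothetical helper, and the assumption that a value is mapped to a colormap position by dividing by the maximum is mine, not something taken from the PR.

```python
def color_fraction(value, maxval):
    """Map a diagnostic value to a position in [0, 1] along a colormap."""
    if maxval <= 0:
        return 0.0
    # Clip so values above maxval saturate at the top of the colormap
    return min(value, maxval) / maxval


# Made-up pressure values for two orderings of the same graph
pressure_before = [0, 3, 6]
pressure_after = [0, 2, 3]

# Each plot scaled to its own maximum: the worst node in both looks identical
own_scale = color_fraction(max(pressure_after), max(pressure_after))

# Both plots scaled to a shared maximum: the improvement is visible
shared_max = max(max(pressure_before), max(pressure_after))
shared_scale = color_fraction(max(pressure_after), shared_max)

print(own_scale, shared_scale)
```

Without a shared maximum, the hottest color in each plot just means "the worst node in this plot", so two plots cannot be compared by eye.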

I think this is pretty nice. I don't know why I didn't whip this up long ago!

ncclementi · 2021-09-23T16:48:08Z

@eriknw Checking in here, is this PR still WIP, or is it in a state ready for review?

eriknw · 2021-09-23T17:19:27Z

Thanks for checking! Still WIP, and not forgotten. Feature-wise, it could be reviewed.

TODO:

docstring
tests

eriknw · 2021-10-27T00:34:14Z

Okay, I think this is ready. Naming and describing things was somewhat challenging.

I chose the diagnostics that I think will be most useful when trying to understand what dask.order is doing, especially when it does something poorly. Let me walk through an example where we investigate sub-optimal ordering (sigh).

Let's consider the Dask graph from da.arange(N, chunks=1).cumsum(0, method='blelloch'). Here's the familiar visualization with color="order", cmap="autumn":

We can see that the first half (the red at the bottom) is ordered well, but there may be some issues with the second half (yellow and orange nodes are intermixed).

Let's look at age, color="age", cmap="plasma":

Indigo colors (most of the graph) are good, but we can clearly see that the data from some nodes (reddish and orangish) are held in memory longer than desired. But, why?
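For intuition, "age" can be approximated as the number of steps between running a task and running its last consumer. The sketch below uses a hypothetical toy graph and a hypothetical helper `ages`; it illustrates the idea and is not the implementation in this PR.

```python
def ages(dependencies, order):
    """Steps between a task running and its last consumer running."""
    # A task with no consumers gets age 0
    last_use = dict(order)
    for task, deps in dependencies.items():
        for dep in deps:
            last_use[dep] = max(last_use[dep], order[task])
    return {task: last_use[task] - order[task] for task in dependencies}


# Hypothetical graph: a -> b -> c -> out, and out also needs x
dependencies = {"x": [], "a": [], "b": ["a"], "c": ["b"], "out": ["c", "x"]}

x_first = {"x": 0, "a": 1, "b": 2, "c": 3, "out": 4}
x_late = {"a": 0, "b": 1, "c": 2, "x": 3, "out": 4}

print(ages(dependencies, x_first)["x"])  # x waits for the whole chain
print(ages(dependencies, x_late)["x"])   # x is produced just before use
```

A node with a large age was probably run too early, or its consumer was run too late.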

Here we look at how many more outputs are held after the lifetime of each node, color="memoryincreases":

Large values may indicate nodes that should have run later. Indeed, here, the yellow nodes in the upper left are clearly run far too soon.

Similarly, we can look at how many fewer outputs are held after the lifetime of each node, color="memorydecreases":

Large values may indicate nodes that should have run sooner, which is indeed the case for the yellow nodes in the middle.
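One way to read these two views together: compare the number of outputs held when a node runs with the number held when its output is released. The helper `memory_change` below is hypothetical, and clamping each direction at zero is an assumption about how the two colorings split the difference.

```python
def memory_change(num_when_run, num_when_released):
    """Split the change in held outputs over a node's lifetime.

    Returns (increase, decrease); at most one of the two is nonzero.
    """
    delta = num_when_released - num_when_run
    return max(0, delta), max(0, -delta)


# Made-up numbers for three nodes: (held when run, held when released)
examples = {"ran_too_soon": (2, 9), "ran_too_late": (9, 3), "steady": (4, 4)}

for name, (when_run, when_released) in examples.items():
    increase, decrease = memory_change(when_run, when_released)
    print(name, increase, decrease)
```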

Now that we have an idea of what's going on, let's look at a more complicated, but very informative visualization--color="memorypressure". This one indicates how many outputs are held when a node is run (the circle in the diagram) and when the output of the node is released (the rectangle in the diagram).

This tells the same story as the previous diagrams, but it does so differently.

This last diagram shows how many dependencies are released when a node is run, color="freed".

This can be a nice view, because it also shows patterns differently. It is clear that something is amiss in the upper left portion, because there are many tasks that don't release dependencies (but should) when they are run.
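The "freed" count can be sketched the same way: for each task, count the dependencies for which it is the last consumer to run. The helper `dependencies_freed` and the diamond-shaped toy graph are hypothetical and only illustrate the definition above.

```python
def dependencies_freed(dependencies, order):
    """Number of dependencies each task releases when it runs."""
    # How many consumers of each output have not run yet
    waiting = {task: 0 for task in dependencies}
    for deps in dependencies.values():
        for dep in deps:
            waiting[dep] += 1

    freed = {}
    for task in sorted(dependencies, key=order.__getitem__):
        released = 0
        for dep in dependencies[task]:
            waiting[dep] -= 1
            if waiting[dep] == 0:
                released += 1  # this task was the last consumer of dep
        freed[task] = released
    return freed


# Hypothetical diamond: a feeds b and c, which both feed d
dependencies = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
order = {"a": 0, "b": 1, "c": 2, "d": 3}

print(dependencies_freed(dependencies, order))
```

Here `b` frees nothing because `c` still needs `a`. A long run of tasks like `b` is the pattern to look for in the plot.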

So, where does dask.order go wrong? Here's a closeup view of the middle section with color="order":

Node 151 (the yellow node in the middle) should have run immediately after node 81, but didn't, or node 112 should have run after node 81 (to work towards running node 117).

eriknw · 2021-10-27T16:33:14Z

I forgot a plot in the example investigation in the previous post. The second value returned by dask.order.diagnostics is the number of outputs held over time. Here's an example:

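import dask
import dask.array as da
import dask.base
import dask.order
import pandas as pd
import hvplot.pandas  # noqa: F401 (importing registers the .hvplot accessor)

# Cumulative sum over 33 single-element chunks, using the 'blelloch' method
A = da.arange(33, chunks=1).cumsum(0, method='blelloch')
# The second return value is the number of outputs held in memory over time
info, num_in_memory = dask.order.diagnostics(dict(dask.base.collections_to_dsk([A])))
df = pd.DataFrame(
    {
        'time': list(range(len(num_in_memory))),
        'Num in memory': num_in_memory
    }
)
# rasterize=True is nice for very large graphs
df.hvplot(x='time', y='Num in memory', rasterize=False)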

num_in_memory is a list, and is very useful, which is why I have it as a separate return value. For example, I used it here: #7583 (comment). It is not strictly necessary to be a return value, since it can be reconstructed from the data in info above. For example:

assert num_in_memory == [
    val.num_data_when_run for val in sorted(info.values(), key=lambda x: x.order)
]

I still like it as a separate return value to make it easier to use.
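Because it is a plain list, it is also easy to summarize without plotting. A small sketch, where the `summarize` helper and the sample values are made up for illustration:

```python
def summarize(num_in_memory):
    """Peak number of held outputs and the first step at which it occurs."""
    peak = max(num_in_memory)
    return {"peak": peak, "first_peak_step": num_in_memory.index(peak)}


# Made-up series standing in for the list returned alongside info
sample = [0, 1, 2, 3, 2]
print(summarize(sample))
```

Comparing the peak across two orderings of the same graph gives a quick single-number check before looking at the plots.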

jsignell · 2021-11-02T14:51:28Z

This is a big diagnostic improvement and I love the narrative that you go through in #7992 (comment) - can we make that into a blog post or a how-to?

quasiben · 2021-11-02T15:12:52Z

+1 to writing up as a blogpost

eriknw · 2021-11-02T19:29:56Z

Aw, thanks for the kind words!

Yeah, I can write a blog post from this, but you must know how much it pains me to show off dask.order not doing well 😛 .

I'm about to go on vacation (hooray!), so it'll be at least a few weeks before I can get around to it.

jsignell · 2021-11-03T13:44:19Z

no worries, anytime is a good time :) it can be framed as how to understand or debug.

eriknw added 3 commits August 3, 2021 23:18

isort

238bf70

Move computation to dask.order.diagnostics

d1271b8

Also, update so that visualize `color="pressure"` includes memory usage when the task is run (shown on the function) and when the data is released (shown on the data).

eriknw mentioned this pull request Aug 4, 2021

Change to dask.order: be more eager at times #7929

Merged

2 tasks

eriknw mentioned this pull request Aug 13, 2021

Regression: poor memory management in threads and sync schedulers #7583

Closed

eriknw added 2 commits October 22, 2021 18:37

Merge branch 'main' into ordering_diagnostics

f0a86f3

Add docstrings and test for dask.order.diagnostics

5970e54

eriknw marked this pull request as ready for review October 27, 2021 00:34

eriknw changed the title ~~WIP: Add more ordering diagnostics to dask.visualize~~ Add more ordering diagnostics to dask.visualize Oct 27, 2021

jsignell merged commit 89d93a8 into dask:main Nov 2, 2021

jsignell mentioned this pull request Jun 1, 2022

Clearer colourmap / labels when showing task order in visualize #9069

Open

fjetter mentioned this pull request Aug 22, 2022

Factor out and instrument task categorization logic - static graph analysis dask/distributed#6922

Open

eriknw mentioned this pull request Feb 24, 2023

dask.order over-prioritizes root tasks in some situations #9995

Closed

jsignell merged 5 commits into dask:main from eriknw:ordering_diagnostics
