
Drop distributed pack #9988

Merged

fjetter merged 6 commits into dask:main from fjetter:drop_distributed_pack
Mar 27, 2023

Conversation

@fjetter
Member

@fjetter fjetter commented Feb 21, 2023

Counterpart to dask/distributed#7564

Comment on lines +391 to +396
annotations = annotations or {}

if "priority" not in annotations:
    _splits = self.get_split_keys()

    def _set_prio(key):
        if key in _splits:
            return 1
        return 0

    annotations["priority"] = _set_prio
super().__init__(annotations=annotations)
Member Author


I find the annotations API a bit unintuitive. I would've expected a mapping like

{"key": {"priority": 42}}

but instead it is

{"priority": {"key": 42}}

More or less it's the same thing, but I find myself more naturally iterating over keys than over annotations.
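To make the two layouts concrete, here is a minimal sketch in plain Python (the keys and values are hypothetical, not taken from the PR) showing that the key-major form can be mechanically inverted into the annotation-major form dask uses:

```python
# Hypothetical example: the same priority information in both layouts.
by_key = {("read-parquet", 0): {"priority": 42}}          # key -> {annotation: value}
by_annotation = {"priority": {("read-parquet", 0): 42}}   # annotation -> {key: value}

def invert_to_annotation_major(mapping):
    """Invert key -> {annotation: value} into annotation -> {key: value}."""
    out = {}
    for key, annots in mapping.items():
        for name, value in annots.items():
            out.setdefault(name, {})[key] = value
    return out

assert invert_to_annotation_major(by_key) == by_annotation
```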

Comment on lines +393 to +396
if "priority" not in annotations:
    _splits = self.get_split_keys()

    def _set_prio(key):
        if key in _splits:
            return 1
        return 0

    annotations["priority"] = _set_prio
super().__init__(annotations=annotations)
Member Author


This change is needed because the distributed scheduler does not handle the expanded annotations.

@fjetter fjetter mentioned this pull request Feb 23, 2023
@fjetter
Member Author

fjetter commented Feb 23, 2023

See #9994 for an intermediate step with the functional changes w/out dropping the actual code

@fjetter fjetter force-pushed the drop_distributed_pack branch from 17ee0e3 to 385e2cf on March 10, 2023
Comment on lines -471 to -473
def __reduce__(self):
    """Default serialization implementation, which materializes the Layer"""
    return (MaterializedLayer, (dict(self),))
Member Author


There is actually no need to have any custom reducers. The old behavior was to materialize all layers by default when pickled. That's clearly not what we're after.
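The effect being removed can be illustrated with a toy example (these classes are hypothetical stand-ins, not dask's actual Layer types): a custom `__reduce__` that substitutes a materialized dict changes what comes out of a pickle round-trip, whereas the default pickle protocol preserves the lazy object:

```python
import pickle

class LazyLayer:
    """Toy stand-in for a graph layer that can build its tasks on demand."""
    def __init__(self, n):
        self.n = n  # enough state to rebuild tasks; no materialized dict held

    def materialize(self):
        return {("x", i): i for i in range(self.n)}

class EagerlyPickledLayer(LazyLayer):
    # Old-style behavior: pickling replaces the layer with its materialized dict.
    def __reduce__(self):
        return (dict, (self.materialize(),))

lazy_roundtrip = pickle.loads(pickle.dumps(LazyLayer(3)))
eager_roundtrip = pickle.loads(pickle.dumps(EagerlyPickledLayer(3)))
assert isinstance(lazy_roundtrip, LazyLayer)  # stays lazy with default pickling
assert isinstance(eager_roundtrip, dict)      # custom __reduce__ forced materialization
```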

@fjetter fjetter force-pushed the drop_distributed_pack branch from 0cf0aac to a1543ae on March 21, 2023
dask/layers.py Outdated
Comment on lines +391 to +402
    annotations = annotations or {}
    self._split_keys = None
    if "priority" not in annotations:
        annotations["priority"] = self._key_priority

    self._split_keys_set = set(self.get_split_keys())
    super().__init__(annotations=annotations)

def _key_priority(self, key):
    if key in self._split_keys_set:
        return 1
    return 0
Member Author


It's worth pointing out that an earlier version of this caused a regression; see #9994, and #10041 for the fix and the reason why this wasn't merged yet. It accidentally triggered materialization, which is no longer an issue with this PR.

Member


Thanks for the explanation. I'm just learning now that annotations can be callable - cool :)
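A minimal sketch of what a callable annotation looks like in practice (the expansion loop and key names here are illustrative, a guess at the mechanism rather than dask's internal code): a callable value is evaluated once per key, while a plain value applies to all keys.

```python
# Hypothetical split keys, mimicking the shuffle layer above.
split_keys = {("split", 0, 1), ("split", 1, 0)}

def priority(key):
    # Boost "split" tasks, as the layer's _key_priority does.
    return 1 if key in split_keys else 0

annotations = {"priority": priority}
layer_keys = [("split", 0, 1), ("combine", 0)]

# Illustrative expansion: callables are applied per key, constants broadcast.
expanded = {
    name: {key: (value(key) if callable(value) else value) for key in layer_keys}
    for name, value in annotations.items()
}
assert expanded == {"priority": {("split", 0, 1): 1, ("combine", 0): 0}}
```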

@jrbourbeau jrbourbeau mentioned this pull request Mar 21, 2023
Member

@rjzamora rjzamora left a comment


Things are looking good to me @fjetter - I appreciate your work on this!

My only real concern is about backwards compatibility: I have heard of at least one Dask/RAPIDS user deploying a scheduler process on GPU-free hardware. I'm worried that users like this will start to see errors when pickle.loads starts to require device-memory allocations on the scheduler.

I think it is fine for distributed to tighten the official environment and hardware requirements. However, I think it is important that we acknowledge that this decision is being made, and that it will likely break some real-world code. Note that I was originally hoping that we could provide a temporary escape hatch that would materialize the graph and use legacy communication. However, it is not clear to me that the pre-HLG graph-communication logic still exists?

@mrocklin
Member

My sense is that we're choosing to be more picky about serialization in order to reduce maintenance burden. We acknowledge that there are a few cases where this will inconvenience users but we think that those cases are few enough that the benefits outweigh the costs. I think that this decision was made months ago and now we're done with execution. It's time to pull the trigger I think.

If you'd like to support that user then maybe there's something that can be done within the pickle protocol outside of Dask. I'd very much like for us to get out of this game. It's been way too expensive for the project in the past to justify continuing to pay that cost.

@rjzamora
Member

rjzamora commented Mar 22, 2023

I think that this decision was made months ago and now we're done with execution. It's time to pull the trigger I think.

I certainly agree with this - Sorry if my comment made it seem like I'm trying to block anything. Is there a place in the documentation (probably distributed/deployment) where it can be made clear that the environment should be consistent everywhere in the cluster? Or does it already say this somewhere? It probably makes sense for the RAPIDS documentation to cover the GPU-on-the-scheduler detail.

If you'd like to support that user...

I'm only interested in supporting a workaround if it was a few lines long. Otherwise, I just want to make sure we are documenting the new requirements (so there is a clear reference when the issues come in).

@fjetter
Member Author

fjetter commented Mar 22, 2023

Is there a place in the documentation (probably distributed/deployment) where it can be made clear that the environment should be consistent everywhere in the cluster? Or does it already say this somewhere?

There is a section in the deploy documentation stating

For Dask to function properly, the same set of Python packages, at the same versions, need to be installed on the scheduler and workers as on the client

https://docs.dask.org/en/stable/deployment-considerations.html

This section does not mention anything about hardware requirements. I guess this specifically affects GPUs. Is there anything else affected? I guess it would make sense to have a GPU section somewhere in https://docs.dask.org/en/stable/deploying.html or to add something to the GPU-specific section in https://docs.dask.org/en/stable/gpu.html

@fjetter
Member Author

fjetter commented Mar 22, 2023

If the "GPU on scheduler" issue truly is big and considered a show stopper for some users, there is still an escape hatch...
After my refactoring, deserialization and materialization are static. For the most part this is also true for the annotations. There is a very clean separation between deserialization, graph materialization, and consolidation with the scheduler state.
All of this could be "offloaded" to a worker that has the proper environment. Of course, this would require the worker to send back the fully materialized graph and more, which is not optimal but possible without the complexity of the pack/unpack protocol.

I would strongly prefer not doing this, both to keep complexity low and to maintain flexibility, but it is possible.

@rjzamora
Member

I appreciate the clarification @fjetter - My sense is that the relevant documentation is already sufficient on the Dask side and that the escape hatch you are describing is probably not worth the effort.

@jacobtomlinson
Member

jacobtomlinson commented Mar 22, 2023

I'm +1 on this change with the goal of reduced maintenance burden. However, it will be breaking for a subset of users, and I would like us to make sure that the failure modes are as pleasant as possible for those users.

Common things I've seen in the wild:

  • Users creating a different minimal environment for the scheduler
  • Users not including a GPU on scheduler nodes (but using the same software environment)

My worry is that users in these groups will run into unpleasant and hard-to-debug tracebacks (this is usually the user experience we get when we change serialisation things).

It would be nice if we can publish a documentation page or blog post with info on this change and how users can update their deployments to comply with the new harder restrictions. Then either catch exceptions related to this and raise a new error with a link to the docs, or include example tracebacks in the docs page so that folks will find it via googling the error.

Comment on lines -720 to -732

# Parse transition log for processing tasks
log = [
    eval(l[0])[0]
    for l in s.transition_log
    if l[1] == "processing" and "simple-shuffle-" in l[0]
]

# Make sure most "split" tasks are processing before
# any "combine" tasks begin
late_split = np.quantile(
    [i for i, st in enumerate(log) if st.startswith("split")], 0.75
)
early_combine = np.quantile(
    [i for i, st in enumerate(log) if st.startswith("simple")], 0.25
)
assert late_split < early_combine
Member Author


This test was not only flaky but also useless. The changes to the shuffle split priorities that were introduced in #7846 are actually quite important, but this test is not testing them. The time when a task transitions to processing on the scheduler side is quite irrelevant since we're scheduling greedily. split tasks are always transitioned immediately after the grouper ended up in memory. The only case where this might not be true is if a combine task would "unlock" at the same time such that there is a race in the transition. Actual execution order is, however, ensured by the worker.

This test is a bit of a chicken-and-egg situation. The test tried to capture this behavior without actually asserting that priorities are set. It still used a rather low-level API. I kept the intention of not asserting on priorities but had to fiddle with the worker internals a bit. If this turns out to be unstable/hard to maintain, we can reconsider. The important thing is: this test is now sensitive to the priorities set in the shuffle layer 🎉

Member


Nice! I never liked this test very much :)

dask/layers.py Outdated
  # Return SimpleShuffleLayer "split" keys
  return [
-     stringify((self.split_name, part_out, part_in))
+     (self.split_name, part_out, part_in)
Member Author


💢 😡

Callables are evaluated on the actual, non-stringified keys. Whether or not this is sane behavior is something I cannot truly judge, but this is how it's been before. This is changing now because the __expanded_annotations__ sentinel no longer exists.

Member


Evaluation on non-stringified keys makes sense to me. For example we might want to prioritize based on the index number ("read-parquet", 15) -> 15
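A minimal sketch of that idea (plain Python; the key shapes are hypothetical): with raw tuple keys, a priority callable can read the partition index directly, which the stringified form "('read-parquet', 15)" would not allow without parsing.

```python
def priority(key):
    # Hypothetical policy: use a tuple key's partition index as its priority.
    if isinstance(key, tuple) and len(key) == 2 and isinstance(key[1], int):
        return key[1]
    return 0  # non-indexed keys get a neutral priority

assert priority(("read-parquet", 15)) == 15
assert priority("some-scalar-key") == 0
```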

Member Author


Evaluation on non-stringified keys makes sense to me. For example we might want to prioritize based on the index number ("read-parquet", 15) -> 15

I think from a user perspective, this definitely makes sense. It's all just a bit confusing when dealing with internals.

@mrocklin
Member

It would be nice if we can publish a documentation page or blog post with info on this change and how users can update their deployments to comply with the new harder restrictions. Then either catch exceptions related to this and raise a new error with a link to the docs, or include example tracebacks in the docs page so that folks will find it via googling the error.

No objections. Is that something that the RAPIDS team can help with?

@jacobtomlinson
Member

Is that something that the RAPIDS team can help with?

Sure, happy to help. We'll definitely need input from @fjetter though.

@fjetter fjetter force-pushed the drop_distributed_pack branch from 6682361 to 12f1db5 on March 24, 2023
@fjetter
Member Author

fjetter commented Mar 24, 2023

From what I can tell, I found and fixed the last regression. Currently running another set of benchmarks to confirm. If they don't flag anything suspicious, I will move forward with merging this Monday morning unless there are any objections until then.

@rjzamora
Member

I will move forward with merging this Monday morning unless there are any objections until then.

I think we are comfortable with this on the RAPIDS side as long as today's Dask release goes smoothly. We will want to pin the RAPIDS-23.04 release to dask-2023.3.2 to avoid a possible scramble if the pickle move causes any problems.

@fjetter
Member Author

fjetter commented Mar 27, 2023

Benchmarks came back and look good. For a couple of workflows we even get a speedup due to reduced overhead.

Ok, benchmark results are finally in as well. This time without any regressions.

Wall time (right side is the null hypothesis, i.e. main vs. main to measure noise; left side is this PR):

What we can see is that there is not much to see. This change is not intended to change scheduling behavior or speed anything up. These benchmarks mostly confirm that we're dispatching the proper computations.

There is one sizable performance improvement in the test case test_trivial_workload_should_not_cause_work_stealing which is indeed connected to the refactoring. This test case generates a couple of thousand delayed objects and computes them embarrassingly parallel. This refactoring is actually shaving off a couple of seconds in serialization time, which is, relatively speaking, a big change for this workflow (from 12.5s down to 8s, i.e. ~36%). This also translates to other almost embarrassingly parallel graphs, e.g. test_set_index[1-p2p-False] is about ~10-15s faster. Nice, but relatively speaking not as exciting.


Memory comparisons do not show any differences beyond noise.

Benchmark results are available at dask/distributed#7564 (comment)

@fjetter
Member Author

fjetter commented Mar 27, 2023

Alright, all tests passed and I reverted the environment files. I think we're good to go.

@fjetter fjetter merged commit ec3ffed into dask:main Mar 27, 2023
@fjetter fjetter deleted the drop_distributed_pack branch March 27, 2023 15:50
@mrocklin
Member

mrocklin commented Mar 27, 2023 via email

@rjzamora
Member

Thanks for tackling this @fjetter!

@fjetter fjetter mentioned this pull request Mar 31, 2023