Update rearrange_by_column patch for explicit comms#992
Conversation
|
@madsbk - Do you have any intuition on these CI failures |
wence-
left a comment
There was a problem hiding this comment.
More for discussion than definite change...
| shuffle_arg = kwargs.pop("shuffle", None) or dask.config.get("shuffle", "disk") | ||
| if shuffle_arg == "explicit-comms" or ( | ||
| dask.config.get("explicit-comms", False) and shuffle_arg == "tasks" | ||
| ): |
There was a problem hiding this comment.
Is this the minimally-surprising behaviour? My feeling is that the narrower specification (shuffle="tasks" as a keyword argument) should take precedence over the configuration variable.
Before this PR, there was one way to enable an explicit-comms shuffle: set dask.config["explicit-comms"] to True, which changed the meaning of shuffle="tasks" to be shuffle="explicit-comms".
Now that we are hooking in one level higher up, I think we can (unless we want to maintain backwards-compat) switch to a setup where the only way to request the explicit-comms shuffle is via dask.config["shuffle"] = "explicit-comms" or shuffle="explicit-comms", and remove the dask.config["explicit-comms"] option (unless we anticipate needing it for other things as well).
Concretely, I'm suggesting:
use_explicit_comms = kwargs.get("shuffle", dask.config.get("shuffle", None)) == "explicit-comms"
if use_explicit_comms:
try:
import distributed.worker
distributed.worker.get_client()
except (ImportError, ValueError):
warning("Requested explicit-comms shuffle, but no distributed client available, using task-based shuffle")
# slight action-at-a-distance here because this affects the call to `func` below in the (implicit) "not use_explicit_comms" branch.
kwargs["shuffle"] = "tasks"
else:
... # do the explicit-comms thing
return func(*args, **kwargs)
There was a problem hiding this comment.
Now that we are hooking in one level higher up, I think we can (unless we want to maintain backwards-compat) switch to a setup where...
Yes. I agree that we should move away from the "explicit-comms" config, but I was indeed trying to avoid a "breaking" change. With that said, I have no evidence that anyone is actually using the "explicit-comms" config in the wild. So, perhaps I was being too conservative.
Even if people are using the "explicit-comms" config, we should probably add a deprecation warning here, and tell them to use the "shuffle" option and config instead. What do you think about allowing the "explicit-comms"/"tasks" for now, but with a FutureWarning?
There was a problem hiding this comment.
Yes, let's do that (and try and get it in for 22.10 so we can immediately remove :P)
…y_column-patch-update
Codecov ReportBase: 0.00% // Head: 0.00% // No change to project coverage 👍
Additional details and impacted files@@ Coverage Diff @@
## branch-22.10 #992 +/- ##
============================================
Coverage 0.00% 0.00%
============================================
Files 23 23
Lines 3102 3107 +5
============================================
- Misses 3102 3107 +5
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
|
@gpucibot merge |
Reverts #992, which had led to unexpected issues. See rapidsai/cudf#11800 (review) cc @wence- Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #1001
Due to some unfortunate issues with #11576 and rapidsai/dask-cuda#992, I feel that these PRs should be reverted before the 22.10 release. This PRs roll back some recent changes that allow users to explicitly pass `shuffle="explicit-comms"` to certain shuffle-based algorithms. cc @wence- Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Lawrence Mitchell (https://github.com/wence-) URL: #11803
Motivated by discussions in rapidsai/cudf#11576 (primarilty here). This PR updates the
rearrange_by_columns_taskswrapper to targetrearrange_by_columnsinstead. The updated wrapper also handles the case thatshuffle="explicit-comms".