Open
Labels: highlevelgraph, needs attention
Description
Just a TODO: `SimpleShuffleLayer` and `BroadcastJoinLayer` both call `dumps_task` in their `__dask_distributed_unpack__` methods. This leads to pickling on the scheduler, which we want to avoid.
Additionally, could the manual use of `pickle.dumps` in `BroadcastJoinLayer.__dask_distributed_pack__` lead to a double-pickle scenario, in which the kwarg values are still bytestrings rather than the actual unpickled values when the task gets executed?
Lines 663 to 671 in b3a8646
    def __dask_distributed_pack__(self, *args, **kwargs):
        import pickle
        # Pickle complex merge_kwargs elements. Also
        # tuples, which may be confused with keys.
        _merge_kwargs = {}
        for k, v in self.merge_kwargs.items():
            if not isinstance(v, (str, list, bool)):
                _merge_kwargs[k] = pickle.dumps(v)
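
To make the double-pickle concern concrete, here is a minimal sketch that uses only the standard library and no dask code. The helpers `pack_merge_kwargs`, `transport`, and `unpack_merge_kwargs` are hypothetical stand-ins, not dask APIs. The `else` branch in the packing helper is an assumption, since the quoted snippet is cut off before it. The sketch shows that a manually pickled value stays a bytestring after the outer serialization round trip unless something explicitly unpickles it again.

```python
import pickle


def pack_merge_kwargs(merge_kwargs):
    """Hypothetical stand-in for the packing loop quoted above."""
    packed = {}
    for k, v in merge_kwargs.items():
        if not isinstance(v, (str, list, bool)):
            # Manual (inner) pickle, as in the quoted snippet.
            packed[k] = pickle.dumps(v)
        else:
            # Assumption: simple values are passed through unchanged.
            packed[k] = v
    return packed


def transport(obj):
    """Stand-in for the outer serialization round trip between processes."""
    return pickle.loads(pickle.dumps(obj))


def unpack_merge_kwargs(packed):
    """Hypothetical inverse of pack_merge_kwargs: undo the inner pickle."""
    return {
        k: pickle.loads(v) if isinstance(v, bytes) else v
        for k, v in packed.items()
    }


merge_kwargs = {"how": "inner", "suffixes": ("_x", "_y"), "left_index": False}

# After the outer round trip, the tuple is still a bytestring.
received = transport(pack_merge_kwargs(merge_kwargs))
print(type(received["suffixes"]).__name__)

# Only an explicit second unpickle restores the original value.
restored = unpack_merge_kwargs(received)
print(restored["suffixes"])
```

If nothing on the unpack side performs that second `pickle.loads`, the merge function would receive `bytes` for `suffixes`. Note that the `isinstance(v, bytes)` check in this sketch would also wrongly unpickle a kwarg that was legitimately a bytestring, so a real fix would need an unambiguous marker.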
Related to dask/distributed#4699.