I noticed that some join methods have things like
    dsk = {
        (name, i): (apply, merge_chunk, [left_key, right_key], kwargs)
        for i, right_key in enumerate(right.__dask_keys__())
    }

where we're generating a low-level graph that could just be done with map_partitions. Using map_partitions in these scenarios would both speed up graph transmission and allow for blockwise fusion across the operations. Refactoring these simple sorts of graphs should be straightforward.
- single_partition_join
- hash_join's merge_chunk
- stack_partitions should use HighLevelGraph.from_collections instead of merging all of the input graphs
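The last item can be sketched as follows. This is only an assumption-laden illustration: it uses Dask arrays as a stand-in, because building a DataFrame from a raw graph relies on internal helpers that vary between versions, and the layer name `stack-...` is made up. The point is that the new layer holds only the aliasing tasks, while HighLevelGraph.from_collections records the inputs as dependencies instead of copying their graphs into one flat dictionary.

```python
import dask.array as da
from dask.base import tokenize
from dask.highlevelgraph import HighLevelGraph

# Two hypothetical inputs, each split into two blocks.
a = da.arange(0, 4, chunks=2)
b = da.arange(4, 8, chunks=2)

# Hypothetical layer name; any unique string would do.
name = "stack-" + tokenize(a, b)

# The new layer only points each output block at an existing input block.
layer = {}
i = 0
for arr in (a, b):
    for key in arr.__dask_keys__():
        layer[(name, i)] = key
        i += 1

# Keep the inputs as separate layers rather than merging their graphs.
graph = HighLevelGraph.from_collections(name, layer, dependencies=[a, b])

chunks = (a.chunks[0] + b.chunks[0],)
stacked = da.Array(graph, name, chunks, dtype=a.dtype)
values = stacked.compute()
```

Keeping the input layers intact is what lets later high-level optimizations see the structure of the upstream operations.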