Description
Using a partition_info kwarg in your map_partitions function forces the entire blockwise graph to be materialized. Materializing the graph is slow in itself, slows down scheduling, and eliminates optimization opportunities.
The way the injection of partition_info is currently implemented is a bit convoluted, involving manipulation of individual tasks within the materialized graph.
Since we now have the BlockwiseDep and BlockwiseDepDict interface for just this sort of purpose (injecting auxiliary information into a blockwise graph; concept described in #7513), this could be done more simply. The main challenge is that partition_info needs to be passed as a kwarg, and we don't currently have the infrastructure in map_partitions to pass blockwise-y things as kwargs (xref #8308), so it may require an odd dance with a wrapper function.
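
To make the "odd dance with a wrapper function" concrete, here is a rough sketch in plain Python, with no dask internals involved. The names `with_partition_info` and `info_by_block` are hypothetical, and a plain dict stands in for what a BlockwiseDepDict might supply per block. The idea is only that the per-block info arrives as a positional argument (which blockwise can already handle) and the wrapper forwards it to the user function as the `partition_info` kwarg.

```python
def with_partition_info(func):
    """Hypothetical helper: accept partition_info positionally and
    forward it to ``func`` as a keyword argument."""
    def wrapper(partition_info, *args, **kwargs):
        return func(*args, partition_info=partition_info, **kwargs)
    return wrapper


def user_func(part, offset=0, partition_info=None):
    # A user function of the kind passed to map_partitions.
    return [x + offset for x in part], partition_info["number"]


# Toy stand-ins for partitions and their divisions (not real dask objects).
partitions = [[1, 2], [3, 4], [5]]
divisions = [0, 2, 4]

# Stand-in for a per-block mapping; a BlockwiseDepDict would play this role.
info_by_block = {
    i: {"number": i, "division": divisions[i]} for i in range(len(partitions))
}

wrapped = with_partition_info(user_func)
results = [
    wrapped(info_by_block[i], part, offset=10)
    for i, part in enumerate(partitions)
]
```

In this sketch the user function never sees the positional argument, so its signature stays unchanged. Whether this maps cleanly onto the actual map_partitions code path is an open question.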