-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
Consider rechunking from chunks=(1, 100) to chunks=(100, 1) on an array of shape (100, 100).
A direct rechunk operation creates ~10,000 tasks corresponding to intermediate arrays of shape (1,1).
However, if we rechunk first to shape (10,10) (the same array size as all of our chunks) and then to our final chunk shape, there would be only ~2000 tasks, because the smallest intermediate arrays have 10 elements.
I don't know what the general rules are for how to optimize rechunkings but this makes a big difference in computational efficiency -- O(n^2) vs O(n^1.5) for square arrays with a side of length n. For n=10,000, this is a ~50x speed up (assuming that the task overhead dominates for small tasks).
Credit to @freeman-lab for this idea, which apparently works really nicely in Thunder. Jeremy, if you could share how you calculate intermediate chunks sizes that would be helpful!