Description
What happened:
Hello! After the 2.28.0 release, the napari project is seeing some dask-related performance issues, including timeout failures in some tests that now take significantly longer. Changing the dask config did not restore the old behavior, as I would have expected based on the docs.
A more detailed investigation can be found in the napari issue tracker at napari/napari#1656, but I will reproduce a minimal example below.
What you expected to happen:
I didn't expect such large slowdowns with the new release.
Minimal Complete Verifiable Example:
import dask.array as da
import numpy as np
import time
data = da.random.random(
size=(100_000, 1000, 1000), chunks=(1, 1000, 1000)
)
idxs = [(0,), (50_000,), (99_999,)]
t0 = time.time()
reduced_data = np.min([np.min(data[idx]) for idx in idxs])
t1 = time.time()
print(t1 - t0)
On 2.27.0 this takes about 0.13 seconds to run, on 2.28.0 this takes 4.3 seconds to run.
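To narrow down where the extra time goes, the slicing step (which only builds a task graph) can be timed separately from the computation. The following is a rough sketch, not part of the original report: the helper name `time_slice_and_compute` is hypothetical, and the array is deliberately much smaller than the one above so that it runs quickly.

```python
import time

import dask.array as da


def time_slice_and_compute(n_chunks=1_000, idx=0):
    """Return (slice_seconds, compute_seconds) for one index along axis 0.

    Hypothetical helper for illustration only. The array is smaller than
    the one in the report, so absolute numbers are not comparable.
    """
    data = da.random.random(size=(n_chunks, 100, 100), chunks=(1, 100, 100))

    t0 = time.perf_counter()
    sliced = data[idx]  # lazy: this only constructs a new task graph
    t1 = time.perf_counter()
    sliced.min().compute()  # this actually runs the tasks
    t2 = time.perf_counter()

    return t1 - t0, t2 - t1


slice_seconds, compute_seconds = time_slice_and_compute()
print(f"slicing: {slice_seconds:.4f}s, compute: {compute_seconds:.4f}s")
```

Running this under both dask versions would show whether the regression sits in graph construction during slicing or in the computation itself.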
Anything else we need to know?:
Looking at the 2.28.0 release notes, I saw #6665 and think this may be related to that. I noticed the note on efficiency, which suggested there may now be some additional overhead, but I didn't expect the slowdown to be so large (note that while the above is just a toy example, the slowdown is quite noticeable for real-world examples too).
I tried using
import dask
with dask.config.set({"array.slicing.split-large-chunks": False}):
    data = da.random.random(
        size=(100_000, 1000, 1000), chunks=(1, 1000, 1000)
    )
    idxs = [(0,), (50_000,), (99_999,)]
    t0 = time.time()
    reduced_data = np.min([np.min(data[idx]) for idx in idxs])
    t1 = time.time()
    print(t1 - t0)
but I saw no difference between setting it to True or False, either in this toy example or in the real config inside napari when I added it to napari here. If I could get the config working as expected, I could just change the default value and we would be fine.
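As a sanity check, it is possible to confirm that the setting is actually visible to dask inside the context manager. This is a minimal sketch using `dask.config.get`; the names `KEY` and `effective_setting` are hypothetical. Note that it only shows the value is set, not whether the slicing code consults it on this code path.

```python
import dask

KEY = "array.slicing.split-large-chunks"


def effective_setting():
    """Return the current value of the setting, or None if it is unset."""
    return dask.config.get(KEY, default=None)


before = effective_setting()
with dask.config.set({KEY: False}):
    inside = effective_setting()  # expected to be False within the context
after = effective_setting()  # reverts to the earlier value on exit

print("before:", before, "inside:", inside, "after:", after)
```

If `inside` reports `False` and the timing is still unchanged, the config is being applied but is not what controls the slowdown.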
Environment:
- Dask version: 2.28.0
- Python version: 3.7
- Operating System: MacOS
- Install method (conda, pip, source): pip