PerformanceWarning in da.to_zarr is misleading and stops (correct?) execution #12160
Closed
Labels
needs triage (Needs a response from a contributor)
Description
Describe the issue:
Code execution stops in the example below with the following error:
PerformanceWarning: The input Dask array will be rechunked along axis 0 with chunk size 8, but a chunk size divisible by 5 is required for Dask to write safely to the Zarr array <Array memory://4933780544 shape=(8,) dtype=float64>. To avoid risk of data loss when writing to this Zarr array, set the "array.chunk-size" configuration parameter to at least the size in bytes of a single on-disk chunk (or shard) of the Zarr array, which in this case is 40 bytes. E.g., dask.config.set({"array.chunk-size": 40})
Problems:
- What should probably be a warning is raised as an exception and stops execution:
  Line 3911 in cd5e429:
      raise PerformanceWarning(msg)
- In the example shown, I think the warning doesn't make sense, since region= is used and in this case the dask and zarr chunks align perfectly.
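To illustrate the first point, here is a minimal sketch of the difference between raising a warning class and emitting it through the warnings module. The PerformanceWarning class below is a local stand-in defined only for this illustration, and the function names are hypothetical; this is not Dask's code.

```python
import warnings


class PerformanceWarning(Warning):
    """Local stand-in for illustration only (not imported from Dask)."""


def check_by_raising(msg):
    # Raising a Warning subclass behaves like any other exception:
    # the caller's remaining statements never run unless it is caught.
    raise PerformanceWarning(msg)


def check_by_warning(msg):
    # warnings.warn reports the problem and then returns normally,
    # unless the user has installed a filter that escalates it to an error.
    warnings.warn(msg, PerformanceWarning, stacklevel=2)
    return "continued"


def run_raising():
    try:
        check_by_raising("chunks misaligned")
    except PerformanceWarning:
        return "stopped"
    return "continued"


def run_warning():
    # Record emitted warnings so they can be inspected afterwards.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = check_by_warning("chunks misaligned")
    return result, [w.category for w in caught]
```

With the raise form, users can only continue by wrapping the call in try/except; with the warn form, they can silence or escalate it through the standard warnings filters.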
Minimal Complete Verifiable Example:
import dask.array as da
import zarr

output_zarr_array = zarr.create(
    shape=(8,),
    chunks=(5,),
    overwrite=True,  # Allows overwriting if the path exists
)
x = da.ones((5,), chunks=(5,))
da.to_zarr(
    x,
    output_zarr_array,
    region=tuple([slice(0, 5)]),
)
# PerformanceWarning is raised here as an exception

Anything else we need to know?:
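The alignment claim can be checked by hand. The sketch below is a simplified, one-dimensional check and an assumption about what "safe" means here: no Zarr chunk is written by more than one Dask task, which holds when every boundary between neighbouring Dask chunks, shifted by the region start, falls on a multiple of the Zarr chunk size. The function name is hypothetical and this is not the check Dask actually performs.

```python
def internal_boundaries_aligned(region_start, dask_chunks, zarr_chunk):
    """Return True if every boundary between neighbouring Dask chunks,
    expressed in Zarr array coordinates, is a multiple of zarr_chunk."""
    position = region_start
    # Only boundaries *between* Dask chunks matter, so skip the last chunk.
    for size in dask_chunks[:-1]:
        position += size
        if position % zarr_chunk != 0:
            # Two Dask tasks would write into the same Zarr chunk.
            return False
    return True


# The case from this issue: one Dask chunk of 5 written to region [0, 5)
# of a Zarr array with chunk size 5. There are no internal boundaries.
issue_case = internal_boundaries_aligned(0, (5,), 5)

# A genuinely misaligned case: Dask chunks (4, 4) against Zarr chunk size 5.
# The boundary at index 4 splits the first Zarr chunk between two tasks.
misaligned_case = internal_boundaries_aligned(0, (4, 4), 5)
```

Under this assumption the example in this issue is aligned, since a single Dask chunk covers exactly one Zarr chunk.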
Environment:
- Dask version: 2025.11.0
- Python version: 3.12
- Operating System: macOS
- Install method (conda, pip, source): pip