Fix execution stop in da.to_zarr due to (misleading) PerformanceWarning raised as exception #12161

Merged
dcherian merged 6 commits into dask:main from m-albert:performance_warning_fix
Nov 18, 2025

@m-albert
Contributor

On main, the following code leads to an exception:

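import dask.array as da
import zarr

# Target Zarr array: 8 elements stored in on-disk chunks of 5.
output_zarr_array = zarr.create(
    shape=(8,),
    chunks=(5,),
    overwrite=True,
)

# A single dask chunk of 5 elements, matching the first Zarr chunk exactly.
x = da.ones((5,), chunks=(5,))

# Write only into the first Zarr chunk via region=.
da.to_zarr(
    x,
    output_zarr_array,
    region=(slice(0, 5),),
)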

PerformanceWarning: The input Dask array will be rechunked along axis 0 with chunk size 8, but a chunk size divisible by 5 is required for Dask to write safely to the Zarr array <Array memory://4933780544 shape=(8,) dtype=float64>. To avoid risk of data loss when writing to this Zarr array, set the "array.chunk-size" configuration parameter to at least the size in bytes of a single on-disk chunk (or shard) of the Zarr array, which in this case is 40 bytes. E.g., dask.config.set({"array.chunk-size": 40})
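
To illustrate why the message is misleading for this example, here is a minimal sketch of a region-aware alignment check along one axis. The helper name interior_edges_aligned and its parameters are hypothetical and are not dask's implementation. It assumes a contiguous region with step 1 and treats a write as safe when every boundary between two consecutive dask chunks falls on a Zarr chunk boundary, so that no two dask tasks write to the same Zarr chunk.

```python
from itertools import accumulate


def interior_edges_aligned(dask_chunks, zarr_chunk, region_start=0):
    """Hypothetical one-axis check, not dask's actual logic.

    Returns True when every boundary between two consecutive dask chunks,
    shifted by the start of the write region, lies on a Zarr chunk boundary.
    """
    # Absolute end position of each dask chunk inside the Zarr array.
    ends = [region_start + e for e in accumulate(dask_chunks)]
    # The last end is the end of the region, not a boundary between two tasks.
    interior = ends[:-1]
    return all(edge % zarr_chunk == 0 for edge in interior)


# The case from this PR: one dask chunk of 5 written to region [0, 5).
single_chunk = interior_edges_aligned((5,), zarr_chunk=5, region_start=0)

# Two dask chunks (3, 2) would both touch the first Zarr chunk.
split_chunk = interior_edges_aligned((3, 2), zarr_chunk=5, region_start=0)

print(single_chunk, split_chunk)
```

Under these assumptions the reproducer above is a safe write, since a single dask chunk has no interior boundaries at all.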

This PR

  1. Prevents the execution stop by replacing raise with warnings.warn.
  2. Takes region= into account when issuing the performance warning.
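
The difference behind point 1 can be sketched with the standard library alone. The PerformanceWarning class below is a local stand-in, and write_raising and write_warning are hypothetical functions used only for illustration; they are not the code changed in this PR.

```python
import warnings


class PerformanceWarning(Warning):
    """Local stand-in for a performance warning class (assumption for this sketch)."""


def write_raising(data):
    # Raising a Warning subclass behaves like any other exception:
    # the write below is never reached.
    raise PerformanceWarning("chunks are not aligned")
    return list(data)


def write_warning(data):
    # warnings.warn reports the problem and lets execution continue.
    warnings.warn("chunks are not aligned", PerformanceWarning, stacklevel=2)
    return list(data)


stopped = False
try:
    write_raising([1, 2, 3])
except PerformanceWarning:
    stopped = True

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = write_warning([1, 2, 3])

print(stopped, result, len(caught))
```

With warnings.warn, callers who still want a hard stop can opt in through warnings.simplefilter("error", PerformanceWarning).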

@m-albert m-albert changed the title Performance warning fix Fix execution stop in da.to_zarr due to (misleading) PerformanceWarning raised as exception Nov 17, 2025
@dcherian dcherian requested a review from d-v-b November 17, 2025 22:33
@d-v-b
Member

d-v-b commented Nov 17, 2025

oops, sorry about this, and thanks for fixing it!

@m-albert
Contributor Author

Thanks for looking at this @d-v-b !

Realised I needed to adapt the tests a bit.

@github-actions
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      9 files  ±0        9 suites  ±0   3h 10m 10s ⏱️ - 6m 23s
 18 154 tests ±0   16 938 ✅  - 1   1 215 💤 ±0  1 ❌ +1 
162 614 runs  ±0  150 521 ✅  - 1  12 092 💤 ±0  1 ❌ +1 

For more details on these failures, see this check.

Results for commit 7a4dcd3. ± Comparison against base commit cd5e429.

@m-albert
Contributor Author

After some follow-up commits, this is ready from my side. One of the tests is failing, but the failure seems unrelated.

@dcherian
Collaborator

Failure appears to be unrelated: dask/dataframe/dask_expr/tests/test_collection.py::test_serialization - assert 2742 < (350 + 2316)

Thanks @m-albert !

@dcherian dcherian merged commit 9b42144 into dask:main Nov 18, 2025
22 of 24 checks passed
@m-albert
Contributor Author

Thanks @dcherian!

Development

Successfully merging this pull request may close these issues.

PerformanceWarning in da.to_zarr is misleading and stops (correct?) execution

3 participants