Disable locking in `to_zarr` (needed for using `to_zarr` in a distributed context) #3607
mrocklin merged 2 commits into dask:master
Conversation
Appears that locking was enabled in `to_zarr` using the default `threading.Lock` object. This doesn't work well in a Distributed context. Given we already ensure the Dask Array has proper chunking and that Zarr will match it, there is no need for any lock, so simply disable locking completely.
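A minimal stdlib-only sketch (not Dask code) of why the default lock is a problem for Distributed: the `threading.Lock` object embedded in the task graph cannot be pickled, so the graph fails to serialize when shipped to workers.

```python
import pickle
import threading

# A threading.Lock cannot be pickled; any object graph carrying one
# (such as a task graph destined for Distributed workers) fails to serialize.
try:
    pickle.dumps(threading.Lock())
    picklable = True
except TypeError:
    picklable = False

print(picklable)  # False
```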
Good catch.

Thanks. If you have time to review, that would be appreciated.

+1
dask/array/tests/test_array_core.py
Outdated
    a.to_zarr(d)
    a2 = da.from_zarr(d)
    assert_eq(a, a2)
    assert a2.chunks == a.chunks
This should probably live in tests/test_distributed.py
Also, you should copy the models already in that file, or else read through http://distributed.readthedocs.io/en/latest/develop.html#writing-tests
When you start creating thousands of dask clusters and tornado IOLoops in a process weird things happen if you aren't careful. The testing infrastructure in that doc page includes some well-trusted mechanisms.
FWIW I found this usage pattern in that file. I'm not attached to the usage pattern or to which file within Dask this lives in. Just noting that the usage pattern already exists in the code base.
To be clear, I'm referring to creating a bare Client. Is this also what you're referring to? If so can you point me to where this is happening? I'm not seeing this happen within the test_distributed.py file. We always use the cluster context manager and the loop pytest fixture.
Sorry, I was incorrect about the file. Found it here.
ref: https://github.com/dask/dask/blob/0.17.5/dask/bytes/tests/test_hdfs.py#L300-L312
In any event, I have moved this to the Distributed test and used the `with cluster()` syntax. Please let me know your thoughts on it.
Ah, ok, thanks for hunting that down.
Anyway, I strongly suggest that we stick to the pattern in the dask.distributed documentation, or in the test_distributed.py file
Sure, I agree. I only used it because it was the first thing I found and I did not know the preferred approach. :)
Added a PR ( #3609 ) to try and refresh that test.
Make sure that it is possible to store to an on-disk Zarr Array in a Distributed context.
force-pushed 540acb4 to e3b2df5
    a.to_zarr(d)
    a2 = da.from_zarr(d)
    assert_eq(a, a2)
    assert a2.chunks == a.chunks
Thanks for checking :)
+1 from @martindurant. Merging

Thanks all :)
Appears that `to_zarr` was defaulting to `store`'s behavior of using a `threading.Lock` object for each chunk written. As the `threading.Lock` object is not pickleable, this fails when using Distributed. Since we already guarantee the chunks are aligned, just disable any kind of locking in `to_zarr`. Includes a test using a small Distributed cluster that fails without this change and passes with it.

- `flake8 dask`
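Why no lock is needed once chunks are aligned can be illustrated with a stdlib-only sketch (this is not Dask/Zarr code, just an analogy): when every writer owns a disjoint, pre-aligned region of the output, concurrent writes never overlap, so no synchronization is required.

```python
import threading

data = list(range(16))
out = [None] * 16
# Aligned, non-overlapping chunk boundaries: each writer owns its own region.
chunks = [(0, 8), (8, 16)]

def write_chunk(start, stop):
    # Touches only this chunk's region; no other thread writes here,
    # so no lock is needed.
    out[start:stop] = data[start:stop]

threads = [threading.Thread(target=write_chunk, args=c) for c in chunks]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(out == data)  # True
```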