Consider supporting pathlib.Path within Dask Arrays .to_zarr(...) method #8006

@asmith26

I'm not sure whether this should be raised in the zarr repo instead, and I've looked through previous issues without finding anything about it. It feels like the Dask Array .to_zarr(...) method could support pathlib.Path directly, i.e. without having to convert it to a str first:

Dask Version: 2021.7.2

>>> import pathlib, dask.array as da, numpy as np
>>> da.from_array(np.array([1])).to_zarr("file.zarr")  # works
>>> da.from_array(np.array([1])).to_zarr(pathlib.Path("file.zarr"))  # yields:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File ".../lib/python3.8/site-packages/dask/array/core.py", line 2649, in to_zarr
    return to_zarr(self, *args, **kwargs)
  File ".../lib/python3.8/site-packages/dask/array/core.py", line 3381, in to_zarr
    z = zarr.create(
  File ".../lib/python3.8/site-packages/zarr/creation.py", line 136, in create
    init_array(store, shape=shape, chunks=chunks, dtype=dtype, compressor=compressor,
  File ".../lib/python3.8/site-packages/zarr/storage.py", line 352, in init_array
    _init_array_metadata(store, shape=shape, chunks=chunks, dtype=dtype,
  File ".../lib/python3.8/site-packages/zarr/storage.py", line 382, in _init_array_metadata
    elif contains_array(store, path):
  File ".../lib/python3.8/site-packages/zarr/storage.py", line 96, in contains_array
    return key in store
TypeError: argument of type 'PosixPath' is not iterable
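
The traceback suggests the PosixPath object reaches zarr unchanged and is then treated as a store, which is why `key in store` fails. As a possible workaround until this is supported, the path can be converted to a string before it is passed in. The sketch below uses only the standard library; `as_store_url` is a hypothetical helper name, not part of Dask or zarr.

```python
import os
import pathlib


def as_store_url(url):
    """Return `url` as a plain str if it is path-like, otherwise unchanged.

    Hypothetical helper for illustration; not part of Dask or zarr.
    """
    if isinstance(url, os.PathLike):
        # os.fspath() returns the file system representation of the path
        return os.fspath(url)
    # Strings and store/mapping objects pass through untouched
    return url


target = as_store_url(pathlib.Path("file.zarr"))
```

Since the string form already works, a call such as `da.from_array(np.array([1])).to_zarr(as_store_url(pathlib.Path("file.zarr")))` should behave like the first example above. Checking for `os.PathLike` rather than `pathlib.Path` specifically means other path-like objects would be handled the same way.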

In particular I've noticed that Dask DataFrame .to_parquet(...) supports this:

>>> import pathlib, dask.dataframe as dd, pandas as pd
>>> dd.from_pandas(pd.DataFrame({"a": [1]}), npartitions=1).to_parquet(pathlib.Path("file.parquet"))  # works

Many thanks for these awesome libs!
