rasterio backend should use DataStorePickleMixin (or something similar) #2121

@jhamman

Code Sample, a copy-pastable example if possible

In [1]: import xarray as xr

In [2]: ds = xr.open_rasterio('RGB.byte.tif')

In [3]: ds
Out[3]:
<xarray.DataArray (band: 3, y: 718, x: 791)>
[1703814 values with dtype=uint8]
Coordinates:
  * band     (band) int64 1 2 3
  * y        (y) float64 2.827e+06 2.826e+06 2.826e+06 2.826e+06 2.826e+06 ...
  * x        (x) float64 1.021e+05 1.024e+05 1.027e+05 1.03e+05 1.033e+05 ...
Attributes:
    transform:   (101985.0, 300.0379266750948, 0.0, 2826915.0, 0.0, -300.0417...
    crs:         +init=epsg:32618
    res:         (300.0379266750948, 300.041782729805)
    is_tiled:    0
    nodatavals:  (0.0, 0.0, 0.0)

In [4]: import pickle

In [5]: pickle.dumps(ds)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-a165c2473431> in <module>()
----> 1 pickle.dumps(ds)

TypeError: can't pickle rasterio._io.RasterReader objects

Problem description

As originally reported by @rsignell-usgs in pangeo-data/pangeo#249 (comment), objects opened with the rasterio backend cannot be pickled. This breaks dask-distributed workflows, because distributed has to serialize objects to send them to workers. We probably need to use DataStorePickleMixin or something similar in the rasterio backend, so that the underlying file can be reopened after unpickling and multiple readers can share the same dataset.
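For illustration, here is a minimal sketch of the general pattern such a mixin could follow: drop the live handle when pickling and reopen it from the stored path when unpickling. This is not xarray's actual implementation. `ReopeningStore` and `roundtrip_demo` are hypothetical names, and a plain Python file object (which also cannot be pickled) stands in for the rasterio reader.

```python
import os
import pickle
import tempfile


class ReopeningStore:
    """Hypothetical stand-in for a data store that holds an unpicklable handle."""

    def __init__(self, path, mode="rb"):
        self._path = path
        self._mode = mode
        # Assumption: a plain file object plays the role of rasterio.open(path).
        self._handle = open(path, mode)

    def read(self):
        self._handle.seek(0)
        return self._handle.read()

    def close(self):
        self._handle.close()

    def __getstate__(self):
        # Drop the live handle; keep only what is needed to reopen it.
        state = self.__dict__.copy()
        del state["_handle"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then open a fresh handle.
        self.__dict__.update(state)
        self._handle = open(self._path, self._mode)


def roundtrip_demo():
    """Pickle a store, unpickle it, and read from both copies."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"raster bytes")
    try:
        store = ReopeningStore(tmp.name)
        clone = pickle.loads(pickle.dumps(store))
        result = (store.read(), clone.read())
        store.close()
        clone.close()
        return result
    finally:
        os.remove(tmp.name)
```

Each unpickled copy opens its own handle, which is what lets several workers read the same file independently. A real fix would also need to preserve the open arguments (driver, mode, and so on) used by the backend.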

Expected Output

pickle.dumps(ds) returns a pickled dataset instead of raising a TypeError.

Output of xr.show_versions()

/Users/jhamman/anaconda/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.3
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.1
netCDF4: 1.3.1
h5netcdf: 0.5.1
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.2
distributed: 1.21.6
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.0.1
pip: 9.0.3
conda: 4.5.1
pytest: 3.5.1
IPython: 6.3.1
sphinx: 1.7.4
