-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Closed
Description
It would be convenient to load constituent dask.arrays into memory as dask.arrays rather than as numpy arrays. This would help with distributed computations where we want to load a large amount of data into distributed memory once and then iterate on the full xarray dataset repeatedly without reloading from disk every time.
We can probably solve this from either side:
- XArray could make a
.persistmethod that replaced all of its dask.arrays with a persisted version of that array
import dask
dset.x, dset.y, dset.z = dask.persist(dset.x, dset.y, dset.z)- We could look into the Dask duck type solution again Expected interface for dask objects dask/dask#1068
Reactions are currently unavailable