Serialization of just coordinates #2347

@hmaarrfk

In the search for the perfect data storage mechanism, I find myself needing to store the metadata of some of the images I am generating separately from the image data. It is really useful for me to serialize just the coordinates of my DataArray.

My serialization method of choice is JSON, since it allows me to read the metadata with just a text editor. For that, having the coordinates as a self-contained dictionary is really important.

Currently, I convert just the coordinates to a dataset, and serialize that. The code looks something like this:

import xarray as xr
import numpy as np

# Setup an array with coordinates
n = np.zeros(3)
coords = {'x': np.arange(3)}
m = xr.DataArray(n, dims=['x'], coords=coords)

coords_dataset_dict = m.coords.to_dataset().to_dict()
coords_dict = coords_dataset_dict['coords']

# Read/Write dictionary to JSON file

# This works, but I'm essentially creating an empty dataset for it
coords_set = xr.Dataset.from_dict(coords_dataset_dict)
coords2 = coords_set.coords  # so many `coords` :D
m2 = xr.DataArray(np.zeros(shape=m.shape), dims=m.dims, coords=coords2)
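The "Read/Write dictionary to JSON file" step is left as a comment above. A minimal sketch of it, assuming the standard-library `json` module and a hypothetical file name `coords.json` written to a temporary directory (neither is specified in the snippet above):

```python
import json
import os
import tempfile

import numpy as np
import xarray as xr

# Same array as in the example above
m = xr.DataArray(np.zeros(3), dims=["x"], coords={"x": np.arange(3)})

# to_dict() turns the NumPy arrays into plain Python lists,
# so the result can be passed straight to json.dump
coords_dataset_dict = m.coords.to_dataset().to_dict()

# "coords.json" is a hypothetical name; the temporary directory
# only keeps this sketch self-contained
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "coords.json")
    with open(path, "w") as f:
        json.dump(coords_dataset_dict, f, indent=2)
    with open(path) as f:
        loaded = json.load(f)

# Rebuild the coordinates from the dictionary that was read back
coords2 = xr.Dataset.from_dict(loaded).coords
m2 = xr.DataArray(np.zeros(m.shape), dims=m.dims, coords=coords2)
```

Note that JSON has no tuple type, so the `dims` entries come back as lists; `Dataset.from_dict` accepts them in that form.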

Would encapsulating this functionality in the Coordinates class be accepted as a PR?

It would add 2 functions that would look like:

def to_dict(self):
    # Offload the heavy lifting to the Dataset class
    return self.to_dataset().to_dict()['coords']

@classmethod
def from_dict(cls, d):
    # Offload the heavy lifting again to the Dataset class.
    # 'data_vars' has to be present, otherwise Dataset.from_dict
    # treats every top-level key as a variable.
    d_dataset = {'coords': d, 'data_vars': {}, 'dims': {}, 'attrs': {}}
    return Dataset.from_dict(d_dataset).coords
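Until something like this lives on the Coordinates class, the same round trip can be tried out with two free functions. This is only a sketch: the names `coords_to_dict` and `coords_from_dict` are hypothetical and not part of xarray. It also assumes that `Dataset.from_dict` only handles the `coords` entry specially when a `data_vars` key is present, hence the empty `data_vars` below.

```python
import numpy as np
import xarray as xr


def coords_to_dict(coords):
    """Return only the 'coords' entry of the Dataset dictionary (hypothetical helper)."""
    return coords.to_dataset().to_dict()["coords"]


def coords_from_dict(d):
    """Rebuild coordinates from a dictionary made by coords_to_dict (hypothetical helper)."""
    # An empty "data_vars" entry is included so that Dataset.from_dict
    # reads `d` as coordinates rather than as top-level variables
    d_dataset = {"coords": d, "data_vars": {}, "dims": {}, "attrs": {}}
    return xr.Dataset.from_dict(d_dataset).coords


# Round trip: DataArray -> coordinate dictionary -> new DataArray
m = xr.DataArray(np.zeros(3), dims=["x"], coords={"x": np.arange(3)})
coords_dict = coords_to_dict(m.coords)
m2 = xr.DataArray(
    np.zeros(m.shape), dims=m.dims, coords=coords_from_dict(coords_dict)
)
```

Here `coords_dict` holds one entry per coordinate, each with its own `dims`, `attrs` and `data` keys, which is the self-contained dictionary the issue asks for.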
