New method `pad_missing` to support aggregation of DSGs

It is currently not possible to aggregate two discrete sampling geometry (DSG) features of different lengths into a single DSG with a larger feature axis. For example:

```python
>>> a
<CF Field: precipitation_flux(cf_role=timeseries_id(2), ncdim%timeseries(9)) kg m-2 day-1>
>>> b
<CF Field: precipitation_flux(cf_role=timeseries_id(3), ncdim%timeseries(5)) kg m-2 day-1>
>>> cf.aggregate([a,b])
[<CF Field: precipitation_flux(cf_role=timeseries_id(2), ncdim%timeseries(9)) kg m-2 day-1>,
 <CF Field: precipitation_flux(cf_role=timeseries_id(3), ncdim%timeseries(5)) kg m-2 day-1>]
```
This is something we might want to do, because we can store DSGs of different lengths in one CF-netCDF data variable using a [ragged array representation](https://cfconventions.org/cf-conventions/cf-conventions.html#representations-features).

However, if we could pad out the `ncdim%timeseries` axis of `b` with missing data, the two fields would aggregate. A new `pad_missing` method could do the padding:

```python
>>> # Pad out the 'ncdim%timeseries' axis with missing data:
>>> #   0 elements at the start of the axis and 4 elements at the end:
>>> b = b.pad_missing('ncdim%timeseries', (0, 4))
>>> c = cf.aggregate([a,b])  # Now this aggregates
>>> c
[<CF Field: precipitation_flux(cf_role=timeseries_id(5), ncdim%timeseries(9)) kg m-2 day-1>]
>>> # Compress the field
>>> c = c[0].compress('contiguous')
>>> # Write it to disk in a single CF-netCDF data variable *without* the extra padding 
>>> cf.write(c, 'dsg.nc')
```
NumPy and Dask each provide a `pad` function that supports many padding modes, but none that pads with missing data. Their API is also more general than we need here. It may therefore be better to call our method `pad_missing`, to distinguish it from the more general `pad`. A full cf-python `pad` could still be implemented in the future if the need arose.
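To illustrate the intended behaviour, here is a minimal sketch using NumPy masked arrays. It is not the cf-python implementation: the standalone function `pad_missing` and its `axis` and `pad_width` parameters are hypothetical stand-ins for the proposed method, and the sample data are made up.

```python
import numpy as np


def pad_missing(array, axis, pad_width):
    """Hypothetical helper: pad one axis of `array` with masked (missing) elements.

    `pad_width` is a (before, after) pair of non-negative element counts.
    """
    array = np.ma.asarray(array)
    before, after = pad_width
    if before < 0 or after < 0:
        raise ValueError("pad widths must be non-negative")

    # Only the chosen axis grows; all other axes keep their size
    shape = list(array.shape)
    n = shape[axis]
    shape[axis] = before + n + after

    # Start from a fully masked array, then copy the original data into place
    out = np.ma.masked_all(shape, dtype=array.dtype)
    index = [slice(None)] * array.ndim
    index[axis] = slice(before, before + n)
    out[tuple(index)] = array
    return out


# Three made-up timeseries of length 5, padded to length 9 at the end
b = np.arange(15.0).reshape(3, 5)
padded = pad_missing(b, axis=1, pad_width=(0, 4))
print(padded.shape)          # (3, 9)
print(padded.count(axis=1))  # unmasked elements per timeseries: [5 5 5]
```

The `(0, 4)` pair mirrors the `(0, 4)` argument in the example above: no elements are added at the start of the axis and four masked elements are added at the end.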

PR to follow.
