-
Notifications
You must be signed in to change notification settings - Fork 23
Description
Changes introduced in v3.16.2 in this commit as part of #760 has caused calls to the subset command to fail when the subset produces indices for (at least?) one cyclic axis in the form of a slice which doesn't satisfy the condition
step > 0 and start < 0 and stop > 0 (e.g. slice(-3, 0, 1)) [ CASE 1a ]
OR
step < 0 and start > 0 and stop < 0 (e.g. slice(0, 2, -1)) [ CASE 1b ]
or does satisfy the condition:
step > 0 and start < 0 and stop == -size (e.g. slice(-1,-3,1), size=3) [ CASE 2a ]
OR
step < 0 and stop < 0 and start == -size (e.g. slice(-3, -1, -1), size=3) [ CASE 2b ]
This arises from the following new code in cf.functions (the rest of the modifications to cf.functions in this commit are just refactors as far as I can tell):
if not (
(step > 0 and start < 0 and stop > 0)
or (step < 0 and start > 0 and stop < 0)
):
raise IndexError(f"{index!r} is not a cyclic slice")
in cf.functions.normalize_slice. This function determines whether or not a slice is cyclic, and these lines are designed to catch the cyclic cases after 'normalization' of the slice's start and stop indices by previous lines of the function. In cases 1a and 1b the normalization is skipped as the slices don't satisfy the conditions, in cases 2a and 2b, normalization is performed modifying the start or stop indices of the slice to 0. In both cases an IndexError is now raised by the function causing further modification of the slice and generation of the 'roll' variable in cf.functions.parse_indices to be skipped, whereas in v3.16.1 this modification would still take place, e.g. changing a slice of (-3, 0, 1) to (0, 3, 1) and generating roll = {:-3}
If the condition is modified to
if not (
(step > 0 and start < 0 and stop >**=** 0)
or (step < 0 and start >**=** 0 and stop < 0)
):
raise IndexError(f"{index!r} is not a cyclic slice")
the function correctly determines the slice to be cyclic and thus the slice is modified appropriately.
Without this modification, an object with a dimension of size 0 is returned when __getitem__ is called on a field's Data object when __getitem__ is called on a field as part of the cf.field.subset process, and thus the following error is raised in cf.field.__getitem__:
IndexError: Indices [array([0, 1, 2, 3]), slice(-3, 0, 1)] result in a subspaced shape of (4, 0), but can't create a subspace of Field that has a size 0 axis
I have run the same example in a cf v3.16.1 and v3.16.2 environment, with debugging enabled and some additional print statements thrown in for good-measure:
# Create example field
example = cf.Field()
axis_ = example.set_construct(cf.DomainAxis(4))
example.set_construct(
cf.DimensionCoordinate(
properties={
"standard_name": 'latitude',
"units": 'degrees_north',
"axis": 'Y',
},
data=cf.Data([-67.5, -22.5, 22.5, 67.5]),
bounds=cf.Bounds(data=cf.Data(
np.array([[-90., -45.],
[-45., 0.],
[0., 45.],
[45., 90.]]))),
),
axes=axis_,
)
axis_ = example.set_construct(cf.DomainAxis(3))
example.set_construct(
cf.DimensionCoordinate(
properties={
"standard_name": 'longitude',
"units": 'degrees_east',
"axis": 'X',
},
data=cf.Data([-120., 0., 120.]),
bounds=cf.Bounds(data=cf.Data(
np.array([[-180., -60.],
[-60., 60.],
[60., 180.]]))),
),
axes=axis_,
)
example.set_data(cf.Data(np.zeros((4,3))), axes=('Y', 'X'))```
kwargs = {
'longitude': cf.wi(-120, 120),
'latitude': cf.wi(-67.5, 67.5),
}
Try with cf v3.16.2
example_subset = example.subspace(**kwargs)
field:
Field:
-------
Data : (latitude(4), longitude(3))
Dimension coords: latitude(4) = [-67.5, ..., 67.5] degrees_north
: longitude(3) = [-120.0, 0.0, 120.0] degrees_east
config:
()
data_axes:
('domainaxis0', 'domainaxis1')
parsed = {('domainaxis1',): [(('domainaxis1',), 'dimensioncoordinate1', <CF DimensionCoordinate: longitude(3) degrees_east>, <CF Query: (wi [-120.0, 120.0])>, 'longitude')], ('domainaxis0',): [(('domainaxis0',), 'dimensioncoordinate0', <CF DimensionCoordinate: latitude(4) degrees_north>, <CF Query: (wi [-67.5, 67.5])>, 'latitude')]}
unique_axes = {'domainaxis1', 'domainaxis0'}
n_axes = 2
item_axes = ('domainaxis1',)
keys = ('dimensioncoordinate1',)
1 1-d constructs: (<CF DimensionCoordinate: longitude(3) degrees_east>,)
axis = 'domainaxis1'
value = <CF Query: (wi [-120.0, 120.0])>
identity = 'longitude'
1-d CASE 2:
index = slice(-3, 0, 1)
ind = None
create_mask = False
item_axes = ('domainaxis0',)
keys = ('dimensioncoordinate0',)
1 1-d constructs: (<CF DimensionCoordinate: latitude(4) degrees_north>,)
axis = 'domainaxis0'
value = <CF Query: (wi [-67.5, 67.5])>
identity = 'latitude'
1-d CASE 3:
index = [0 1 2 3]
ind = None
create_mask = False
indices = {'indices': {'domainaxis0': array([0, 1, 2, 3]), 'domainaxis1': slice(-3, 0, 1)}, 'mask': {}}
domain_indices:
{'indices': {'domainaxis0': array([0, 1, 2, 3]), 'domainaxis1': slice(-3, 0, 1)}, 'mask': {}}
axis_indices:
{'domainaxis0': array([0, 1, 2, 3]), 'domainaxis1': slice(-3, 0, 1)}
indices:
[array([0, 1, 2, 3]), slice(-3, 0, 1)]
indices function return:
(array([0, 1, 2, 3]), slice(-3, 0, 1))
Indices:
(array([0, 1, 2, 3]), slice(-3, 0, 1))
Field.__getitem__
input indices = (array([0, 1, 2, 3]), slice(-3, 0, 1))
shape = (4, 3)
indices = [array([0, 1, 2, 3]), slice(-3, 0, 1)]
indices2 = (array([0, 1, 2, 3]), slice(-3, 0, 1))
findices = [array([0, 1, 2, 3]), slice(-3, 0, 1)]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/tmp/ipykernel_870/3059849535.py in ?()
----> 1 field_subset = directions.subspace(**kwargs)
~/miniconda3/envs/unifhytest/lib/python3.12/site-packages/cf/subspacefield.py in ?(self, *config, **kwargs)
355 except (ValueError, IndexError) as error:
356 if test:
357 return False
358
--> 359 raise error
360 else:
361 if test:
362 return True
~/miniconda3/envs/unifhytest/lib/python3.12/site-packages/cf/subspacefield.py in ?(self, *config, **kwargs)
355 except (ValueError, IndexError) as error:
356 if test:
357 return False
358
--> 359 raise error
360 else:
361 if test:
362 return True
~/miniconda3/envs/unifhytest/lib/python3.12/site-packages/cf/field.py in ?(self, indices)
428
429 new_data = data[tuple(findices)]
430
431 if 0 in new_data.shape:
--> 432 raise IndexError(
433 f"Indices {findices!r} result in a subspaced shape of "
434 f"{new_data.shape}, but can't create a subspace of "
435 f"{self.__class__.__name__} that has a size 0 axis"
IndexError: Indices [array([0, 1, 2, 3]), slice(-3, 0, 1)] result in a subspaced shape of (4, 0), but can't create a subspace of Field that has a size 0 axis
Try with cf v3.16.1.
Ignore the fact that the other axis in indices is now a dask array instead of a np.array or list, I don't think that is relevant.
example_subset = example.subspace(**kwargs)
field:
Field:
-------
Data : (latitude(4), longitude(3))
Dimension coords: latitude(4) = [-67.5, ..., 67.5] degrees_north
: longitude(3) = [-120.0, 0.0, 120.0] degrees_east
config:
()
data_axes:
('domainaxis0', 'domainaxis1')
parsed = {('domainaxis1',): [(('domainaxis1',), 'dimensioncoordinate1', <CF DimensionCoordinate: longitude(3) degrees_east>, <CF Query: (wi [-120, 120])>, 'longitude')], ('domainaxis0',): [(('domainaxis0',), 'dimensioncoordinate0', <CF DimensionCoordinate: latitude(4) degrees_north>, <CF Query: (wi [-67.5, 67.5])>, 'latitude')]}
unique_axes = {'domainaxis0', 'domainaxis1'}
n_axes = 2
item_axes = ('domainaxis1',)
keys = ('dimensioncoordinate1',)
1 1-d constructs: (<CF DimensionCoordinate: longitude(3) degrees_east>,)
axis = 'domainaxis1'
value = <CF Query: (wi [-120, 120])>
identity = 'longitude'
1-d CASE 2:
index = slice(-3, 0, 1)
ind = None
create_mask = False
item_axes = ('domainaxis0',)
keys = ('dimensioncoordinate0',)
1 1-d constructs: (<CF DimensionCoordinate: latitude(4) degrees_north>,)
axis = 'domainaxis0'
value = <CF Query: (wi [-67.5, 67.5])>
identity = 'latitude'
1-d CASE 3:
index = dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>
ind = None
create_mask = False
indices = {'indices': {'domainaxis0': dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, 'domainaxis1': slice(-3, 0, 1)}, 'mask': {}}
domain_indices:
{'indices': {'domainaxis0': dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, 'domainaxis1': slice(-3, 0, 1)}, 'mask': {}}
axis_indices:
{'domainaxis0': dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, 'domainaxis1': slice(-3, 0, 1)}
indices:
[dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, slice(-3, 0, 1)]
indices function return:
(dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, slice(-3, 0, 1))
Indices:
(dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, slice(-3, 0, 1))
Field.__getitem__
input indices = (dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, slice(-3, 0, 1))
shape = (4, 3)
**##### NOTE CHANGED SLICE IN INDICES FROM HERE ON**
indices = [dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, slice(0, 3, 1)]
indices2 = (dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, slice(-3, 0, 1))
findices = [dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>, slice(0, 3, 1)]
Computing data shape: Performance may be degraded
dice = (dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>,)
DimensionCoordinate.__getitem__: shape = (4,)
DimensionCoordinate.__getitem__: indices2 = (dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>,)
DimensionCoordinate.__getitem__: indices = [dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>]
DimensionCoordinate.__getitem__: findices = (dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>,)
Computing data shape: Performance may be degraded
DimensionCoordinate.__getitem__: findices for bounds = (dask.array<and_, shape=(4,), dtype=bool, chunksize=(4,), chunktype=numpy.ndarray>,)
Computing data shape: Performance may be degraded
dice = (slice(0, 3, 1),)
DimensionCoordinate.__getitem__: shape = (3,)
DimensionCoordinate.__getitem__: indices2 = (slice(0, 3, 1),)
DimensionCoordinate.__getitem__: indices = [slice(0, 3, 1)]
DimensionCoordinate.__getitem__: findices = (slice(0, 3, 1),)
DimensionCoordinate.__getitem__: findices for bounds = (slice(0, 3, 1),)
example_subset
<CF Field: (latitude(4), longitude(3))>
There may be other cases I've missed, but I thought this a suitable place to stop digging and ask for input!! I can see why the cases highlighted are not considered cyclic...
A cyclic variable x = [0,1,2] with a slice of (-3, 0, 1) should produce [0,1,2], which happens in v3.16.1 as expected but not in v3.16.2 as explained.
The outputs from the cf.environment calls are:
For v3.16.2:
Platform: Linux-3.10.0-1160.114.2.el7.x86_64-x86_64-with-glibc2.34
HDF5 library: 1.14.3
netcdf library: 4.9.2
udunits2 library: /home/users/mattjbr/miniconda3/envs/unifhytest/lib/libudunits2.so.0
esmpy/ESMF: 8.6.1
Python: 3.12.3
dask: 2024.4.2
netCDF4: 1.6.5
psutil: 5.9.8
packaging: 24.0
numpy: 1.26.4
scipy: 1.13.0
matplotlib: not available
cftime: 1.6.3
cfunits: 3.3.7
cfplot: not available
cfdm: 1.11.1.0
cf: 3.16.2
For v3.16.1:
Platform: Linux-3.10.0-1160.114.2.el7.x86_64-x86_64-with-glibc2.34
HDF5 library: 1.14.3
netcdf library: 4.9.2
udunits2 library: /home/users/mattjbr/miniconda3/envs/unifhytest2/lib/libudunits2.so.0
esmpy/ESMF: 8.6.1
Python: 3.12.3
dask: 2024.5.0
netCDF4: 1.6.5
psutil: 5.9.8
packaging: 24.0
numpy: 1.26.4
scipy: 1.13.0
matplotlib: not available
cftime: 1.6.3
cfunits: 3.3.7
cfplot: not available
cfdm: 1.11.1.0
cf: 3.16.1