Description
I have an HDF5 dataset with a scalar variable called 'name' that is actually a 0-D NumPy array with dtype '|S8'. (Not my choice; this is what I get from someone else.) Occasionally, loading it fails.
MCVE Code Sample
# Set up the file
import h5py
import numpy as np

f = h5py.File("error_demo.h5", mode='w')
f.create_dataset('name', shape=(), dtype="|S8", data=np.array([b'f(Pt,TE)'], dtype='|S8'))
f.close()

# Produce the error -- you may need to adjust the number of times you run the loop
import xarray as xr
for i in range(10):
    xr.load_dataset("error_demo.h5")
Expected Output
<xarray.Dataset>
Dimensions:  ()
Data variables:
    name     <U8 'f(Pt,TE)'
Problem Description
The resulting error message:
Traceback (most recent call last):
  File "<ipython-input-3-b8e48f28a262>", line 1, in <module>
    mcout62 = xr.load_dataset("57062/mcout000011.h5",group=r"part/ions/dE(r,z,D)")
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/backends/api.py", line 261, in load_dataset
    return ds.load()
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/core/dataset.py", line 659, in load
    v.load()
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/core/variable.py", line 375, in load
    self._data = np.asarray(self._data)
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/core/indexing.py", line 677, in __array__
    self._ensure_cached()
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/core/indexing.py", line 674, in _ensure_cached
    self.array = NumpyIndexingAdapter(np.asarray(self.array))
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/core/indexing.py", line 653, in __array__
    return np.asarray(self.array, dtype=dtype)
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/numpy/core/_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/core/indexing.py", line 557, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 73, in __getitem__
    key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/core/indexing.py", line 837, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/Users/lmorton/opt/anaconda3/lib/python3.7/site-packages/xarray/backends/netCDF4_.py", line 85, in _getitem
    array = getitem(original_array, key)
  File "netCDF4/_netCDF4.pyx", line 4408, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 5384, in netCDF4._netCDF4.Variable._get
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 9: invalid start byte
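One detail in the error stands out: the failing byte is at position 9, but the stored value is only 8 bytes wide. A possible explanation, which I have not verified against the netCDF4 source, is that the field is exactly full, so it has no NUL terminator, and the reader keeps going into whatever memory follows. That would also fit the failure being intermittent. The sketch below only illustrates this hypothesis in plain Python. The `trailing` bytes and both helper functions are made up for the demonstration and are not netCDF4 or xarray code.

```python
# Hypothetical illustration only: this is NOT the actual netCDF4 code path.
FIELD_WIDTH = 8  # width of the '|S8' field

stored = b"f(Pt,TE)"  # exactly 8 bytes, so there is no room for a NUL terminator
assert len(stored) == FIELD_WIDTH

# Assumed stand-in for whatever bytes happen to follow the field in memory.
trailing = b"\x01\xc1\x00"
memory = stored + trailing


def decode_fixed_width(buf: bytes, width: int) -> str:
    """Decode only the declared field width, stripping any NUL padding."""
    return buf[:width].rstrip(b"\x00").decode("utf-8")


def decode_until_nul(buf: bytes) -> str:
    """Decode up to the first NUL byte, as a C-string style reader would."""
    end = buf.find(b"\x00")
    if end == -1:
        end = len(buf)
    return buf[:end].decode("utf-8")


# Respecting the field width gives the expected value.
print(decode_fixed_width(memory, FIELD_WIDTH))

# Reading until a NUL runs past the field and hits the invalid byte.
try:
    decode_until_nul(memory)
except UnicodeDecodeError as exc:
    print(exc)
```

With these made-up trailing bytes, the second decode fails at position 9 with the same "invalid start byte" reason as the traceback above, while the width-limited decode returns 'f(Pt,TE)'.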
Versions
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.6 (default, Jan 8 2020, 13:42:34)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 19.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.7.3
xarray: 0.15.0
pandas: 1.0.1
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.11.0
distributed: 2.11.0
matplotlib: 3.1.3
cartopy: None
seaborn: 0.10.0
numbagg: None
setuptools: 46.0.0.post20200309
pip: 20.0.2
conda: 4.8.3
pytest: 5.3.5
IPython: 7.12.0
sphinx: 2.4.0