feat: nested group in ManifestStore + HDFParser#790
feat: nested group in ManifestStore + HDFParser#790maxrjones merged 18 commits intozarr-developers:mainfrom
ManifestStore + HDFParser#790Conversation
for more information, see https://pre-commit.ci
Sorry I think I should have linked to #84. What we want is to add
NetCDF4 is HDF5 in a trenchcoat 🙂 NetCDF3 cannot handle groups. TIFF can. FITS IDK. |
Good point, but I don't see any tiff tests at the moment. I'd be happy to add them if you want. But then I would need to edit the Tiff parser so perhaps this PR should be limited to the store + hdf5
Hmm, it seems adding a check for subgroups caused tests to fail. It looks like opening a group with a subgroup actually does work but just creates a dataset object at the root group if present, ignoring the other subkeys. Judging by the tests, this is intentional. So I can remove the check then that I added here. I guess the vibes well with having a new API in EDIT: done! |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #790 +/- ##
==========================================
+ Coverage 87.75% 87.78% +0.03%
==========================================
Files 35 35
Lines 1886 1891 +5
==========================================
+ Hits 1655 1660 +5
Misses 231 231
🚀 New features to boost your workflow:
|
ManifestStore + HDFParser
|
Are you waiting on a review here @ilan-gold ? |
|
Yes, I just assumed the PR was "approved" because it was put in v3.0.0, milestone, but I'm ready to respond to any requested changes :) |
maxrjones
left a comment
There was a problem hiding this comment.
We need to make a couple small changes to support Zarr-Python's reliance on getting back None if a key doesn't exist for distinguishing between different node types, but that issue also exists in main so I support merging as-is.
I tried out the store changes in a local development of open_virtual_datatree and in virtual_tiff (virtual-zarr/virtual-tiff#54) and it worked great.
I don't have much experience with the HDF parser, so @TomNicholas or @sharkinsspatial may want to take another look at that before merging.
Co-authored-by: Chuck Daniels <cjdaniels4@gmail.com>
| else: | ||
| _groups: Mapping[str, ManifestGroup] = {} if groups is None else groups | ||
|
|
||
| _groups: Mapping[str, ManifestGroup] = {} if groups is None else groups |
|
|
||
| """ | ||
|
|
||
| # TODO: Remove private API `._group` |
There was a problem hiding this comment.
Are you planning to do this in a follow-up?
Some potential TODOs:
h5objects from currently parsed file formats that fit the nested paradigm and would otherwise have not workedAlong the lines of 1., maybe bring inI think this raises separate issues becauseanndataas an optional dep to test the behavior of this more deeplyanndatahas semantics around object dtype that would need to be handled by a parser, I thinkopen_virtual_datasetseems to return anxarray.Datasetbut I think in the case where things are nested, it should return aDataTreemaybe? or error out?construct_virtual_datasetand thusManifestStore.to_virtual_datasetsuffer the same issueCan the non-hdf5 formats handle nested structures, like
netCDForfits?Closes
hdfparser limited to level-0h5py.Dataset#664Tests added
Tests passing
Full type hint coverage
Changes are documented in
docs/releases.rstNew functions/methods are listed in
api.rstNew functionality has documentation