Conversation
… V2 and V3 formats
for more information, see https://pre-commit.ci
How
|
|
Let me know when you would like a review of this @neilSchroeder ! |
|
@TomNicholas I think it's ready for a review. |
TomNicholas
left a comment
There was a problem hiding this comment.
Thanks for working on this @neilSchroeder ! I mostly have a bunch of small gripes 😁
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #822 +/- ##
==========================================
+ Coverage 87.71% 88.31% +0.60%
==========================================
Files 35 35
Lines 1880 1968 +88
==========================================
+ Hits 1649 1738 +89
+ Misses 231 230 -1
🚀 New features to boost your workflow:
|
|
@TomNicholas I'm ready for another review whenever you've got time |
TomNicholas
left a comment
There was a problem hiding this comment.
This looks great, thank you so much @neilSchroeder !
| strategy = get_strategy(zarr_array) | ||
| chunk_map = await strategy.get_chunk_mapping(zarr_array, path) | ||
|
|
||
| if not chunk_map: |
There was a problem hiding this comment.
Actually ignore me, I think what you're done here is good.
I'm a little unclear about order of operations and whether or not these scenarios are realistic.
These scenarios are definitely plausible.
Or maybe handled differently?
There might be a way to refactor to have a few fewer levels of functions, but this is good.
|
@TomNicholas what are the next steps here? Will this be merged whenever someone has time to do the next release? Do we need another reviewer? |
Checklist
docs/releases.mdHow
zarr.pyHandles Zarr V2 StoresZarrParsershould now support both Zarr V2 and V3 stores by normalizing V2 stores to appear as V3. This approach ensures that all parsers produce V3-compatible outputs, and confines modifications tozarr.py.V2 → V3 Normalization Strategy
The parser performs a two-part normalization:
1. Chunk Key Mapping (
get_chunk_mapping_prefix)For V2 arrays:
array_name/0,array_name/0.1.22. Metadata Conversion (
get_metadata())After converting V2 metadata to V3 using
_convert_array_metadata, we have to replace thechunk_key_encoding.V2ChunkKeyEncodingin the V3 metadatazarr/xarrayseesV2ChunkKeyEncoding, it requests chunks using V2-style paths:array/0DefaultChunkKeyEncoding,zarrrequests chunks using V3-style paths:array/c/0ManifestStore.get()expects V3-style paths and usesparse_manifest_index()to extract chunk coordinatesparse_manifest_index()requires the/c/component to correctly parse the pathAdditional metadata handling
_ARRAY_DIMENSIONSattribute or generated as{array_name}_dim_{i}zarr's standard V2→V3 migration utilitiesImplementation Notes
I'm not convinced I've done a particularly elegant implementation here, but adding another class for V2 parsing didn't seem like it would be particularly extensible. Very happy to hear thoughts on perhaps a better implementation.
@TomNicholas thank you very much for your feedback, it definitely helped me wrap my head around the right approach to take here.
Edit: I've done a bit of re-design to use a strategy pattern for dispatching to parsing v2 and v3 arrays. This should make future integrations of zarr array version parsing a lot more maintainable. This is also just a lot easier to read than my original implementation. Tests and documentation are also up to date.