Include serializer from ManifestStore in Icechunk virtual references#766
Include serializer from ManifestStore in Icechunk virtual references#766TomNicholas merged 5 commits intozarr-developers:mainfrom
Conversation
|
@maxrjones I ran my workflow and still got gibberish: https://nbviewer.org/gist/rsignell/32f563cfbe83c20ac01aba00190e5912 |
Yeah, this fix is only for Icechunk. Kerchunk uses Zarr specification 2, which will require separate fix or adapting our Kerchunk writer to use Zarr format 3. I'd be tempted to go with the second option because converting from Zarr V3 to V2 for Kerchunk adds a lot of surface area for potential bugs, but that would be a separate PR. |
|
@maxrjones cool, I wanted to use icechunk anyway! And as you said, it does solve my use case: |
TomNicholas
left a comment
There was a problem hiding this comment.
So to check my understanding, the reason this failed was because the default array->bytes codec (i.e. the "serializer") in zarr v3 terms was for little endian? Which is fine until you use it to decode big-endian data, when it will give you gibberish?
That's right. We still likely have bugs because a user can define their own codec that doesn't subclass from Zarr python. In that case, their custom codec will be dropped by VirtualiZarr/virtualizarr/codecs.py Lines 49 to 63 in f3149d6 I expect a lot of our codec handling will get much easier with @d-v-b's refactor in Zarr-Python, which is why I have been proposing small patch fixes rather than more fundamental changes. |
Could we at least raise in that scenario? |
TomNicholas
left a comment
There was a problem hiding this comment.
Add a changelog entry and this is good to go, thank you!
yeah we definitely should raise in that scenario. |
I think we need much more work around our handling of codecs (which would get much easier with zarr-developers/zarr-python#3276), but this is a more immediate fix to support parsers that specify any serializer other than the default (e.g., big-endian bytes).
This makes your NetCDF3 example work @rsignell
docs/releases.rstapi.rst