Skip to content

Enforce dtype=object for incompatible numpy array conversion#417

Merged
martindurant merged 2 commits intozarr-developers:mainfrom
bnavigator:inferred-np
Jan 13, 2023
Merged

Enforce dtype=object for incompatible numpy array conversion#417
martindurant merged 2 commits intozarr-developers:mainfrom
bnavigator:inferred-np

Conversation

@bnavigator
Copy link
Contributor

@bnavigator bnavigator commented Jan 12, 2023

Numpy 1.24 does not accept ragged sequences without dtype=object anymore: numpy/numpy#22004

Fixes #333

Before (#333 (comment) and #416):

[   51s] ____________________________ test_non_numpy_inputs _____________________________
[   51s] 
[   51s]     def test_non_numpy_inputs():
[   51s]         # numpy will infer a range of different shapes and dtypes for these inputs.
[   51s]         # Make sure that round-tripping through encode preserves this.
[   51s]         data = [
[   51s]             [0, 1],
[   51s]             [[0, 1], [2, 3]],
[   51s]             [[0], [1], [2, 3]],
[   51s]             [[[0, 0]], [[1, 1]], [[2, 3]]],
[   51s]             ["1"],
[   51s]             ["11", "11"],
[   51s]             ["11", "1", "1"],
[   51s]             [{}],
[   51s]             [{"key": "value"}, ["list", "of", "strings"]],
[   51s]         ]
[   51s]         for input_data in data:
[   51s]             for codec in codecs:
[   51s] >               output_data = codec.decode(codec.encode(input_data))
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/tests/test_json.py:72: 
[   51s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   51s] 
[   51s] self = JSON(encoding='utf-8', allow_nan=True, check_circular=True, ensure_ascii=True,
[   51s]      indent=None, separators=(',', ':'), skipkeys=False, sort_keys=True,
[   51s]      strict=True)
[   51s] buf = [[0], [1], [2, 3]]
[   51s] 
[   51s]     def encode(self, buf):
[   51s] >       buf = np.asarray(buf)
[   51s] E       ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/json.py:57: ValueError
[   51s] ____________________________ test_non_numpy_inputs _____________________________
[   51s] 
[   51s]     def test_non_numpy_inputs():
[   51s]         codec = MsgPack()
[   51s]         # numpy will infer a range of different shapes and dtypes for these inputs.
[   51s]         # Make sure that round-tripping through encode preserves this.
[   51s]         data = [
[   51s]             [0, 1],
[   51s]             [[0, 1], [2, 3]],
[   51s]             [[0], [1], [2, 3]],
[   51s]             [[[0, 0]], [[1, 1]], [[2, 3]]],
[   51s]             ["1"],
[   51s]             ["11", "11"],
[   51s]             ["11", "1", "1"],
[   51s]             [{}],
[   51s]             [{"key": "value"}, ["list", "of", "strings"]],
[   51s]             [b"1"],
[   51s]             [b"11", b"11"],
[   51s]             [b"11", b"1", b"1"],
[   51s]             [{b"key": b"value"}, [b"list", b"of", b"strings"]],
[   51s]         ]
[   51s]         for input_data in data:
[   51s] >           actual = codec.decode(codec.encode(input_data))
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/tests/test_msgpacks.py:75: 
[   51s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   51s] 
[   51s] self = MsgPack(raw=False, use_bin_type=True, use_single_float=False)
[   51s] buf = [[0], [1], [2, 3]]
[   51s] 
[   51s]     def encode(self, buf):
[   51s] >       buf = np.asarray(buf)
[   51s] E       ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
[   51s] 
[   51s] ../../BUILDROOT/python-numcodecs-0.11.0-0.x86_64/usr/lib64/python3.8/site-packages/numcodecs/msgpacks.py:55: ValueError

@codecov
Copy link

codecov bot commented Jan 13, 2023

Codecov Report

Merging #417 (bee8aa7) into main (955019b) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #417   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           54        54           
  Lines         2089      2095    +6     
=========================================
+ Hits          2089      2095    +6     
Impacted Files Coverage Δ
numcodecs/json.py 100.00% <100.00%> (ø)
numcodecs/msgpacks.py 100.00% <100.00%> (ø)
numcodecs/tests/test_json.py 100.00% <100.00%> (ø)
numcodecs/tests/test_msgpacks.py 100.00% <100.00%> (ø)

@martindurant
Copy link
Member

+1 from me. Will merge unless someone objects.

@martindurant martindurant merged commit 4f2a2e3 into zarr-developers:main Jan 13, 2023
@martindurant
Copy link
Member

Thanks, @bnavigator

@rabernat rabernat mentioned this pull request Jan 15, 2023
7 tasks
@joshmoore
Copy link
Member

+1 from me. Will merge unless someone objects.

Still catching up after the holidays. Probably my only concern is that with v3 we probably want a more explicit handle on when object is permissible, but that's a lot longer-term than numpy 1.24 which is here now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

flexible codecs cannot handle compound dtypes containing objects

4 participants