fix: allow numpy integers and 0-dim arrays through shape_item_as_index to support numpy 2.3#3531
fix: allow numpy integers and 0-dim arrays through shape_item_as_index to support numpy 2.3#3531
shape_item_as_index to support numpy 2.3#3531Conversation
|
Though now this technically violates the typing of |
|
I think the typing of Edit: actually...I think we do |
There was a problem hiding this comment.
I think I'd prefer:
import operator
from awkward._regularize import is_integer_like
class ArrayModuleNumpyLike:
...
def shape_item_as_index(self, x1: ShapeItem) -> int:
if x1 is unknown_length:
raise TypeError("array module nplikes do not support unknown lengths")
elif is_integer_like(x1):
return operator.index(x1)
else:
raise TypeError(f"expected None or int type, received {x1}")This should be working automatically for all relevant types.
This change (likely) needs to be propagated to the TypeTracer nplike as well: https://github.com/scikit-hep/awkward/blob/main/src/awkward/_nplikes/typetracer.py#L1040-L1046 to:
import operator
from awkward._regularize import is_integer_like
class TypeTracer:
...
def shape_item_as_index(self, x1: ShapeItem) -> IndexType:
if x1 is unknown_length:
return TypeTracerArray._new(np.dtype(np.int64), shape=())
elif is_integer_like(x1):
return operator.index(x1)
else:
raise TypeError(f"expected unknown_length or int type, received {x1}")Could you give this a try?
|
Just for context, this happens because in numpy 2.3 If the point of Maybe In the case of this error, awkward/src/awkward/contents/indexedoptionarray.py Lines 1730 to 1734 in 8e2d4a8 |
|
@pfackeldey
|
|
Yeah getting that to not have to spend time copying a single int back to host would be nice. That's relatively extremely expensive. |
Then, there should be a different dedicated |
|
While I'm not sure this fix is appropriate, it does fix this specific problem. I'll get to making a reproducer today so this can be more correctly managed. |
shape_item_as_index
|
@lgray @pfackeldey As discussed offline, I've pushed the fix that covers python ints, numpy ints, dim-0 arrays and should work with cupy as well. @lgray could you check that this makes the fastjet test pass? This is a quick fix for numpy 2.3 but we should rethink what a |
|
@ianna I could add a test as well if you'd like? Or we can wait to revisit shape item and test then? Here is the simplest reproducer of the issue with numpy 2.3 import awkward as ak
import numpy as np
content = ak.contents.NumpyArray(np.array([0.0, 1.1, 2.2, 3.3, 4.4]))
index = ak.index.Index64(np.array([2, 2, 0, -1, 4], dtype=np.int64))
array = ak.contents.IndexedOptionArray(index, content)
array.to_packed() |
shape_item_as_indexshape_item_as_index to support numpy 2.3
|
The failures are one single test coming from the latest numexpr release. It's coming from this line: awkward/src/awkward/_connect/numexpr.py Line 82 in 8e2d4a8 numexpr.necompiler._names_cache is no longer a CacheDict but a threading.local() due to this commit: pydata/numexpr@33ee71b
I'll think tomorrow probably how we can deal with this. |
Hey — since this one’s already greenlit, let’s keep it as-is and handle any tweaks in a follow-up. Keeps things tidy for the history. Thanks a ton! |
ianna
left a comment
There was a problem hiding this comment.
If this is meant to be a real fix, it needs a reproducer, tests, and GPU profiling — else let’s roll with the approved quick fix.
|
@ianna which version are you talking about? The current version of the PR is the appropriate quick fix that works with cuda and the typetracer backend. The original version that you green-lighted won't work with those two backends that's why we had a very short offline chat with Lindsey and Peter and put the current version of the PR as the "quick fix". The “lengthier” fix would be implementing a separate function for the cupy backend and in general re thinking a bit how shape item works. I’m happy to put a test in this one. Without this fix, there are about 400 test failures though with numpy 2.3 they get noticed in the other NumExpr PR I have up. |
Thanks for the context — but to be clear, the initial quick fix addressed a specific, documented problem that was described in an issue. This new 'improved fix' currently doesn’t have an associated issue or a reproducer for the backend-specific problems it’s trying to address. If there’s a deeper problem affecting CUDA and Typetracer, it really needs to be properly tracked in a new issue, with reproducers attached. Otherwise we risk merging untracked changes that solve symptoms without addressing or even documenting the underlying cause. I’d prefer we hold off merging this version until that’s in place. |
|
Well for the cpu backend, I have a reproducer already which I can add as a test. For cuda, it's not tested in ci but I can create a reproducer which shows why the "original" fix fails and you can try it offline. For the typetracer, I'm not 100% sure if we can end up with this failure in actual code, but since we're allowing "integer like" objects in this function in the cpu backend, we might as well allow them in the typetracer case. |
|
After I tracked down the use cases of |
|
Superseded by #3534 |
fixes #3530 (but we'll see what the tests say)