BUG,ENH: Fix generic scalar infinite recursion issues #26904
BUG,ENH: Fix generic scalar infinite recursion issues #26904ngoldbaum merged 7 commits intonumpy:mainfrom
Conversation
This reorganizes the avoidance of infinite recursion to happen in the generic scalar fallback, rather than attempting to do so (badly) in the scalarmath, where it only applies to longdouble to begin with. This was also needed for the follow up to remove special handling of python scalar subclasses.
| res = PyArray_GenericBinaryFunction(m1, other_op, n_ops.power); | ||
| } | ||
| else { | ||
| res = PyArray_GenericBinaryFunction(other_op, m2, n_ops.power); |
There was a problem hiding this comment.
I started calling the ufunc directly now. This has advantages and disadvantages, but overall: I don't think it should matter much and besides working around power oddities felt pretty sane.
(Not done yet for the comparison, where it might also make sense eventually)
|
I don't care for backporting this. The bug has been around forever. |
| # As of NumPy 2.1, longdouble behaves like other types and can coerce | ||
| # e.g. lists. (Not necessarily better, but consistent.) | ||
| assert_array_equal(op(sctype(3), [1, 2]), op(3, np.array([1, 2]))) | ||
| assert_array_equal(op([1, 2], sctype(3)), op(np.array([1, 2]), 3)) |
There was a problem hiding this comment.
So yes, the new paths aligns longdouble with others here. I could guess at this not being right, that would remove the array conversion step.
(It may mean deciding that one test in gh-26905, just can't pass, because we simply cannot deal with subclasses of Python floats even if they get converted to a float array later. Although, it can go either way, since you can still convert but only allow the was_scalar path.)
ngoldbaum
left a comment
There was a problem hiding this comment.
I don't have a lot of context for the details here, but I think removing special cases for longdoubles makes a lot of sense, and the logic behind the retrying for object arrays is explained in a lot of detail.
Sorry for taking so long to look at this.
| /* self_item can be used to retry the operation */ | ||
| *self_op = self_item; | ||
| return 0; | ||
| } |
There was a problem hiding this comment.
don't you leak self_item if you get to this last return 0? Maybe also in the case where you assign it to self_op, I haven't looked at how reference counting works in the caller.
| * Generally, we try dropping through to the array path here, | ||
| * but this can lead to infinite recursions for (c)longdouble. | ||
| * We drop through to the generic path here which checks for the | ||
| * (c)longdouble infinite recursion problem. |
There was a problem hiding this comment.
Might make sense to just refer to this as the "infinite recursion problem", without special casing longdouble anymore. Maybe also leave a gh-XXXX reference.
There was a problem hiding this comment.
For this file only longouble is problematic, but fair, removed it and added some references.
numpy/_core/tests/test_scalarmath.py
Outdated
| @given(sampled_from(objecty_things), | ||
| sampled_from(reasonable_operators_for_scalars), | ||
| sampled_from(types)) | ||
| sampled_from(reasonable_operators_for_scalars + [operator.xor]), |
There was a problem hiding this comment.
Does stuff break if you add xor to reasonable_operators_for_scalars? If not i'd just add it there. If so, maybe a comment explaining why only xor here?
There was a problem hiding this comment.
I'll think a bit if I can think of a change. I think that would be a rename of the variable and not a comment here.
Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>
|
We chatted about this at the triage meeting and I got some extra context about why we're doing this. |
|
I think this broke a use case in pandas where scalar arithmetic should have delegated to In [3]: class Array:
...: def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
...: return 1
...:
In [4]: np.timedelta64(120,'m') + Array()
Out[4]: 1
In [5]: np.__version__
Out[5]: '1.26.4'In [3]: class Array:
...: def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
...: return 1
...:
In [4]: np.timedelta64(120,'m') + Array()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[4], line 1
----> 1 np.timedelta64(120,'m') + Array()
TypeError: unsupported operand type(s) for +: 'datetime.timedelta' and 'Array'
In [5]: np.__version__
Out[5]: '2.1.0.dev0+git20240802.0469e1d' |
|
Hah, yes it defers still, but if you don't implement |
|
OK, there are two problems, here (the second is even more subtle I guess):
|
|
OK, gh-27117 should fix 1. I looked at 2. a bit closer and I think it's not worth worrying about it, because the only way one could think it matters is for things that have the exact scalar array-priority, which is weird. |
|
At least from the pandas side, issue 2. shouldn't be significant since there's only one use of |
This reorganizes the avoidance of infinite recursion to happen
in the generic scalar fallback, rather than attempting to do so
(badly) in the scalarmath, where it only applies to longdouble
to begin with.
This was also needed for the follow up to remove special handling
of python scalar subclasses.
Closes gh-26767