BUG: Fix return shape of inverse_indices in unique_inverse#25553
BUG: Fix return shape of inverse_indices in unique_inverse#25553mhvk merged 5 commits intonumpy:mainfrom
Conversation
|
Should we also just change this in the normal version? It does look much like a bug-fix when |
|
I don't think we should make this change in
|
|
No strong opinion, although it seems so odd that I think we can do it, and no time like a 2.0 release to change such a thing that looks like an oversight to begin with. (If users worked around via |
|
Sure, I see what you're saying. I'm happy to update the PR to modify the original (and add appropriate docs/changelog) if you'd prefer that. |
mhvk
left a comment
There was a problem hiding this comment.
Good to fix this!
Like @seberg, I think we should ideally fix this in np.unique itself, since it is clearly a bug.
That said, np.unique has an axis argument, and with that the simple reshape is no longer generally correct. It would need something like the following:
arr = np.array([[[1, 2, 3], [2, 3, 1]]]*2 +[[[1, 3, 2], [2, 1, 3]]])
axis = 1 # for testing
a, i = np.unique(arr, axis=axis, return_inverse=True)
if axis is None:
i.shape = arr.shape
else:
i.shape = tuple((sh if ax == axis else 1) for ax, sh in enumerate(arr.shape))
# we probably want this too, to make the indices similar to argsort.
i = np.broadcast_to(i, arr.shape)
# check result is correct, in way also suggested for np.argsort, etc.
np.all(np.take_along_axis(a, i, axis=axis) == arr)
# all true
|
I think this change to |
|
In the triage meeting we reached a consensus to call the wrong 1d shape a bug, and fix it without any deprecation. It should be mentioned in the release notes. |
|
Thanks - I will update this PR to change the output shape for |
fbcedb6 to
6903f6c
Compare
|
OK, I changed the implementation of |
mhvk
left a comment
There was a problem hiding this comment.
Nice! Only a silly nitpick from me. Approving.
|
Note there is a linting error, so you can fix the versionchanged at the same time... |
Code was written to expect a 1-D array with the inverse indices, and that got changed in NumPy in numpy/numpy#25553. Closes scipygh-19867 [skip cirrus] [skip circle]
|
This broke SciPy in two places, in ways that were a little nontrivial to diagnose: scipy/scipy#19868. The fix does make sense, but more complaints may roll in - this probably broke most usages of the returned reverse indices. |
|
Hmm, @jakevdp was clearly right to worry about breaking things. Arguably in cases like this we should run the scipy tests... Thanks for fixing! |
|
@rgommers do you think this change should be reverted? I do think the change in the |
|
I don't know, I'd give it a week or so - if no issues turn up for other libraries that test against numpy nightlies, it's probably okay to keep the change. |
|
One more data point: this broke some code in an unreleased project of mine ( NumPy 1.26.4: NumPy 2.0.0rc2 I can easily fix the issue, and since we haven't heard of any other problems related to this, this doesn't open up again the question of reverting the change. Just FYI. |
|
Well, I thought the reason why this was different in the Array API was because the current behavior doesn't make sense... If that was the case, I would not want to revert this. But, unfortunately, that impression was seems wrong. There is no big reason for EDIT: Sorry, forgot to cross-ref gh-26738 |
There was a good argument that it is not possible to reconstruct the original array with `axis=None` without first reshaping and changing the result shape helped with it. However, it was always possible to do it for other axis values by using `np.take` rather than `np.take_along_axis`. Changing it for all axis values is unnecessary to achieve reconstruction because `np.take(arr, inverse, axis=axis)` already performed the job except for `axis=None`. Thus, this keeps the change for axis=None, but reverts numpygh-25553 for numerical axis.
|
Just FYI, this is now reverted except for |
There was a good argument that it is not possible to reconstruct the original array with `axis=None` without first reshaping and changing the result shape helped with it. However, it was always possible to do it for other axis values by using `np.take` rather than `np.take_along_axis`. Changing it for all axis values is unnecessary to achieve reconstruction because `np.take(arr, inverse, axis=axis)` already performed the job except for `axis=None`. Thus, this keeps the change for axis=None, but reverts numpygh-25553 for numerical axis.
There was a good argument that it is not possible to reconstruct the original array with `axis=None` without first reshaping and changing the result shape helped with it. However, it was always possible to do it for other axis values by using `np.take` rather than `np.take_along_axis`. Changing it for all axis values is unnecessary to achieve reconstruction because `np.take(arr, inverse, axis=axis)` already performed the job except for `axis=None`. Thus, this keeps the change for axis=None, but reverts numpygh-25553 for numerical axis.
Fixes #25552