BUG: Fix return shape of inverse_indices in unique_inverse by jakevdp · Pull Request #25553 · numpy/numpy

jakevdp · 2024-01-08T19:46:58Z

seberg · 2024-01-08T20:21:45Z

Should we also just change this in the normal version? It does look much like a bug-fix when axis is used?

jakevdp · 2024-01-08T20:29:02Z

I don't think we should make this change in np.unique, because that's a potentially breaking change.

unique_all and unique_inverse have not been part of a release yet, so modifying their output shapes is safer.

seberg · 2024-01-08T20:40:27Z

No strong opinion, although it seems so odd that I think we can do it, and there is no time like a 2.0 release to change something that looks like an oversight to begin with. (If users worked around it via `.reshape(input.shape)`, their code keeps working, and `res.reshape(-1)` gives the old result without branching.)
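The workaround mentioned above can be sketched as follows for the `axis=None` case. This is a minimal illustration, not code from the PR; the array `x` and the variable names are made up for the example. Both reshapes are no-ops on whichever version already returns that shape, so no version check is needed.

```python
import numpy as np

# Example input (hypothetical); any multi-dimensional array works.
x = np.array([[3, 1, 3],
              [2, 1, 2]])

values, inverse = np.unique(x, return_inverse=True)

# Works whether `inverse` comes back flat (old behavior) or with the
# input's shape (new behavior): reshape to the input shape and index.
restored = values[inverse.reshape(x.shape)]

# Code that needs the old flat layout can ask for it explicitly.
flat_inverse = inverse.reshape(-1)
```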

jakevdp · 2024-01-08T21:07:30Z

Sure, I see what you're saying. I'm happy to update the PR to modify the original (and add appropriate docs/changelog) if you'd prefer that.

mhvk

Good to fix this!

Like @seberg, I think we should ideally fix this in np.unique itself, since it is clearly a bug.

That said, np.unique has an axis argument, and with that the simple reshape is no longer generally correct. It would need something like the following:

import numpy as np

arr = np.array([[[1, 2, 3], [2, 3, 1]]]*2 + [[[1, 3, 2], [2, 1, 3]]])
axis = 1  # for testing
a, i = np.unique(arr, axis=axis, return_inverse=True)
if axis is None:
    i = i.reshape(arr.shape)
else:
    # keep the full length along `axis`; every other dimension becomes 1
    i = i.reshape(tuple((sh if ax == axis else 1) for ax, sh in enumerate(arr.shape)))
    # we probably want this too, to make the indices similar to argsort.
    i = np.broadcast_to(i, arr.shape)

# check result is correct, in way also suggested for np.argsort, etc.
np.all(np.take_along_axis(a, i, axis=axis) == arr)
# all true

numpy/lib/tests/test_arraysetops.py

jakevdp · 2024-01-09T00:11:20Z

I think this change to np.unique makes sense, but maybe it deserves its own discussion? Should we merge this PR (so that unique_* have the correct API) and then discuss np.unique in its own issue and/or mailing list thread?

mattip · 2024-01-10T18:36:34Z

In the triage meeting we reached a consensus to call the wrong 1d shape a bug, and fix it without any deprecation. It should be mentioned in the release notes.

jakevdp · 2024-01-10T19:03:39Z

Thanks - I will update this PR to change the output shape for np.unique as well.

jakevdp · 2024-01-10T19:56:51Z

OK, I changed the implementation of np.unique itself, added release notes, and updated several tests. PTAL!

mhvk

Nice! Only a silly nitpick from me. Approving.

numpy/lib/_arraysetops_impl.py

mhvk · 2024-01-10T21:08:49Z

Note there is a linting error, so you can fix the versionchanged at the same time...

mhvk

Thanks again!

Code was written to expect a 1-D array with the inverse indices, and that got changed in NumPy in numpy/numpy#25553. Closes scipy/scipy#19867 [skip cirrus] [skip circle]

rgommers · 2024-01-11T21:46:11Z

This broke SciPy in two places, in ways that were a little nontrivial to diagnose: scipy/scipy#19868. The fix does make sense, but more complaints may roll in - this probably broke most usages of the returned reverse indices.

mhvk · 2024-01-11T22:19:03Z

Hmm, @jakevdp was clearly right to worry about breaking things. Arguably in cases like this we should run the scipy tests... Thanks for fixing!

jakevdp · 2024-01-12T00:24:09Z

@rgommers do you think this change should be reverted? I do think the change in the axis=None case is a good one, but I don't feel as strongly about the change of behavior when axis is an int.

rgommers · 2024-01-12T13:50:48Z

I don't know, I'd give it a week or so - if no issues turn up for other libraries that test against numpy nightlies, it's probably okay to keep the change.

WarrenWeckesser · 2024-05-22T22:56:08Z

One more data point: this broke some code in an unreleased project of mine (yanova). Some tests fail because of the change of shape of the inverse array in examples like the following.

NumPy 1.26.4:

In [9]: np.__version__
Out[9]: '1.26.4'

In [10]: m = np.array([[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    ...:               [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0]])

In [11]: np.unique(m, axis=1, return_inverse=True)
Out[11]: 
(array([[0, 0, 1, 1, 2, 2],
        [0, 1, 0, 1, 0, 1]]),
 array([0, 0, 0, 1, 1, 1, 3, 3, 2, 2, 5, 5, 5, 4, 4]))

NumPy 2.0.0rc2:

In [6]: np.__version__
Out[6]: '2.0.0rc2'

In [7]: m = np.array([[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2],
   ...:               [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0]])

In [8]: np.unique(m, axis=1, return_inverse=True)
Out[8]: 
(array([[0, 0, 1, 1, 2, 2],
        [0, 1, 0, 1, 0, 1]]),
 array([[0, 0, 0, 1, 1, 1, 3, 3, 2, 2, 5, 5, 5, 4, 4]]))

I can easily fix the issue, and since we haven't heard of any other problems related to this, this doesn't reopen the question of reverting the change. Just FYI.
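One way downstream code could absorb the shape difference shown above is to flatten the inverse before using it. The sketch below is an assumption about what such a fix might look like, not the fix actually applied in any project; `unique_with_flat_inverse` is a hypothetical helper name.

```python
import numpy as np

def unique_with_flat_inverse(a, axis):
    """Hypothetical helper: unique slices along an integer axis, 1-D inverse."""
    values, inverse = np.unique(a, axis=axis, return_inverse=True)
    # reshape(-1) is a no-op for shape (n,) and flattens shape (1, n).
    return values, inverse.reshape(-1)

m = np.array([[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2],
              [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0]])

values, inverse = unique_with_flat_inverse(m, axis=1)

# With a 1-D inverse, np.take along the same axis rebuilds the input.
restored = np.take(values, inverse, axis=1)
```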

seberg · 2024-06-18T10:02:02Z

Well, I thought the reason why this was different in the Array API was because the current behavior doesn't make sense...
... so that is a bit my bad, I didn't think much about it, assuming that the Array API decision was based on the current version being unwieldy.

If that was the case, I would not want to revert this. But, unfortunately, that impression seems to have been wrong. There is no big reason for unique to behave differently, because unique is already compatible with np.take rather than np.take_along_axis, and that isn't going away.

EDIT: Sorry, forgot to cross-ref gh-26738

There was a good argument that it is not possible to reconstruct the original array with `axis=None` without first reshaping, and changing the result shape helped with it. However, it was always possible to do it for other axis values by using `np.take` rather than `np.take_along_axis`. Changing it for all axis values is unnecessary to achieve reconstruction because `np.take(arr, inverse, axis=axis)` already performed the job except for `axis=None`. Thus, this keeps the change for axis=None, but reverts numpy/numpy#25553 for numerical axis.
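The difference between the two reconstruction routes can be illustrated with the 3-D array from the earlier review comment. This is a sketch under one assumption: the inverse is flattened first so the example behaves the same on every NumPy version, whatever shape it is returned with. The names `via_take` and `via_take_along_axis` are illustrative only.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [2, 3, 1]]] * 2 + [[[1, 3, 2], [2, 1, 3]]])
axis = 1

values, inverse = np.unique(arr, axis=axis, return_inverse=True)
inverse = inverse.reshape(-1)  # 1-D regardless of the returned shape

# np.take accepts the 1-D inverse directly along the chosen axis.
via_take = np.take(values, inverse, axis=axis)

# np.take_along_axis needs an index with the same number of dimensions
# as `values`, so the inverse is placed on `axis` with size-1 dims elsewhere.
index_shape = [1] * arr.ndim
index_shape[axis] = -1
via_take_along_axis = np.take_along_axis(
    values, inverse.reshape(index_shape), axis=axis
)
```

Both routes rebuild `arr`; the `np.take` form works with the 1-D inverse as is, which is the argument made above for keeping that shape when an integer axis is given.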

seberg · 2024-07-13T16:13:25Z

Just FYI, this is now reverted except for axis=None where the change made sense. For axis=0, etc. I think the old behavior was actually strictly better anyway.

mhvk reviewed Jan 8, 2024

numpy/lib/tests/test_arraysetops.py (resolved)

jakevdp mentioned this pull request Jan 8, 2024

Tracking issue: NumPy 2.0 Compatibility jax-ml/jax#19246

Closed

18 tasks

charris changed the title ~~unique_inverse: return appropriate shape for inverse_indices~~ BUG: Fix return shape of inverse_indices in unique_inverse Jan 10, 2024

ngoldbaum added the triage review Issue/PR to be discussed at the next triage meeting label Jan 10, 2024

mattip added triaged Issue/PR that was discussed in a triage meeting and removed triage review Issue/PR to be discussed at the next triage meeting labels Jan 10, 2024

jakevdp added 3 commits January 10, 2024 11:04

unique_inverse: return appropriate shape for inverse_indices

bfbdf98

add additional unique_inverse test

01be917

BUG: reshape inverse_values output for multi-dimensional np.unique

6903f6c

jakevdp force-pushed the unique-inverse branch from fbcedb6 to 6903f6c Compare January 10, 2024 19:56

mhvk approved these changes Jan 10, 2024

numpy/lib/_arraysetops_impl.py (outdated, resolved)

jakevdp added 2 commits January 10, 2024 13:13

fix versionchanged tag

4650167

fix line too long

120a77b

mhvk approved these changes Jan 11, 2024

mhvk added the 00 - Bug label Jan 11, 2024

mhvk added this to the 2.0.0 release milestone Jan 11, 2024

mhvk added the component: numpy.lib label Jan 11, 2024

mhvk merged commit 85f9f56 into numpy:main Jan 11, 2024

jakevdp deleted the unique-inverse branch January 11, 2024 00:54

jakevdp mentioned this pull request Jan 11, 2024

jnp.unique: make return_inverse shape match NumPy 2.0 jax-ml/jax#19320

Merged

rgommers mentioned this pull request Jan 11, 2024

New ndimage and RBFInterpolator test failures in pre-release CI job scipy/scipy#19867

Closed

rgommers mentioned this pull request Jan 11, 2024

MAINT: fix use of unique(..., return_inverse=True) scipy/scipy#19868

Merged

rgommers mentioned this pull request Jan 11, 2024

Clarify flattening behavior and fix required output shape for inverse indices in unique_* APIs data-apis/array-api#700

Closed

lesteve mentioned this pull request Jan 17, 2024

MAINT fix tests by passing 1d vector to unique scikit-learn/scikit-learn#28137

Merged

This was referenced Jan 18, 2024

Add development CI for NumPy 2.0 beta testing pyvista/pyvista#5450

Merged

numpy 2.0 fix for np.unique pyvista/pyvista#5493

Merged

honno mentioned this pull request Jan 22, 2024

ENH: Inverse indices from np.unique() sharing the input array's shape #20638

Closed

polsys mentioned this pull request Apr 3, 2024

Add NumPy 2.0 to testing polsys/ennemi#120

Merged

andyfaff mentioned this pull request May 1, 2024

DOC: fix np.unique release notes [skip cirrus] #26369

Merged

charris mentioned this pull request May 2, 2024

DOC: fix np.unique release notes [skip cirrus] #26373

Merged

seberg added the Numpy 2.0 API Changes label May 23, 2024

jakevdp mentioned this pull request Jun 21, 2024

BUG: np.unique with return_inverse and axis specification yields a wrong shape #26738

Closed

ylep mentioned this pull request Jul 2, 2024

compressed_segmentation encoder generates invalid data under NumPy 2.0 HumanBrainProject/neuroglancer-scripts#39

Closed

seberg mentioned this pull request Jul 11, 2024

API: Partially revert unique with return_inverse #26914

Merged

charris mentioned this pull request Jul 16, 2024

API: Partially revert unique with return_inverse #26961

Merged
