-
-
Notifications
You must be signed in to change notification settings - Fork 12k
BUG: ma.compressed() function returns masked array #4039
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
numpy/ma/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right. May be weird, but I think the subclass of the data is conserved, so this needs to check explicitely for non-masked arrays. Also, it needs a .ravel(). And I am wondering if compressed should say somewhere in the docs that it is not necessary a copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it needs a .ravel().
In this case we have one more bug:
>>> ma.compressed(ma.zeros((2,3))).shape
(2, 3)
(Checked with master and 1.7.1.)
Note that np.asanyarray() is just array(a, copy=False, subok=True). My intention was to just change subok to False, but it looks like I inadvertently changed copy to True.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be weird, but I think the subclass of the data is conserved, so this needs to check
explicitly for non-masked arrays.
@seberg - I don't understand this comment. Are you saying that if x is a MaskedArray with mask set to nomask and data being a subclass of ndarray this subclass should be preserved? I am not sure how this situation is possible with the current implementation of MaskedArray as a subclass of ndarray. Can you show a test case where you think my code produces a wrong result?
I am starting to think that it is best to rewrite ma.compressed() simply as
def compressed(x):
return asarray(x).compressed()
and leave optimization of non-masked case for the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what you said is what I meant. I am pretty sure that this is currently possible (even if I find it a bit weird), but I never really used them, so...
There is already an optimization for that in the compressed attribute. I believe the main purpose here is to support non-masked arrays. So anything that is not a masked array must be .ravel()ed I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, missed the first comment. Yes, but it is documented to ravel, and the attribute does so, so that it does not ravel is a bug I think.
|
@seberg: I added a commit that reimplements ma.compressed() function simply as ma.asarray(x).compressed(). I believe this is ultimately correct behavior. I also added some tests that should help validate any future optimization. It may be best to just leave this implementation because as this bug and my first attempt to fix it show, it may not be easy to optimize the non-masked case correctly and the effort is better spent optimizing asarray() or the compressed() method. |
numpy/ma/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets at least add an if isinstance(x, MaskedArray): to avoid the np.ma.asarray in that case. The conversion seems quite slow (though that special case could probably be also optimized in ma.asarray).
|
Hey, I think it should be good. Would be nice if you can squash those follow up commits into the first. |
|
Is it necessary to check |
|
Yes, should be good enough. |
When a masked array with nomask was passed to ma.compressed() function, the result was the original masked array instead of an ndarray. Closes numpygh-4026.
|
@seberg - I combined my commits in one and fixed a case when input is an instance of a MaskedArray subtype that overrides .compressed(). New code calls overriding method in this case. @Nodd - I would like to address ma.asarray() changes separately. See gh-4043. This is a much more used function than ma.compressed() and not making a copy may change behavior in some cases. At the very least gh-4043 is an ENH rather than a BUG, so it is good to keep the two issues separately. |
|
Can't say I currently care about the fact that the check might as well be only in asarray. If you like you can undo this again in the other thing (and probably other occurances as well). Anyway, I think this is correct, and doubt that it is often called on non-masked arrays (which could be optimized), so merging. |
BUG: ma.compressed() function returns masked array
When a masked array with nomask was passed to ma.compressed()
function, the result was the original masked array instead of
an ndarray.
Closes gh-4026.