-
-
Notifications
You must be signed in to change notification settings - Fork 2k
A general Masked class for Quantity and other ndarray subclasses #11127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d506726 to
a5e3e42
Compare
|
@taldcroft - I added brief documentation as well as more comments and docstrings to the code. I still need to think a bit about the auto-generation with |
taldcroft
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mhvk - thanks, the docs are a good place to start some API discussions, and a gentle introduction to reviewing this. :-)
|
Question - will this be able to support setting the mask via |
|
Question 2 - what's your sense of how much it would take to apply this to a non-ndarray class like |
|
@taldcroft - I updated the documentation page with your suggestions, and also made two changes that you had suggested elsewhere:
Both were fairly easy and will extend without problem to container classes like I'm trying to think what point is best to stop for this PR. One option would be about where it is, i.e., with a PR that does not yet touch anything else. Another option might be to see to what extent the new I'm also interested in experimenting with |
|
Just for amusement, some things already work better than I had thought possible: Latter is because display somehow loses the mask... Also, propagation will need better support of gufuncs: |
|
@mhvk - amazing progress. I need to find a block of time to give this a deeper review, but to be honest I expect that to be more functional than deep line-by-line of all the new stuff. Overall I agree this may be a decent waypoint to get this into master in order to facilitate subsequent smaller, more manageable, PRs. Just one comment which may disappoint you, but I'm not a huge fan of the strikethrough. There is this unresolvable philosophical schism of whether the data are "masked" or "invalid/missing". From the database perspective it is typically the latter. If you do a table join and there are missing elements, the data behind the mask is junk and should not be displayed. If you want a configuration option to use strikethrough that would be OK, but overall I'd prefer to not rock the boat quite so much and just stick with |
|
OK, sounds good! Right now, in a separate branch where I've been trying, it is definitely only an option to use strikethrough - indeed, beyond your good point that usually you don't actually want to see what is behind the mask, it seems sphinx/.rst does not support strikethrough so it will be difficult to document it! I don't think it needs to hold up the PR, but what I still do hope to ensure is that all erfa ufunc work (right now some of the gufunc among them do not - one needs to infer what axis is being acted on so one act on the mask similarly; not super-difficult, but a bit tricky and tedious). I'm really hopeful, though, that the follow-up PR for containers like |
|
Just made a few further updates to ensure that representations, coordinates and times can propagate masks if they are present (for |
|
@mhvk - just to let you know that I've been a bit slammed by day-job stuff but this is still on my radar. Holidays are coming! |
|
@taldcroft - no worries if this takes some time... it is a very big PR. In the meantime, I added support for structured dtype, since that is needed for I also have branch that uses this internally in |
3cce5b7 to
67ab884
Compare
67ab884 to
53c06ff
Compare
|
Rebased and updated with substantially increased coverage of numpy functions. Somewhat better documentation of functions where behaviour is non-obvious. |
Agreed. I think it would be confusing / fragile to do otherwise. |
Also no longer allow a default argument for fill_value, to avoid people assuming there is a value kept on the class like is the case for MaskedArray.
2801218 to
0389951
Compare
| utils: | ||
| - cextern/expat/**/* | ||
| - any: ['**/utils/**/*', '!astropy/utils/iers/**/*', '!docs/utils/iers.rst'] | ||
| - any: ['**/utils/**/*', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pllim - do these additional changes to the labeller make sense? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's correct.
p.s. 100 commits and +5,427 −112 😱
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was quite a bit of work. Though most of the lines is somewhat boilerplate in function_helpers and lots of tests...
|
I vote for 4.3 based on precedent from modeling. Essentially all of this is in a new sub-subpackage and it has no impact on existing code. At least that is my understanding. Getting more testing on this from a broad audience will be crucial to flushing out issues and potentially API details. |
|
For 5.0 we should have |
|
OK, sounds good! Since #11210 is in, and it seems the rebase was OK too, I'll merge, even though I am sure that if I went over the code again, I'd find some things to improve. But let that be follow-up that is more reviewable! Thanks for the reviews, @taldcroft and @nstarman. |
|
I think this definitely needs a What's New entry. 😉 |
|
🎉 💥 |
|
I think this broke |
|
Darn. But that's an easy fix at least (tells you how long ago this PR started, that I still had relative imports...). |
This commit removes documentation that has been made obsolete by pull requests astropy#9340 and astropy#11127.


Split off from #10423, to include only regular arrays and Quantity and its subclasses. From #10423, the general comments still hold:
It may be worth pointing out two specific things:
MaskedArray- any time there is any items in the reduction masked, the result is masked as well. So, e.g., for.sum(), masked elements are not implicitly replaced by zero. The logic is that one cannot interpret the sum meaningfully without knowing how many elements actually contributed. Conversely, following that same logic, for methods where skipping a masked element does make sense (like.mean(), where one divides by the number of unmasked elements), I do use it.Currently, the implementation does not support structured arrays. The question for those, which I haven't really found a "best" answer for, is whether the mask should be per full item, or whether the mask itself should also be a structured array, i.e., allow masking individual pieces.EDIT: followedMaskedArraydoes the latter and it is quite a hassle to keep track, but perhaps better.MaskedArrayand have a structured mask.This is definitely at a point where some further design choices have to be made, and it would be good to step back a bit and decide what exactly we want masking to mean.
Part of the difficulty is that it cannot immediately used for
MaskedColumn, since some design choices which are different fromMaskedArray, so it would need a kind of compatibility layer if we want to keep exact one-to-one matches, and/or some way to decide which type one wants.fixes #1852, #1857, #7367, #10792