-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Assignment of np.ma.masked to obect-type Array #9627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
phobson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. This looks good to me.
|
I'm digging into this a little deeper, btw |
|
So one things that's not clear to me is exactly what should happen. So I took a little survey of the behavior of assigning Floats
Ints
Objects
So, to me, it looks like this PR does what it aims to do. My only recommendation is that the test for this functionality go in a new test instead of being a new check in an existing test. |
|
Hi @phobson, thanks for looking at other cases that I hadn't considered in detail. I'll create a new test ... |
| x[...] = np.ma.masked | ||
| dx[...] = np.ma.masked | ||
|
|
||
| assert_eq(x.mask, da.ma.getmaskarray(dx)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that if I try to do assert_eq(x, dx) it fails, but I would have expected not to.
>>> x
masked_array(data=[--, --, --],
mask=[ True, True, True],
fill_value='?',
dtype=object)
>>> dx
dask.array<setitem, shape=(3,), dtype=object, chunksize=(2,), chunktype=numpy.MaskedArray>
>>>dx.compute()
masked_array(data=[--, --, --],
mask=[ True, True, True],
fill_value='?',
dtype=object) not sure why the assert_eq(x, dx) does not work.
But with assert_eq(x.mask, da.ma.getmaskarray(dx))
we are asserting the mask is the same, but not the resulting array like in all the other tests.
@jrbourbeau Do you have any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi - I think it's because np.ma.allclose (https://github.com/dask/dask/blob/2022.10.2/dask/array/utils.py#L177)
treats masked elements as being equal to co-located non-masked elements:
>>> import numpy as np
>>> a = np.ma.arange(3)
>>> b = np.ma.arange(3)
>>> a[0] = np.ma.masked
>>> b[1] = np.ma.masked
>>> print(a)
[-- 1 2]
>>> print(b)
[0 -- 2]
>>> np.ma.allclose(a, b, masked_equal=True)
TrueThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... I should also say that the test is checking that both arrays (np and dask) are masked everywhere, so unless we check the mask arrays themselves, we could get a false positive. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, we should check that the arrays are the same. Hence me asking why trying assert_eq(x, dx) fails. We should find a way to check for this, ideally without having to compute the dask array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ncclementi, I'm not sure if I need to do anything, here. I think that assert_eq(x, dx) is behaving as expected, in that it mimics how numpy behaves - so it's less that it fails, rather it's not designed to answer the question we're asking. I think that the only way two wholly masked arrays can be safely compared is by comparing their masks, which assert_eq can do easily. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi again, just checking that if there's anything outstanding on this issue? It'd be nice to see it in a January release, if possible :). Thanks.
|
cc @dask/array anyone around who can review this masked array PR? |
|
Hello, Just checking in again on this, as it is still needed. I really appreciate all of the Dask maintainers' work, and if you could find the time to have another look that would be great :) Many thanks, |
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ± 0 15 suites ±0 3h 23m 14s ⏱️ +24s For more details on these failures, see this check. Results for commit 6b56774. ± Comparison against base commit 17f83a4. |
phofl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, the asset_eq issue can be a follow up
|
thx |
np.ma.maskedto obect-typeArray#9626pre-commit run --all-files