ENH ufunc called on memmap return a ndarray#7406
Conversation
|
Years ago, had started something like this, but had to be a mit more complex, because of subclasses of memmap. You can find it here: #2932 |
numpy/core/tests/test_memmap.py
Outdated
There was a problem hiding this comment.
I suspect that the test failure is because self.data starts with zero, which causes problems with division.
|
Not sure if the counts as a bug/enhancement/change in behavior. If the latter, might need to start with a |
|
Looks good, though I'm pretty sure in-place operations (eg Also, I wonder if this is an opportunity to try out the new |
7826a7e to
137afc3
Compare
|
Well, looks like Here's a possible fix for def __array_wrap__(self, arr, context=None):
if np.may_share_memory(self, arr):
return arr
if arr.shape == ():
return arr[()]
return arr.view(ndarray)so now you can also do More generally, there are 2 things I am not 100% sure of yet which we need to be more sure of:
|
|
Dunno. Don't even know what my code was compared to this (why did I Only thing I would be worried about that memmaps have some attributes About 1. Yes, we are sure ufuncs never return 0-d arrays (except for Anyway, if we are not sure about all of these things, possibly we could |
|
Don't have time to read through all of this in detail right now, but as a general note I'm +10 on making it so that memmaps don't propagate through ufuncs. I'd also vote against requiring a |
|
Should get a mention in the 1.12.0-notes. |
|
@lesteve Ping. |
137afc3 to
6d7f1aa
Compare
|
I added something in doc/release/1.12.0-notes.rst. I haven't had time to work on this in the past few days. I will try to understand better what was done in #2932. |
|
@lesteve My impression of the difference between this PR and #2932 is that #2932 tries to treat subclasses of memmap specially and handles I will admit I don't really understand a lot of #2932. It appears to be trying to take care of multiple inheritance, involving inheritance from However, I do think #2932 has the right idea in to return an ndarray, since the result is a new array rather than a view. However In summary, my impression is that we can forget about trying to fix multiple inheritance like #2932, however we should somehow take care of fancy indexing, and I also suggest adding the |
6d7f1aa to
609d19a
Compare
609d19a to
838dab9
Compare
|
I think this PR is ready to be merged as is, if there are no objections. Although I have a little bit of a feeling there may be some "unknown unknowns" here, after sitting on it I haven't though up any more cases to take care of, and I'm not too worried about the multiple subclass case. Plus, I think this is easy to undo later if there are problems, and it can get some testing if we merge. Maybe we can wait a little for anyone else to speak up, and then merge. |
numpy/core/memmap.py
Outdated
There was a problem hiding this comment.
Just to be on the safe side, it might be good to move it after the may_share_memory call? I don't think we should ever return 0-d arrays from functions calling array_wrap, but I am not perfectly sure.
There was a problem hiding this comment.
Then again, maybe it doesn't matter.
There was a problem hiding this comment.
Just to be on the safe side, it might be good to move it after the may_share_memory call?
Done.
|
If there may be problems with subclasses, I would feel better if there is a note to the mailing list. |
|
Otherwise, I trust your assessment :). |
838dab9 to
944ece3
Compare
|
☔ The latest upstream changes (presumably #7325) made this pull request unmergeable. Please resolve the merge conflicts. |
|
I am on holiday for a week with a very spotty internet connection. I'll
do that when I get back home.
|
|
@lesteve - the problem I mentioned is that in your example, |
@mhvk this PR does have the behaviour you want. Thanks for the explanations, I naively expected |
|
OK, I now see why this does work (should have seen it before): when the output is an memmap, it is |
numpy/core/memmap.py
Outdated
There was a problem hiding this comment.
So, a remaining suggestion is to replace this statement with the simpler (and faster)
# Return the object it is was given as an output and should thus preserve type
if arr is self:
return arr
There was a problem hiding this comment.
Playing a bit around, I wasn't able to find an edge case in __array_wrap__ where np.may_share_memory(self, arr) and arr is self disagreed, so I made the advised change.
If someone else can see an edge case that I missed, please complain !
0d0c8e6 to
cc8c8b6
Compare
|
Let me know, I won't be able to work on it until Monday though. |
|
All right, from a brief look, astropy code doesn't seem to subclass memmap anywhere so @mhvk's example of In that case I vote to keep things simple and ignore the subclass issue, and merge this as is. Is that all right @mhvk? Anyone in the future who wants to subclass memmap will be able to override our behavior if they want. |
|
@ahaldane - I'm half curious about what goes wrong with the inheritance |
|
@mhvk, maybe I am mistaken. I was bothered by the fact that ufuncs and fancy slices of instances of @lesteve, sorry for going back and forth on this, but could you make the and a unit test could be |
Special case for reduction functions (e.g. np.sum with axis=None) that return a numpy scalar. Keep original memmap subclasses behavior to be on the safe side.
I find that slightly bothering too. In my mind, it is very unlikely that one would want a B 0d array for the result of np.mean and for fancy slices I am a bit more split but not really convinced either. I'd be interested if someone has some kind of use case where this behaviour makes sense. I made the requested change anyway. |
cc8c8b6 to
8f27d12
Compare
@embray I agree. To fix the joblib/joblib@b68a868#diff-19004d02cec64e3d448b33565de338ebR94 I never finished that work but if people are interested I could contribute it to numpy as a saner alternative to |
|
@ogrisel Yes, that looks useful. For myself, I'm using |
|
@lesteve - for the subclass case, I do think it is best to default to returning the subclass. It is quite possible that the subclass stores information that is useful even for a 0-d array (e.g., a Overall, it seems to me this is a good addition as is, and with all the tests, etc., it seems ready to go in. |
Fair enough, thanks for the explanation! |
|
Looks good, I'll merge in an hour or two. |
|
Merged, thanks @lesteve and the many reviewers. |
|
Great, thanks a lot ! |
memmap behavior is about to change with merge of numpy/numpy#7406. Test for behavior change so we can adapt expected mmap return values on an image.
memmap behavior is about to change with merge of numpy/numpy#7406. Test for behavior change so we can adapt expected mmap return values on an image.
memmap behavior is about to change with merge of numpy/numpy#7406. Test for behavior change so we can adapt expected mmap return values on an image.
Special case for reduction functions (e.g. np.sum with axis=None) that
return a numpy scalar.
Fix #7403. Implemented
__array_wrap__following @ahaldane's suggestion. Should fix #667 as well.