ENH: add subok flag to stride_tricks (and thus broadcast_arrays)#4622
ENH: add subok flag to stride_tricks (and thus broadcast_arrays)#4622seberg merged 2 commits intonumpy:masterfrom
Conversation
|
I like this idea in general. I am wondering if it is too much noise, but in general it seems a good idea if we (or those that could make use of it like astropy ;)), add that keyword to other functions as well. |
numpy/lib/stride_tricks.py
Outdated
There was a problem hiding this comment.
We don't know that the input actually is an array subclass. Also we could use __array_wrap__ which might add support for things like the pandas stuff which are not arrays at all. I think that may be better, but I did not think about it in depth. Other then that, I would prefer if you write type=x.__class__ just to be clear (as opposed to dtype).
There was a problem hiding this comment.
OK, will change to type=. I thought about __array_wrap__, but since what broadcast_arrays does is very similar to what happens with, say, reshape, just handling this as if it is a new view seemed most appropriate.
There was a problem hiding this comment.
What I meant with not being an array subclass is that this must also work when x is for example a list. If we check for __array_attribute__ that is fine though. If we want to add this to a lot of functions, we really should add more helpers maybe. There already is a np.get_wrap or similar helper I think.
|
@seberg, yes, it would be useful in other functions as well, the various |
|
Not sure about that ;) |
|
Am much amused by your example! According to the documentation, |
|
Not sure, but the default |
numpy/lib/stride_tricks.py
Outdated
There was a problem hiding this comment.
Sorry, but still got to figure this out exactly first (including me). view already calls array finalize, so no need to finalize after the view. However, wrapping might still be better since it allows array likes such as pandas things to work. Double checking, it seems to me that reshape finalizes if we got a subclass and wraps otherwise, but wrapping should be more general. Still thinking :)
|
@seberg - carrying on in the main thread so we don't loose the discussion once I push the next installment... My sense, though, is that we're making it slightly too hard: p.s. The travis error seems spurious (a build hanging). |
|
Updated a comment; travis now passes. |
|
@seberg - did you have further comments on this? |
|
Well, still not sure if I don't prefer wrap. We have precedence of wrapping for example in |
|
@seberg - looking at the documentation for That said, I can see the case for Of course, in principle, we could behave more closely similar to [1] http://docs.scipy.org/doc/numpy/reference/arrays.classes.html |
|
@seberg - what shall we do with this one? It would be nice to reach some conclusion, so I can also provide this in astropy/astropy#2327 |
|
I still prefer using array wrap a little, but if we view it as an array creation function I am fine with either really. At some point it would be nice to move this into C in any case, but just wishing ;) |
|
Hmmm, array finalize is probably good. But now I am wondering if it isn't cleaner to call |
37e4a9d to
2d90fe6
Compare
|
OK, I rebased. @seberg - I'm not quite sure where in |
2d90fe6 to
26a02cd
Compare
numpy/lib/stride_tricks.py
Outdated
There was a problem hiding this comment.
What I meant was this. __array_interface__ is basically just one way to define an "array-like". So this function currently does not support most array likes out there at all (for example memoryviews, etc.), since they implement the buffer interface instead. To really support this, I think we should (first thing), call:
x = np.asanyarray(x)
here. After you have done that, __array_finalize__ is guaranteed to be defined. We will guarantee to return an array and not some array-like, which can be seen as an advantage or disadvantage. But I think I am fine with returning always an array (or array subclass here). On the other hand, the pandas guys might disagree?
There was a problem hiding this comment.
Hmmm, pandas implements array wrap but not array finalize. I kind of think it is fine to say that as_strided will always return an array, like np.asanyarray, but honestly this is a jungle, so if someone disagrees....
There was a problem hiding this comment.
@seberg - I guess adding x = np.asanyarray(x) deals with a somewhat separate issue, but obviously it is good to ensure the code is as general as possible, and this would allow one at least to remove the isinstance(x, np.ndarray) stanza from the if statement below.
I'll do that, but also a question: since with this change, we know that x is an ndarray subclass, might this also allow one to change the strides and shape more easily than through the DummyArray mechanism, so that the dtype and subclass issues get handled more elegantly as well?
There was a problem hiding this comment.
From C, you could do that very easily, but from python there is no other way really.
There was a problem hiding this comment.
And yeah, you are right, it deals with a seperate issue... I somewhat thought it might make this one simpler, too, but that doesn't change much.
There was a problem hiding this comment.
Oh, actually if you really hate DummyArray, you can also use the np.ndarray class constructor. You still will have to manually call finalize, etc. since that does not allow to set a parent.
|
Sorry for the delays. Just need to figure out the |
|
@seberg - OK, I made the changes, but in a slightly different way than you suggested, so please have a look. |
3a6c849 to
2b9064d
Compare
numpy/lib/stride_tricks.py
Outdated
There was a problem hiding this comment.
Since x is a subclass now, I think you don't need the try/except at all.
There was a problem hiding this comment.
Actually, I did the wrong test: if the subclass does not define __array_finalize__, then it is None and hence not callable. I now replaced it with an if statement.
2b9064d to
e859031
Compare
|
Thanks for the comment; addressed, and a test added to ensure subclasses without |
|
On It would allow one to remove the check on Do you think I should still do this here, or would this be better for a different PR, since it is getting a bit off-topic? (In princple, I can also back out the |
|
Actually, have to check what/if this does anynthing to object arrays (or void arrays with object fields). |
|
@seberg - I had almost started adding tests for record types at least, but felt too lazy... But can do so, as it would be good to improve the test coverage. For object arrays, would a test like the following be OK? Could you give me an example of "void arrays with object fields"? |
|
Actually, I am surprised it isn't an error :), it certainly is good, just need to see if it changed. |
|
@seberg - I came back to this again, and while a test with an object dtype works, a complex one does not in current master: This error disappears if instead of the assignment suggested above. However, this does not work for non-contiguous data and thus fails one of the tests. Since the void arrays with object fields fail in current master as well, however, I think this is not really relevant for the current PR, which just aims to get subclasses to work properly. Maybe we can agree that this is OK as is, and raise an issue for the above? |
|
We don't need that kind of error here. Since we don't mess with the dtype and the original array is in charge of doing handling reference counting, it is always safe to do. As long as we have less errors then before things are good. It fails a test which originally did not exist is what you mean and which never worked? In that case I guess we can call it a minor issue that can be fixed another time. |
|
Yeah, should be good for merging. Won't do it right now, but if I forget about it in the next couple of days, just ping me. Thanks for putting up with all my small worries :). |
|
@seberg - the promised ping to remind you to merge this! |
|
Ack :), lets put it in then. At some point we should maybe just move this whole stuff to C... |
ENH: add subok flag to stride_tricks (and thus broadcast_arrays)
|
Thanks for the patience :) |
As is,
np.broadcast_arraysdoes not preserve subclasses, even though for the shape manipulation subclasses should not be a problem. This PR adds asubokoption that allows one to change that, but isFalseby default, so that existing code will continue to behave as is.