ENH: Improved performance of PyArray_FromAny for sequences of array-like#13399
ENH: Improved performance of PyArray_FromAny for sequences of array-like#13399mattip merged 1 commit intonumpy:masterfrom
Conversation
5cf5ee7 to
edcb829
Compare
edcb829 to
0187811
Compare
|
Running shows a marked improvement. Before After |
numpy/core/src/multiarray/ctors.c
Outdated
There was a problem hiding this comment.
For reviewers: the original code had a check clause at this point if (tmp != Py_NotImplemented){ } which has been folded into the logic below instead
|
LGTM. Maybe could add a benchmark to |
|
I worry that this needs to handle broken ctypes buffer formats in the same way we do elsewhere. Will try to look at the diff later |
0187811 to
1b57708
Compare
Great, but that should be a follow-on PR, not part of this one, unless it is a one-line fix |
|
@eric-wieser since this PR reuses existing code, could we push off possible broken ctypes PEP 3118 handling to a different issue/PR? This seems like a clear win for more common memoryview and |
1b57708 to
9f8826a
Compare
|
@mattip: At a cursory glance, I was worried this introduced a regression in the following case: Testing seems to show that the error is produced both before and after this patch, so there's no need to worry about it here. |
9f8826a to
2363084
Compare
Prior to this commit np.array([array_like]) would recursively copy each element of array_like. This is due to the fact that setArrayFromSequence only special-cased lists of NumPy arrays, any other object was treated as a sequence even if it supported buffer or __array*__ interfaces. See tensorflow/tensorflow#27692 for details. The commit generalizes the special-case in setArrayFromSequence to any array-like, i.e. a buffer or an object with __array__, __array_interface__ __array_struct__.
2363084 to
71fc59d
Compare
|
I created issue #13628 about This looks ready to merge, any more comments? |
|
Thanks @superbobry |
|
This seems to have broken pandas, where |
Fixup of gh-13399 to avoid a regression with pandas. The actual output after this is the same as before, but incorrect with respect to the output shape. However, it seems a bit larger and an issue to look at after the 1.17 release. * BUG: test, fix regression for array([pandas.DataFrame()]) * BUG: refactor to make sure tmp is DECREFfed
…rray-like" This reverts commit 71fc59d which was part of numpygh-13399 and the followup commit ba53a63 from numpygh-13663. The issue is that we mix Sequence protocol and `__array__` usage when coercing arrays. It seems easiest to revert this for the 1.17.x release and make a larger breaking change in the 1.18.x release cycle. Since this only occurs for stranger array-likes. This revert is limited to the 1.17.x branch for now.
…rray-like" This reverts commit 71fc59d which was part of numpygh-13399 and the followup commit ba53a63 from numpygh-13663. The issue is that we mix Sequence protocol and `__array__` usage when coercing arrays. It seems easiest to revert this for the 1.17.x release and make a larger breaking change in the 1.18.x release cycle. Since this only occurs for stranger array-likes. This revert is limited to the 1.17.x branch for now.
…rray-like" This reverts commit 71fc59d which was part of numpygh-13399 and the followup commit ba53a63 from numpygh-13663. The issue is that we mix Sequence protocol and `__array__` usage when coercing arrays. It seems easiest to revert this for the 1.18.x release and make a larger breaking change in the 1.19.x release cycle. Since this only occurs for stranger array-likes.
Prior to this commit
would recursively copy each element of
array_like. This is due to the fact thatsetArrayFromSequenceonly special-cased lists of NumPy arrays, any otherobject was treated as a sequence even if it supported buffer or
__array*__interfaces. See tensorflow/tensorflow#27692 for details.
The commit generalizes the special-case in
setArrayFromSequenceto anyarray-like, i.e. a buffer or an object with
__array__,__array_interface____array_struct__.