BUG: core: in dot(), make copies if out has memory overlap with input#8539
BUG: core: in dot(), make copies if out has memory overlap with input#8539charris merged 4 commits intonumpy:masterfrom
Conversation
BLAS does not allow aliased inputs. If user-provided out= argument may overlap in memory with one of the inputs to dot(), put the output to a temporary work array and copy back after the operation.
| Py_INCREF(out); | ||
| ret = out; | ||
| } | ||
| Py_INCREF(out); |
There was a problem hiding this comment.
Is this needed? Looks like it has already been done.
There was a problem hiding this comment.
I don't like the variable naming, it should be more clear what the difference between ret and result is
There was a problem hiding this comment.
@charris: yes it's needed, since both variables are owned references.
@juliantaylor: agreed, would be best to change.
| Py_DECREF(ap1); | ||
| Py_DECREF(ap2); | ||
| return PyArray_Return(ret); | ||
| Py_DECREF(ret); |
There was a problem hiding this comment.
Might note that this triggers copyback if needed.
numpy/core/tests/test_multiarray.py
Outdated
| def test_dot_out_mem_overlap(self): | ||
| np.random.seed(1) | ||
|
|
||
| for dtype in [np.object_, np.float32, np.complex128, np.int64]: |
There was a problem hiding this comment.
Should this also test float64 and complex64?
There was a problem hiding this comment.
do we need multiple dtype tests at all? none of the new code has any relations to type. Doesn't matter much here as its a fast test, but I like to avoid unnecessary test bloat.
There was a problem hiding this comment.
There are two code paths --- blas and non-blas, so you need at least two dtypes.
numpy/core/tests/test_multiarray.py
Outdated
| x = np.dot(a, b, out=b) | ||
| assert_equal(x, y, err_msg=repr(dtype)) | ||
| # Test BLAS and non-BLAS code paths | ||
| for dtype in [np.float32, np.int64]: |
There was a problem hiding this comment.
Actually, I prefer the maximal tests because the code may change in the future. The tests not only check current behavior, they also define correct behavior for the future.
|
Thanks Pauli. |
|
Thank you @pv |
BLAS does not allow aliased inputs. If user-provided out= argument may
overlap in memory with one of the inputs to dot(), put the output to a
temporary work array and copy back after the operation.
Fixes gh-8440