BUG, SIMD: Fix memory overlap in ufunc comparison loops by seiko2plus · Pull Request #22851 · numpy/numpy

seiko2plus · 2022-12-21T10:01:44Z

seberg · 2022-12-21T13:33:35Z

Isn't the problem that the IS_BLOCKABLE_BINARY_BOOL check is missing the (abs_ptrdiff(args[2], args[1]) >= (vsize) and similar checks that IS_BLOCKABLE_BINARY_SCALAR1 and IS_BLOCKABLE_BINARY_SCALAR2 have?

I guess this is fine as well, the more precise check probably doesn't matter in practice, but should we remove the macros if we don't expand the IS_BLOCKABLE_BINARY_BOOL to include the vsize based check?

seberg · 2022-12-21T13:38:42Z

This is confusing with the additional _BOOL versions! How about removing the _BOOL macros and using the normal macros instead? Maybe I am missing something, but it seems to me that should work?

seberg · 2022-12-21T13:55:10Z

Arg, sorry that was off, since the bool version deals with the output type being different. But, I do think something like this would also work:

diff --git a/numpy/core/src/umath/fast_loop_macros.h b/numpy/core/src/umath/fast_loop_macros.h
index 5274957f7..973e47850 100644
--- a/numpy/core/src/umath/fast_loop_macros.h
+++ b/numpy/core/src/umath/fast_loop_macros.h
@@ -381,15 +381,25 @@ abs_ptrdiff(char *a, char *b)
 #define IS_BLOCKABLE_BINARY_BOOL(esize, vsize) \
     (steps[0] == (esize) && steps[0] == steps[1] && steps[2] == (1) && \
      npy_is_aligned(args[1], (esize)) && \
-     npy_is_aligned(args[0], (esize)))
+     npy_is_aligned(args[0], (esize)) && \
+     (abs_ptrdiff(args[2], args[0]) >= (vsize) || \
+      abs_ptrdiff(args[2], args[0]) == 0) && \
+     (abs_ptrdiff(args[2], args[1]) >= (vsize) || \
+      abs_ptrdiff(args[2], args[1]) >= 0))
 
 #define IS_BLOCKABLE_BINARY_SCALAR1_BOOL(esize, vsize) \
     (steps[0] == 0 && steps[1] == (esize) && steps[2] == (1) && \
-     npy_is_aligned(args[1], (esize)))
+     npy_is_aligned(args[1], (esize)) && \
+     ((abs_ptrdiff(args[2], args[1]) >= (vsize)) || \
+      (abs_ptrdiff(args[2], args[1]) == 0)) && \
+     abs_ptrdiff(args[2], args[0]) >= (esize))
 
 #define IS_BLOCKABLE_BINARY_SCALAR2_BOOL(esize, vsize) \
     (steps[0] == (esize) && steps[1] == 0 && steps[2] == (1) && \
-     npy_is_aligned(args[0], (esize)))
+     npy_is_aligned(args[0], (esize)) && \
+     ((abs_ptrdiff(args[2], args[0]) >= (vsize)) || \
+      (abs_ptrdiff(args[2], args[0]) == 0)) && \
+     abs_ptrdiff(args[2], args[1]) >= (esize))
 
 /* align var to alignment */
 #define LOOP_BLOCK_ALIGN_VAR(var, type, alignment)\

In any case either seems fine to me if you have a preference for this version. But lets delete the faulty macro's if we go this route?

seiko2plus · 2022-12-22T03:15:47Z

I don't like to hide memory overlap tests behind macros, just one check at the beginning of the function and then using contiguous macros or going raw sounds safe and clean to me, we already follow this way for most of our dispatchable sources, therefore I deleted these macros since they are no longer used anywhere.

seberg · 2022-12-22T08:09:33Z

Yeah, I can see a point in not hiding them in a macro that is used for other things especially. Lets put this in then, it is indeed a pattern used elsewhere.

Thanks for the quick fix and sleuthing in what was wrong!

seberg · 2022-12-22T08:11:48Z

numpy/core/tests/test_umath.py

+    uf for uf in UFUNCS if uf.nin == 2
+]
+UFUNCS_BINARY_ACC = [
+    uf for uf in UFUNCS_BINARY if hasattr(uf, "accumulate") and uf.nout == 1


Btw. the hasattr shouldn't do anything right now. Checking here that ufunc.signature is None would make more sense to me.
But its mainly a nitpick and the rest is important for backporting.

Thanks for clarification, in that case, the following skip can be avoided:

numpy/numpy/core/tests/test_umath.py

Lines 4356 to 4357 in 1a0ea79

if ufunc.signature:

pytest.skip('For generic signatures only')

seiko2plus added 00 - Bug 09 - Backport-Candidate PRs tagged should be backported component: SIMD Issues in SIMD (fast instruction sets) code or machinery labels Dec 21, 2022

BUG, SIMD: Fix memory overlap on comparison accumulate

2862a1c

seiko2plus force-pushed the issue_22841 branch from 7a887b6 to 2862a1c Compare December 21, 2022 10:05

seiko2plus marked this pull request as ready for review December 21, 2022 10:12

seiko2plus changed the title ~~BUG, SIMD: Fix memory overlap on comparison accumulate~~ BUG, SIMD: Fix memory overlap in comparison ufunc loops Dec 21, 2022

seiko2plus changed the title ~~BUG, SIMD: Fix memory overlap in comparison ufunc loops~~ BUG, SIMD: Fix memory overlap in ufunc comparison loops Dec 21, 2022

MAINT: Remove unused umath macros IS_BLOCKABLE_BINARY_BOOL*

1a0ea79

seberg merged commit ae81e65 into numpy:main Dec 22, 2022

seberg reviewed Dec 22, 2022

View reviewed changes

charris mentioned this pull request Dec 22, 2022

BUG, SIMD: Fix memory overlap in ufunc comparison loops #22867

Merged

charris removed the 09 - Backport-Candidate PRs tagged should be backported label Dec 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

BUG, SIMD: Fix memory overlap in ufunc comparison loops#22851

BUG, SIMD: Fix memory overlap in ufunc comparison loops#22851
seberg merged 2 commits intonumpy:mainfrom
seiko2plus:issue_22841

seiko2plus commented Dec 21, 2022

Uh oh!

seberg commented Dec 21, 2022

Uh oh!

seberg commented Dec 21, 2022

Uh oh!

seberg commented Dec 21, 2022

Uh oh!

seiko2plus commented Dec 22, 2022 •

edited

Loading

Uh oh!

seberg commented Dec 22, 2022

Uh oh!

seberg Dec 22, 2022

Uh oh!

seiko2plus Dec 23, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	if ufunc.signature:
	pytest.skip('For generic signatures only')

Uh oh!

Conversation

seiko2plus commented Dec 21, 2022

Uh oh!

seberg commented Dec 21, 2022

Uh oh!

seberg commented Dec 21, 2022

Uh oh!

seberg commented Dec 21, 2022

Uh oh!

seiko2plus commented Dec 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

seberg commented Dec 22, 2022

Uh oh!

seberg Dec 22, 2022

Choose a reason for hiding this comment

Uh oh!

seiko2plus Dec 23, 2022

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

seiko2plus commented Dec 22, 2022 •

edited

Loading