BUG: npy_memchr has misaligned memory access #21116
Description
Describe the issue:
The function npy_memchr (in multiarray/common.h) has an optimisation for the case !NPY_ALIGNMENT_REQUIRED && needle == 0 && stride == 1, where it starts by finding the first nonzero 4-byte block before iterating within that block. However, the coarse-grained loop begins at haystack itself, and since the stride is 1 there is no guarantee that this pointer has any particular alignment.
While this is benign on x86_64, the misaligned access is nevertheless picked up by UBSan, as it is undefined behaviour according to the C standard. npy_memchr should first scan byte by byte until it reaches a 4-byte-aligned address before entering the coarse-grained loop.
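The proposed fix could look roughly like the sketch below. This is not NumPy's code: count_leading_zero_bytes is a hypothetical stand-alone helper that covers only the needle == 0, stride == 1 case, and the use of memcpy for the word load is my own assumption (it also sidesteps strict-aliasing concerns that a plain pointer cast would raise).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not NumPy's npy_memchr): returns the number of
 * leading zero bytes in buf[0..size).
 */
static size_t
count_leading_zero_bytes(const char *buf, size_t size)
{
    const size_t word = sizeof(unsigned int);
    size_t i = 0;

    /* Prologue: scan single bytes until buf + i is word-aligned. */
    while (i < size && ((uintptr_t)(buf + i) % word) != 0) {
        if (buf[i] != 0) {
            return i;
        }
        i++;
    }

    /* Coarse-grained loop: skip whole words that are entirely zero. */
    while (size - i >= word) {
        unsigned int v;
        /* memcpy load: assumption of this sketch, avoids a pointer cast. */
        memcpy(&v, buf + i, word);
        if (v != 0) {
            break;
        }
        i += word;
    }

    /* Epilogue: finish byte by byte inside the nonzero word or the tail. */
    while (i < size && buf[i] == 0) {
        i++;
    }
    return i;
}
```

The prologue guarantees that every word load in the coarse loop starts at an aligned address, which is the property UBSan checks. The result is the same whatever the alignment of buf; only the split between byte-wise and word-wise work changes.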
How to reproduce:
Build NumPy with UBSan and run a test suite containing a sufficiently large number of np.testing.assert_ calls. Since func_assert_same_pos constructs a boolean mask that is used to index into the arrays under test, with enough calls there's a decent chance that it will end up creating an unaligned boolean array at least once.
Error message:
third_party/py/numpy/core/src/multiarray/common.h:277:35: runtime error: load of misaligned address 0x... for type 'unsigned int', which requires 4 byte alignment
0x615000673205: note: pointer points here
01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
^
#0 0x... in npy_memchr third_party/py/numpy/core/src/multiarray/common.h:277:35
#1 0x... in array_boolean_subscript third_party/py/numpy/core/src/multiarray/mapping.c:1113:30
#2 0x... in array_subscript third_party/py/numpy/core/src/multiarray/mapping.c:1572:30
#3 0x... in PyObject_GetItem third_party/python_runtime/v3_7/Objects/abstract.c:182:26
...
SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use third_party/py/numpy/core/src/multiarray/common.h:277:35
NumPy/Python version information:
Bug exists at HEAD.