BUG: Compiler-options-dependent bug in np.square for complex numbers affecting numpy-1.26.4 on macOS on ARM #26940

Description

@yairchu

Describe the issue:

A simple program which calls np.square twice on the exact same input vector produces different results for two consecutive calls.

IIUC (which I'm not sure about) this happens because:

  • The first result vector allocated for np.square ends up close to the input vector, which was allocated shortly before it. This makes it fail the is_mem_overlap check and fall back to CDOUBLE_square's loop_scalar
  • loop_scalar is plain C code which may or may not use fused-multiply-add depending on compiler versions or options
  • The SIMD code is used to produce the second result, and it does use fused-multiply-add regardless of the compiler
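
To illustrate why the two code paths can disagree, here is a minimal sketch of how a fused multiply-add changes the rounding of the real part of a complex square, re*re - im*im. The fma helper is hypothetical: it emulates a single-rounding multiply-add with exact rational arithmetic from the standard library, and the placement of the fused step is an assumption. It does not claim to match the exact instruction sequence of NumPy's SIMD kernel, so the printed difference for the vector above may or may not equal the one reported in this issue.

```python
from fractions import Fraction


def fma(x, y, z):
    """Emulate fused multiply-add: compute x*y + z exactly, then round once.

    Hypothetical helper for illustration only (finite inputs assumed).
    """
    return float(Fraction(x) * Fraction(y) + Fraction(z))


def square_real_plain(re, im):
    # Two separately rounded products, then a rounded subtraction.
    return re * re - im * im


def square_real_fused(re, im):
    # im*im is rounded, but re*re feeds the subtraction unrounded.
    return fma(re, re, -(im * im))


# A case where the effect is easy to verify by hand:
# x*x = 1 + 2**-29 + 2**-60 exactly, and the last term is lost when rounding.
x = 1.0 + 2.0**-30
print(x * x - (1.0 + 2.0**-29))        # plain: the 2**-60 term is gone
print(fma(x, x, -(1.0 + 2.0**-29)))    # fused: the 2**-60 term survives

# The value from this issue (real part only).
re, im = -5.171866611150749e-07, 2.5618634555957426e-07
plain = square_real_plain(re, im)
fused = square_real_fused(re, im)
print("plain:", plain, "fused:", fused, "difference:", fused - plain)
```

Both results are valid roundings of the same expression. The problem described here is only that NumPy may pick one or the other depending on where the output buffer happens to be allocated.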

Reproduce the code example:

import numpy as np

vec = np.array([-5.171866611150749e-07 + 2.5618634555957426e-07j, 0, 0])

def compute():
    return np.square(vec)

first_res = compute()
second_res = compute()

print(
    "Results are consistent."
    if (first_res == second_res).all()
    else "INCONSISTENT!"
)
print("Difference:", second_res - first_res)

Error message:

INCONSISTENT!
Difference: [2.5243549e-29+0.j 0.0000000e+00+0.j 0.0000000e+00+0.j]

Python and NumPy Versions:

1.26.4
3.12.4 (main, Jun 6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.3.9.4)]

Runtime Environment:

[{'numpy_version': '1.26.4',
'python': '3.12.4 (main, Jun 6 2024, 18:26:44) [Clang 15.0.0 '
'(clang-1500.3.9.4)]',
'uname': uname_result(system='Darwin', node='Sounds-MacBook-Pro.local', release='23.4.0', version='Darwin Kernel Version 23.4.0: Fri Mar 15 00:10:42 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6000', machine='arm64')},
{'simd_extensions': {'baseline': ['NEON', 'NEON_FP16', 'NEON_VFPV4', 'ASIMD'],
'found': ['ASIMDHP'],
'not_found': ['ASIMDFHM']}}]

Context for the issue:

  • While dormant in numpy-2.0.0, I suspect this bug may reappear
    • Perhaps my example or something similar could be added as a test to verify it stays fixed
  • I suspect this issue affects more operations. IIRC I originally found it with complex vector multiplication, but minimized the example to this one.
  • For details of my investigation see https://github.com/yairchu/numpy-floats-bug
    • I failed to build numpy from source in a way that matches the pip-installed version, so I analyzed what's going on by looking at the assembly (my first time reading ARM assembly!)
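
A regression test along the lines suggested above could look roughly like the following. This is only a sketch: square_is_consistent is a hypothetical helper name, and the idea that keeping earlier results alive and passing an explicit out= array exercises both the scalar and SIMD paths is an assumption based on the analysis in this issue, not something confirmed against the NumPy sources.

```python
import numpy as np


def square_is_consistent(vec, repeats=4):
    """Return True if repeated np.square calls agree bit for bit."""
    # Keep every result alive so that each call gets a fresh output buffer,
    # possibly at a different distance from the input buffer.
    results = [np.square(vec) for _ in range(repeats)]

    # Also write into an explicitly preallocated output array.
    out = np.empty_like(vec)
    np.square(vec, out=out)
    results.append(out)

    first = results[0]
    return all(np.array_equal(first, r) for r in results[1:])


vec = np.array([-5.171866611150749e-07 + 2.5618634555957426e-07j, 0, 0])
print("Results are consistent." if square_is_consistent(vec) else "INCONSISTENT!")
```

Because the outcome depends on allocation patterns, a passing run does not prove the bug is absent. A failing run is enough to show that it is present.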
