Skip to content

BUG: Fix for npyv__trunc_s32_f32 (VXE)#23077

Merged
seiko2plus merged 2 commits intonumpy:mainfrom
pradghos:fix_sin_cos_vxe
Jan 25, 2023
Merged

BUG: Fix for npyv__trunc_s32_f32 (VXE)#23077
seiko2plus merged 2 commits intonumpy:mainfrom
pradghos:fix_sin_cos_vxe

Conversation

@pradghos
Copy link
Copy Markdown
Contributor

@pradghos pradghos commented Jan 24, 2023

np.sin(), np.cos() are giving erroneous result for float32 (VXE).
This PR is fixing npyv__trunc_s32_f32(npyv_f32 a) to resolve the issue.

np.sin() reproduce (without fix)

>>> c = np.array( [-25.091976, 90.14286], dtype=np.float32)
>>> np.sin(c)
array([0.04075377, 0.5707917 ], dtype=float32)
>>> 

>>> c = np.array( [90.14286], dtype=np.float32)
>>> np.sin(c)
array([0.8210949], dtype=float32)
>>>

>>> c = np.array( [-25.091976,  90.14286,   46.39879], dtype=np.float32)
>>> np.sin(c)
array([ 0.04075377, -0.5707917 ,  0.6632113 ], dtype=float32)  

After the fix:

>>> import numpy as np
>>> c = np.array( [-25.091976, 90.14286], dtype=np.float32)
>>> np.sin(c)
array([0.04075377, 0.8210949 ], dtype=float32)

>>> c = np.array( [90.14286], dtype=np.float32)
>>> np.sin(c)
array([0.8210949], dtype=float32)

>>> c = np.array( [-25.091976,  90.14286,   46.39879], dtype=np.float32)
>>> np.sin(c)
array([0.04075377, 0.8210949 , 0.6632113 ], dtype=float32)
>>>

Test failure without fix -

FAILED ../../core/tests/test_umath.py::TestAVXFloat32Transcendental::test_sincos_float32 - AssertionError: Arrays are not almost equal up to 2 ULP (max difference is 2.13071e+09 ULP)
FAILED ../../core/tests/test_umath.py::TestAVXFloat32Transcendental::test_strided_float32 - AssertionError: X and Y are not equal to 2 ULP (max is 6.79208e+06)
FAILED ../../core/tests/test_umath_accuracy.py::TestAccuracy::test_validate_fp16_transcendentals[cos] - AssertionError: Arrays are not almost equal up to 1 ULP (max difference is 30720 ULP)
FAILED ../../core/tests/test_umath_accuracy.py::TestAccuracy::test_validate_fp16_transcendentals[sin] - AssertionE

cc @Andreas-Krebbel @potula-chandra @andrewsi-z

np.sin(), np.cos() are giving erroneous result for float32 (VXE)
This PR is fixing `npyv_s32 npyv__trunc_s32_f32(npyv_f32 a)`
to resolve the issue.
@pradghos pradghos changed the title BUG: Fix for npyv_s32 npyv__trunc_s32_f32 (VXE) BUG: Fix for npyv__trunc_s32_f32 (VXE) Jan 24, 2023
Copy link
Copy Markdown

@Andreas-Krebbel Andreas-Krebbel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace __builtin_s390_vflls with npyv_doublee again. __builtin_s390_vflls will fail to build with clang. The npyv_doublee definition already has the logic to distingiush between gcc and clang.

#elif defined(NPY_HAVE_VXE)
return vec_packs(vec_signed(npyv_doublee(a)), vec_signed(npyv_doublee(vec_mergel(a, a))));
return vec_packs(vec_signed(__builtin_s390_vflls(vec_mergeh(a,a))),
vec_signed(__builtin_s390_vflls(vec_mergel(a, a))));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that npyv_doublee is simply a
#define npyv_doublee __builtin_s390_vflls
in utils.h So it would have been fine to continue using npyv_doublee instead of __builtin_s390_vflls

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Andreas-Krebbel ! I have updated in latest commit.

@pradghos
Copy link
Copy Markdown
Contributor Author

Hi @seiko2plus, Please let us know your comments. Thank you !

@seiko2plus seiko2plus added 06 - Regression 09 - Backport-Candidate PRs tagged should be backported component: SIMD Issues in SIMD (fast instruction sets) code or machinery labels Jan 25, 2023
@seiko2plus seiko2plus merged commit 930addb into numpy:main Jan 25, 2023
@seiko2plus
Copy link
Copy Markdown
Member

@pradghos, LGTM, Thanks for the fix

@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Feb 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

00 - Bug 06 - Regression component: SIMD Issues in SIMD (fast instruction sets) code or machinery

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants