Fix vectorized calculations on POWER by Flamefire · Pull Request #59382 · pytorch/pytorch

Flamefire · 2021-06-03T16:54:27Z

This fixes multiple bugs introduced by the VSX optimized code in #41541

- min/max/clamp now consistently return NaN when any value is NaN, as on other architectures
- The non-complex angle functions now return PI for negative values
- The complex angle functions have been corrected and optimized
- The float32 log function implementation returned a wrong result when inf was passed (and maybe for other inputs); it is replaced by the Sleef function, just as for float64

Fixes #59248
Fixes #57537
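To make the min/max item above concrete, here is a minimal scalar sketch of "NaN-propagating" semantics. The helper names are hypothetical and this is not the VSX code from the PR; `std::fmin` is used only as a stand-in for a minimum that drops a single NaN operand.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar helpers for illustration only; the PR changes the
// VSX vector implementation, which is not reproduced here.

// Minimum that returns NaN if either input is NaN.
inline float nan_propagating_min(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a < b ? a : b;
}

// Maximum that returns NaN if either input is NaN.
inline float nan_propagating_max(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a > b ? a : b;
}

// std::fmin returns the non-NaN operand when exactly one operand is NaN.
// It stands in here for a minimum that does not propagate NaN.
inline float nan_dropping_min(float a, float b) { return std::fmin(a, b); }
```

With these helpers, `nan_propagating_min(NaN, 1.0f)` is NaN while `nan_dropping_min(NaN, 1.0f)` is `1.0f`, which is the kind of mismatch between architectures that the first item describes.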

facebook-github-bot · 2021-06-03T16:54:33Z

💊 CI failures summary and remediations

As of commit f34967a (more details on the Dr. CI page):

1/1 failures possibly* introduced in this PR
- 1/1 non-scanned failure(s)

ci.pytorch.org: 1 failed

Failed: pr/pytorch-linux-bionic-rocm4.2-py3.6

This comment was automatically generated by Dr. CI.

ezyang · 2021-06-04T16:19:02Z

Since we don't have any POWER CI, do you mind explaining briefly how you tested the changes here? (Just for posterity)

ezyang · 2021-06-04T16:24:51Z

Huh, what's going on here?

Ouch, sorry. I made that patch against 1.8.1 and ported it to master, where the vec256 class was renamed but not the paths (which I find confusing). Will fix this.

Flamefire · 2021-06-07T06:45:34Z

> Since we don't have any POWER CI, do you mind explaining briefly how you tested the changes here? (Just for posterity)

I compiled as usual and then ran test_unary_ufuncs.py and test_binary_ufuncs.py successfully; both failed before (see the linked issues).

BTW: I found some missing test coverage. The NaN propagation of min/max and of the clamp/clamp_min/clamp_max functions is not fully covered: min/max return NaN when either argument is NaN, but clamp only returns NaN when the second is. So clamp_min != max, which I missed at first, and it made another test fail in a way that was hard to track down. See dd62c89

Flamefire · 2021-06-07T10:29:05Z

@ezyang I rebased onto master, recompiled, and then ran the 2 test files. Everything that failed before now succeeds, but I am getting 2 new failures on master:

test_reference_numerics_hard_atanh_cpu_complex64
test_complex_edge_values_cpu_complex64

Those refer to acos and atanh which dispatch to the std variants and those return nan/inf for the tested input values (on PPC and x86) as can be checked with:

```cpp
#include <iostream>
#include <complex>

int main(){
  // Inputs taken from the two failing tests
  auto bar = std::acos(std::complex<float>(0, 1e+20));
  auto baz = std::atanh(std::complex<float>(-501, 1e+20));
  std::cout << bar << std::endl;
  std::cout << baz << std::endl;
}
```

--> (1.5708,-inf) (-nan,1.5708)

I'm not sure whether this has already been changed on x86 to dispatch to something else, in which case the same would need to be done on PPC. Maybe @malfet can comment here? I have seen him on the related issue #42952

ezyang · 2021-06-07T21:08:15Z

Seems OK to leave those fixes for later. acos seems related to #52310

facebook-github-bot · 2021-06-07T21:08:59Z

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2021-06-08T21:20:15Z

@ezyang merged this pull request in 40cbf34.

Summary:
This fixes multiple bugs introduced by the VSX optimized code in pytorch#41541

- min/max/clamp now consistently return nan when any value is NaN as on other architectures
- The non-complex angle functions return PI for negative values now
- The complex angle functions have been corrected and optimized
- The float32-log function implementation returned a wrong result when inf was passed (and maybe other inputs), replaced by the sleef function just as for float64

Fixes pytorch#59248
Fixes pytorch#57537

Pull Request resolved: pytorch#59382

Reviewed By: jbschlosser

Differential Revision: D28944626

Pulled By: ezyang

fbshipit-source-id: 1ae2782b9e34e458a19cec90617037654279e0e0
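The float32 log item in the summary above concerns edge-case inputs such as inf. As a hedged sketch, a log implementation can be compared against the values `std::log` gives on an IEEE 754 platform. The checker name below is hypothetical and the check is far from exhaustive; it is not the test used in the PR.

```cpp
#include <cmath>
#include <limits>

// Hypothetical checker: accepts any callable float -> float and verifies a
// few edge cases that a natural-log implementation should satisfy.
template <typename LogFn>
bool log_handles_edge_cases(LogFn log_fn) {
  const float inf = std::numeric_limits<float>::infinity();
  return log_fn(inf) == inf           // log(+inf) is +inf
      && log_fn(0.0f) == -inf         // log(+0) is -inf
      && log_fn(1.0f) == 0.0f         // log(1) is exactly 0
      && std::isnan(log_fn(-1.0f));   // log of a negative value is NaN
}
```

Passing a lambda that wraps `std::log` should return `true`; a hand-written approximation that mishandles inf would fail the first condition.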

Summary:
This fixes the remaining bug introduced by the VSX optimized code in #41541

Followup to #59382

### Description

The code currently returns wrong results on POWER9LE making e.g. the `test_binary_ufuncs` fail.

### Testing

Build and ran tests on PPC

Pull Request resolved: #82646
Approved by: https://github.com/ezyang

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/39ffad392c49aafcfeba05e2704bb1b666247471

Reviewed By: kit1980

Differential Revision: D38395263

fbshipit-source-id: c4c56af2d8e3b528b6418a4a32c63de77037e5cf

Replace the remaining hand-written code in vec256_float_vsx.h with calls to Sleef functions, similar to what was done in #59382 and #82646 after #41541.

This fixes wrong results for e.g. `sin(1e20)`.

Fixes #85978

To fix #85978 I only needed to change the sin/cos functions to make the test pass. However, to avoid running into the same issue again and again (see the previous PRs and issues), I checked the whole file for similar functions where a Sleef function could be used and changed those too. In the diff I noticed the faulty whitespace, so I fixed that as well; the file should now be done.

Pull Request resolved: #86453
Approved by: https://github.com/malfet
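The message above does not say how `sin(1e20)` went wrong. One rough way to spot a broken sin/cos pair at large arguments is a consistency check like the sketch below. The function name and the tolerance are assumptions made for illustration; this is not the test from the linked issue.

```cpp
#include <cmath>

// Hypothetical sanity check: for any x, a sin/cos pair should stay within
// [-1, 1] and satisfy sin^2 + cos^2 == 1 up to rounding error.
template <typename SinFn, typename CosFn>
bool sincos_consistent(SinFn sin_fn, CosFn cos_fn, float x) {
  const float s = sin_fn(x);
  const float c = cos_fn(x);
  return std::fabs(s) <= 1.0f && std::fabs(c) <= 1.0f &&
         std::fabs(s * s + c * c - 1.0f) < 1e-5f;  // tolerance is an assumption
}
```

The check only catches gross errors. An implementation could pass it and still return a wrong angle, so comparing against a reference value remains the stronger test.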

facebook-github-bot added the cla signed label Jun 3, 2021

Flamefire mentioned this pull request Jun 3, 2021

Multiple failures in test_unary_ufuncs on POWER #59248

Closed

pytorchbot added the open source label Jun 3, 2021

mruberry requested review from VitalyFedyunin and ezyang June 4, 2021 13:31

mruberry added the triaged label Jun 4, 2021

Flamefire force-pushed the fix_vsx_master branch from 2f299bb to dd62c89 Compare June 4, 2021 14:04

ezyang requested a review from anjali411 June 4, 2021 16:21

ezyang reviewed Jun 4, 2021

Flamefire force-pushed the fix_vsx_master branch 2 times, most recently from f219f92 to 34716f1 Compare June 7, 2021 08:07

Flamefire added 7 commits June 7, 2021 11:02

Make VSX implementation of min/max/clamp NaN-correct

61a192a

vec_min/max does not propagate NaNs, so take the Eigen implementation using asm to have x86 semantics.

Fix non-complex angle functions

9c047ca

They need to return PI for negative values

Fix complex angle functions

2f6131f

A typo caused every second value to be wrong. Also only calculate what is required, i.e. every second value
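For reference, the two angle commits above can be summarized with a scalar sketch. The helper names are hypothetical, and keeping NaN inputs as NaN in the real case is an assumption not stated in the commit messages. The vectorized code in the PR operates on whole VSX registers and is not shown here.

```cpp
#include <cmath>
#include <complex>

// Hypothetical scalar reference helpers; not the PR's VSX implementation.
constexpr float kPi = 3.14159265358979323846f;

// Angle of a real number: 0 for non-negative input, pi for negative input.
// Passing NaN through unchanged is an assumption of this sketch.
inline float real_angle(float x) {
  if (std::isnan(x)) return x;
  return x < 0.0f ? kPi : 0.0f;
}

// Angle of a complex number: atan2(imag, real), which matches std::arg.
inline float complex_angle(std::complex<float> z) {
  return std::atan2(z.imag(), z.real());
}
```

One plausible reading of "every second value" is that a vector of complex floats stores real and imaginary parts interleaved, so only one angle per (re, im) pair has to be computed.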

Use Sleef for the float32 log functions

59b9847

Same as float64 and fixes failures due to e.g. wrong treatment of inf

Reformat

16a0da6

Rework clamp functions which should be able to eliminate NaNs

8d7a08f

The clamp functions should return the first argument if any is nan
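A scalar sketch of the rule stated in this commit message, contrasted with a NaN-propagating max. The helper names and the operand order (value first, bound second) are assumptions made for illustration; this is not the VSX code from the PR.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar helpers. Comparisons involving NaN are false, so each
// clamp below falls through to the first argument `a` if any operand is NaN.
inline float clamp_min(float a, float lo) { return a < lo ? lo : a; }
inline float clamp_max(float a, float hi) { return a > hi ? hi : a; }

// NaN-propagating max for contrast: NaN if either operand is NaN.
inline float nan_max(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a > b ? a : b;
}
```

Under these assumptions `clamp_min(1.0f, NaN)` is `1.0f` while `nan_max(1.0f, NaN)` is NaN, so the two operations cannot share one implementation.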

Adapt to changes on master

f34967a

Flamefire force-pushed the fix_vsx_master branch from 34716f1 to f34967a Compare June 7, 2021 10:23

ezyang approved these changes Jun 7, 2021

facebook-github-bot closed this in 40cbf34 Jun 8, 2021

facebook-github-bot added the Merged label Jun 8, 2021

Flamefire deleted the fix_vsx_master branch June 9, 2021 06:35

Flamefire mentioned this pull request Jun 18, 2021

Failing tests in TestUnaryUfuncsCPU #60259

Closed

Flamefire mentioned this pull request Aug 2, 2022

Fix faulty, vectorized pow function on VSX #82646

Closed

Flamefire mentioned this pull request Oct 7, 2022

Fix vectorized trigonometric functions for VSX #86453

Closed

Flamefire wants to merge 7 commits into pytorch:master from Flamefire:fix_vsx_master

5 participants
