Skip to content

Fix vectorized calculations on POWER#59382

Closed
Flamefire wants to merge 7 commits intopytorch:masterfrom
Flamefire:fix_vsx_master
Closed

Fix vectorized calculations on POWER#59382
Flamefire wants to merge 7 commits intopytorch:masterfrom
Flamefire:fix_vsx_master

Conversation

@Flamefire
Copy link
Copy Markdown
Collaborator

This fixes multiple bugs introduced by the VSX optimized code in #41541

  • min/max/clamp now consistently return nan when any value is NaN as on other architectures
  • The non-complex angle functions return PI for negative values now
  • The complex angle functions have been corrected and optimized
  • The float32-log function implementation returned a wrong result when inf was passed (and maybe other inputs), replaced by the sleef function just as for float64

Fixes #59248
Fixes #57537

@facebook-github-bot
Copy link
Copy Markdown
Contributor

facebook-github-bot commented Jun 3, 2021

💊 CI failures summary and remediations

As of commit f34967a (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-scanned failure(s)

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@mruberry mruberry requested review from VitalyFedyunin and ezyang June 4, 2021 13:31
@mruberry mruberry added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jun 4, 2021
@ezyang
Copy link
Copy Markdown
Contributor

ezyang commented Jun 4, 2021

Since we don't have any POWER CI, do you mind explaining briefly how you tested the changes here? (Just for posterity)

@ezyang ezyang requested a review from anjali411 June 4, 2021 16:21
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, what's going on here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch sorry. I made that patch with 1.8.1 and ported it to master where the vec256 class was renamed but not the paths (which I find confusing) will fix this.

@Flamefire
Copy link
Copy Markdown
Collaborator Author

Since we don't have any POWER CI, do you mind explaining briefly how you tested the changes here? (Just for posterity)

I compiled as usual and then run test_unary_ufuncs.py and test_binary_ufuncs.py successfully which failed before (see the linked issues)

BTW: I found some missing test coverage: The NaN propagation of min/max and the clamp/clamp_min/clamp_max function is not fully covered. I.e. that min/max return NaN when either is NaN but clamp only returns NaN when the second is. So clamp_min!=max which I missed at first and it made another test fail with something hard to find. See dd62c89

@Flamefire Flamefire force-pushed the fix_vsx_master branch 2 times, most recently from f219f92 to 34716f1 Compare June 7, 2021 08:07
Flamefire added 7 commits June 7, 2021 11:02
vec_min/max does not propagate NaNs, so take the Eigen implementation
using asm to have x86 semantics.
They need to return PI for negative values
A typo caused every second value to be wrong.
Also only calculate what is required, i.e. every second value
Same as float64 and fixes failures due to e.g. wrong treatment of inf
The clamp functions should return the first argument if any is nan
@Flamefire
Copy link
Copy Markdown
Collaborator Author

@ezyang I rebased onto master and recompiled that, then ran the 2 test files. All succeeds which hasn't before but getting 2 new failures on master:

  • test_reference_numerics_hard_atanh_cpu_complex64
  • test_complex_edge_values_cpu_complex64

Those refer to acos and atanh which dispatch to the std variants and those return nan/inf for the tested input values (on PPC and x86) as can be checked with:

#include <iostream>
#include <complex>

int main(){
  auto bar = std::acos(std::complex<float>(0, 1e+20));
  auto baz = std::atanh(std::complex<float>(-501, 1e+20));
  std::cout << bar << std::endl;
  std::cout << baz << std::endl;
}

--> (1.5708,-inf) (-nan,1.5708)

I'm not sure if this is changed on x86 already to dispatch to something else which would need to be done too on PPC, maybe @malfet can comment here? Seen him on the related issue #42952

@ezyang
Copy link
Copy Markdown
Contributor

ezyang commented Jun 7, 2021

Seems OK to leave those fixes for later. acos seems related to #52310

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@ezyang merged this pull request in 40cbf34.

@Flamefire Flamefire deleted the fix_vsx_master branch June 9, 2021 06:35
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary:
This fixes multiple bugs introduced by the VSX optimized code in pytorch#41541

- min/max/clamp now consistently return nan when any value is NaN as on other architectures
- The non-complex angle functions return PI for negative values now
- The complex angle functions have been corrected and optimized
- The float32-log function implementation returned a wrong result when inf was passed (and maybe other inputs), replaced by the sleef function just as for float64

Fixes pytorch#59248
Fixes pytorch#57537

Pull Request resolved: pytorch#59382

Reviewed By: jbschlosser

Differential Revision: D28944626

Pulled By: ezyang

fbshipit-source-id: 1ae2782b9e34e458a19cec90617037654279e0e0
pytorchmergebot pushed a commit that referenced this pull request Aug 3, 2022
This fixes the remaining bug introduced by the VSX optimized code in #41541

Followup to #59382

### Description

The code currently returns wrong results on POWER9LE making e.g. the `test_binary_ufuncs` fail.

### Testing

Build and ran tests on PPC
Pull Request resolved: #82646
Approved by: https://github.com/ezyang
facebook-github-bot pushed a commit that referenced this pull request Aug 4, 2022
Summary:
This fixes the remaining bug introduced by the VSX optimized code in #41541

Followup to #59382

### Description

The code currently returns wrong results on POWER9LE making e.g. the `test_binary_ufuncs` fail.

### Testing

Build and ran tests on PPC

Pull Request resolved: #82646
Approved by: https://github.com/ezyang

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/39ffad392c49aafcfeba05e2704bb1b666247471

Reviewed By: kit1980

Differential Revision: D38395263

fbshipit-source-id: c4c56af2d8e3b528b6418a4a32c63de77037e5cf
pytorchmergebot pushed a commit to Flamefire/pytorch that referenced this pull request Oct 10, 2022
Replace the remaining hand-written code in vec256_float_vsx.h by calls
to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541
Also remove some whitespace wrongly added in the above PRs.

This fixes wrong results for e.g. `sin(1e20)`.
Fixes pytorch#85978
Flamefire added a commit to Flamefire/pytorch that referenced this pull request Nov 22, 2022
Replace the remaining hand-written code in vec256_float_vsx.h by calls
to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541
Also remove some whitespace wrongly added in the above PRs.

This fixes wrong results for e.g. `sin(1e20)`.
Fixes pytorch#85978
pytorchmergebot pushed a commit that referenced this pull request Nov 22, 2022
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in #59382 & #82646 after #41541

This fixes wrong results for e.g. `sin(1e20)`.
Fixes #85978

To fix #85978 I only needed to do the sin/cos functions to make the test pass but to not encounter the same issue again and again (see the previous PRs and issues) I checked the whole file for similar functions where a Sleef function could be used and changed those too. In the diff I've noticed the faulty whitespace so to make this complete I fixed that too, so it should now be done.

Pull Request resolved: #86453
Approved by: https://github.com/malfet
kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541

This fixes wrong results for e.g. `sin(1e20)`.
Fixes pytorch#85978

To fix pytorch#85978 I only needed to do the sin/cos functions to make the test pass but to not encounter the same issue again and again (see the previous PRs and issues) I checked the whole file for similar functions where a Sleef function could be used and changed those too. In the diff I've noticed the faulty whitespace so to make this complete I fixed that too, so it should now be done.

Pull Request resolved: pytorch#86453
Approved by: https://github.com/malfet
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
Summary:
This fixes multiple bugs introduced by the VSX optimized code in pytorch#41541

- min/max/clamp now consistently return nan when any value is NaN as on other architectures
- The non-complex angle functions return PI for negative values now
- The complex angle functions have been corrected and optimized
- The float32-log function implementation returned a wrong result when inf was passed (and maybe other inputs), replaced by the sleef function just as for float64

Fixes pytorch#59248
Fixes pytorch#57537

Pull Request resolved: pytorch#59382

Reviewed By: jbschlosser

Differential Revision: D28944626

Pulled By: ezyang

fbshipit-source-id: 1ae2782b9e34e458a19cec90617037654279e0e0
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
This fixes the remaining bug introduced by the VSX optimized code in pytorch#41541

Followup to pytorch#59382

### Description

The code currently returns wrong results on POWER9LE making e.g. the `test_binary_ufuncs` fail.

### Testing

Build and ran tests on PPC
Pull Request resolved: pytorch#82646
Approved by: https://github.com/ezyang
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541

This fixes wrong results for e.g. `sin(1e20)`.
Fixes pytorch#85978

To fix pytorch#85978 I only needed to do the sin/cos functions to make the test pass but to not encounter the same issue again and again (see the previous PRs and issues) I checked the whole file for similar functions where a Sleef function could be used and changed those too. In the diff I've noticed the faulty whitespace so to make this complete I fixed that too, so it should now be done.

Pull Request resolved: pytorch#86453
Approved by: https://github.com/malfet
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed Merged open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Multiple failures in test_unary_ufuncs on POWER Nan propagation of min/max function broken for POWER

5 participants