Fix vectorized calculations on POWER#59382
Conversation
💊 CI failures summary and remediationsAs of commit f34967a (more details on the Dr. CI page):
ci.pytorch.org: 1 failedThis comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions to the (internal) Dr. CI Users group. |
|
Since we don't have any POWER CI, do you mind explaining briefly how you tested the changes here? (Just for posterity) |
There was a problem hiding this comment.
Ouch sorry. I made that patch with 1.8.1 and ported it to master where the vec256 class was renamed but not the paths (which I find confusing) will fix this.
I compiled as usual and then run BTW: I found some missing test coverage: The NaN propagation of min/max and the clamp/clamp_min/clamp_max function is not fully covered. I.e. that min/max return NaN when either is NaN but clamp only returns NaN when the second is. So clamp_min!=max which I missed at first and it made another test fail with something hard to find. See dd62c89 |
f219f92 to
34716f1
Compare
vec_min/max does not propagate NaNs, so take the Eigen implementation using asm to have x86 semantics.
They need to return PI for negative values
A typo caused every second value to be wrong. Also only calculate what is required, i.e. every second value
Same as float64 and fixes failures due to e.g. wrong treatment of inf
The clamp functions should return the first argument if any is nan
|
@ezyang I rebased onto master and recompiled that, then ran the 2 test files. All succeeds which hasn't before but getting 2 new failures on master:
Those refer to acos and atanh which dispatch to the std variants and those return nan/inf for the tested input values (on PPC and x86) as can be checked with: --> I'm not sure if this is changed on x86 already to dispatch to something else which would need to be done too on PPC, maybe @malfet can comment here? Seen him on the related issue #42952 |
|
Seems OK to leave those fixes for later. acos seems related to #52310 |
|
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Summary: This fixes multiple bugs introduced by the VSX optimized code in pytorch#41541 - min/max/clamp now consistently return nan when any value is NaN as on other architectures - The non-complex angle functions return PI for negative values now - The complex angle functions have been corrected and optimized - The float32-log function implementation returned a wrong result when inf was passed (and maybe other inputs), replaced by the sleef function just as for float64 Fixes pytorch#59248 Fixes pytorch#57537 Pull Request resolved: pytorch#59382 Reviewed By: jbschlosser Differential Revision: D28944626 Pulled By: ezyang fbshipit-source-id: 1ae2782b9e34e458a19cec90617037654279e0e0
This fixes the remaining bug introduced by the VSX optimized code in #41541 Followup to #59382 ### Description The code currently returns wrong results on POWER9LE making e.g. the `test_binary_ufuncs` fail. ### Testing Build and ran tests on PPC Pull Request resolved: #82646 Approved by: https://github.com/ezyang
Summary: This fixes the remaining bug introduced by the VSX optimized code in #41541 Followup to #59382 ### Description The code currently returns wrong results on POWER9LE making e.g. the `test_binary_ufuncs` fail. ### Testing Build and ran tests on PPC Pull Request resolved: #82646 Approved by: https://github.com/ezyang Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/39ffad392c49aafcfeba05e2704bb1b666247471 Reviewed By: kit1980 Differential Revision: D38395263 fbshipit-source-id: c4c56af2d8e3b528b6418a4a32c63de77037e5cf
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541 Also remove some whitespace wrongly added in the above PRs. This fixes wrong results for e.g. `sin(1e20)`. Fixes pytorch#85978
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541 Also remove some whitespace wrongly added in the above PRs. This fixes wrong results for e.g. `sin(1e20)`. Fixes pytorch#85978
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in #59382 & #82646 after #41541 This fixes wrong results for e.g. `sin(1e20)`. Fixes #85978 To fix #85978 I only needed to do the sin/cos functions to make the test pass but to not encounter the same issue again and again (see the previous PRs and issues) I checked the whole file for similar functions where a Sleef function could be used and changed those too. In the diff I've noticed the faulty whitespace so to make this complete I fixed that too, so it should now be done. Pull Request resolved: #86453 Approved by: https://github.com/malfet
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541 This fixes wrong results for e.g. `sin(1e20)`. Fixes pytorch#85978 To fix pytorch#85978 I only needed to do the sin/cos functions to make the test pass but to not encounter the same issue again and again (see the previous PRs and issues) I checked the whole file for similar functions where a Sleef function could be used and changed those too. In the diff I've noticed the faulty whitespace so to make this complete I fixed that too, so it should now be done. Pull Request resolved: pytorch#86453 Approved by: https://github.com/malfet
Summary: This fixes multiple bugs introduced by the VSX optimized code in pytorch#41541 - min/max/clamp now consistently return nan when any value is NaN as on other architectures - The non-complex angle functions return PI for negative values now - The complex angle functions have been corrected and optimized - The float32-log function implementation returned a wrong result when inf was passed (and maybe other inputs), replaced by the sleef function just as for float64 Fixes pytorch#59248 Fixes pytorch#57537 Pull Request resolved: pytorch#59382 Reviewed By: jbschlosser Differential Revision: D28944626 Pulled By: ezyang fbshipit-source-id: 1ae2782b9e34e458a19cec90617037654279e0e0
This fixes the remaining bug introduced by the VSX optimized code in pytorch#41541 Followup to pytorch#59382 ### Description The code currently returns wrong results on POWER9LE making e.g. the `test_binary_ufuncs` fail. ### Testing Build and ran tests on PPC Pull Request resolved: pytorch#82646 Approved by: https://github.com/ezyang
Replace the remaining hand-written code in vec256_float_vsx.h by calls to Sleef functions similar to what was done in pytorch#59382 & pytorch#82646 after pytorch#41541 This fixes wrong results for e.g. `sin(1e20)`. Fixes pytorch#85978 To fix pytorch#85978 I only needed to do the sin/cos functions to make the test pass but to not encounter the same issue again and again (see the previous PRs and issues) I checked the whole file for similar functions where a Sleef function could be used and changed those too. In the diff I've noticed the faulty whitespace so to make this complete I fixed that too, so it should now be done. Pull Request resolved: pytorch#86453 Approved by: https://github.com/malfet
This fixes multiple bugs introduced by the VSX optimized code in #41541
Fixes #59248
Fixes #57537