[inductor] Fix pow precision helper for fp64 inputs #175268
mlazos wants to merge 15 commits into gh/mlazos/105/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175268
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit 406bf9c with merge base 197c376. The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
eellison left a comment:
A couple of easily addressable comments.
```python
def test_pow_precision_fp64(self):
    # Test that pow matches eager bitwise for fp64.
    # libdevice.pow matches CUDA's pow for fp64 (no FTZ issues).
    def fn(base, exp):
        return torch.pow(base, exp)

    base = torch.tensor([0.9, 0.999, 0.5, 0.8], device="cuda", dtype=torch.float64)
    exp = torch.tensor(
        [50.0, 100.0, 10.0, 20.0], device="cuda", dtype=torch.float64
    )
```
Can you please parametrize this across all dtypes? This would also be a good time to unify around @v0i0's test infra.
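For illustration, a minimal sketch of one way to parametrize the test across dtypes using torch's stock `parametrize` helper; the @v0i0 test infra the reviewer mentions is not shown in this thread and may look different:

```python
import torch
from torch.testing._internal.common_utils import parametrize

# Sketch only: the enclosing test class would also need
# instantiate_parametrized_tests applied for this to be collected.
@parametrize("dtype", [torch.float16, torch.bfloat16, torch.float32, torch.float64])
def test_pow_precision(self, dtype):
    def fn(base, exp):
        return torch.pow(base, exp)

    base = torch.tensor([0.9, 0.999, 0.5, 0.8], device="cuda", dtype=dtype)
    exp = torch.tensor([50.0, 100.0, 10.0, 20.0], device="cuda", dtype=dtype)
    self.assertEqual(torch.compile(fn)(base, exp), fn(base, exp))
```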
```python
@maybe_upcast_float32()
def pow(a, b):
    # Check dtype before potential upcast - powf_cuda only supports fp32
    a_dtype = getattr(a, "dtype", None)
```
nit: use `isinstance(var, CSEVariable) and var.dtype in (torch.float16, torch.bfloat16)` to be more explicit.
```python
@staticmethod
@maybe_upcast_float32()
def _pow_impl(a, b):
    if config.eager_numerics.pow_precision:
```
Why don't we just check that the dtypes are float32 here? Could they also be integers?
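A hypothetical sketch of the guard this comment suggests, reusing names from the hunk above; the f-string templates are illustrative and not the PR's actual codegen:

```python
import torch

def _pow_impl(a, b):
    # Take the inline-PTX fast path only when both operands are fp32;
    # fp64 and integer inputs fall through to libdevice.pow, which
    # already matches eager.
    if (
        config.eager_numerics.pow_precision  # config flag assumed from the hunk above
        and getattr(a, "dtype", None) == torch.float32
        and getattr(b, "dtype", None) == torch.float32
    ):
        return f"powf_cuda({a}, {b})"  # fp32-only inline PTX helper
    return f"libdevice.pow({a}, {b})"
```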
Closing - no longer needed.
Stack from ghstack (oldest at bottom):
The powf_cuda inline PTX helper only supports fp32 inputs. For fp64,
fall back to libdevice.pow which already matches eager exactly.
Also adds test_pow_precision_fp64 to verify fp64 pow matches eager.
Co-authored-by: Claude noreply@anthropic.com
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo
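For reference, a minimal repro sketch of the behavior this PR targets, assuming a CUDA device; with the fix, compiled fp64 pow should be bitwise identical to eager:

```python
import torch

base = torch.tensor([0.9, 0.999, 0.5, 0.8], device="cuda", dtype=torch.float64)
exp = torch.tensor([50.0, 100.0, 10.0, 20.0], device="cuda", dtype=torch.float64)

eager = torch.pow(base, exp)
compiled = torch.compile(lambda b, e: torch.pow(b, e))(base, exp)

# With fp64 routed to libdevice.pow, the compiled result should match
# eager bit for bit.
assert torch.equal(eager, compiled)
```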