
[inductor] Fix pow precision helper for fp64 inputs #175268

Closed

mlazos wants to merge 15 commits into gh/mlazos/105/base from gh/mlazos/105/head

Conversation

[ghstack-poisoned]

pytorch-bot Bot commented Feb 18, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175268

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 406bf9c with merge base 197c376:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.


pytorch-bot Bot commented Feb 18, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

mlazos added a commit that referenced this pull request Feb 18, 2026
The powf_cuda inline PTX helper only supports fp32 inputs. For fp64,
fall back to libdevice.pow which already matches eager exactly.

Also adds test_pow_precision_fp64 to verify fp64 pow matches eager.

Co-authored-by: Claude <noreply@anthropic.com>
ghstack-source-id: 8334fa8
Pull-Request: #175268
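
The commit message above describes the intended dispatch. A minimal sketch of that behavior follows; the pow_codegen entry point is illustrative only (powf_cuda and libdevice.pow are the names used in the message, the rest is not the actual inductor code):

# Illustrative sketch, not the real inductor helper: keep the inline-PTX
# powf path for fp32 and route fp64 to libdevice.pow.
import torch

def pow_codegen(a: str, b: str, dtype: torch.dtype) -> str:
    if dtype == torch.float64:
        # libdevice.pow already matches eager bitwise for fp64 (no FTZ issues).
        return f"libdevice.pow({a}, {b})"
    # The inline-PTX powf helper only supports fp32 inputs.
    return f"powf_cuda({a}, {b})"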
@mlazos mlazos requested a review from eellison February 19, 2026 02:03

@mlazos mlazos requested a review from v0i0 February 19, 2026 19:54
mlazos added a commit that referenced this pull request Feb 21, 2026
The powf_cuda inline PTX helper only supports fp32 inputs. For fp64,
fall back to libdevice.pow which already matches eager exactly.

Also adds test_pow_precision_fp64 to verify fp64 pow matches eager.

Co-authored-by: Claude <noreply@anthropic.com>
ghstack-source-id: c16a2a3
Pull-Request: #175268

@eellison eellison (Contributor) left a comment

A couple of easily addressable comments.

Comment on lines +2736 to +2745
def test_pow_precision_fp64(self):
    # Test that pow matches eager bitwise for fp64.
    # libdevice.pow matches CUDA's pow for fp64 (no FTZ issues).
    def fn(base, exp):
        return torch.pow(base, exp)

    base = torch.tensor([0.9, 0.999, 0.5, 0.8], device="cuda", dtype=torch.float64)
    exp = torch.tensor(
        [50.0, 100.0, 10.0, 20.0], device="cuda", dtype=torch.float64
    )

Can you please parametrize this across all dtypes? This would be another good time to unify around @v0i0's test infra.
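
A rough sketch of what dtype parametrization could look like using the stock parametrize decorator from torch.testing._internal.common_utils; this is illustrative only and may not match the @v0i0 test infra the reviewer has in mind:

# Sketch only, not the PR's actual test.
import torch
from torch.testing._internal.common_utils import parametrize

@parametrize("dtype", [torch.float16, torch.bfloat16, torch.float32, torch.float64])
def test_pow_precision(self, dtype):
    def fn(base, exp):
        return torch.pow(base, exp)

    base = torch.tensor([0.9, 0.999, 0.5, 0.8], device="cuda", dtype=dtype)
    exp = torch.tensor([50.0, 100.0, 10.0, 20.0], device="cuda", dtype=dtype)
    compiled = torch.compile(fn)
    # Bitwise equality is only expected where the lowering matches eager
    # (e.g. fp64 via libdevice.pow); looser tolerances may be needed otherwise.
    torch.testing.assert_close(compiled(base, exp), fn(base, exp), rtol=0, atol=0)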

@maybe_upcast_float32()
def pow(a, b):
    # Check dtype before potential upcast - powf_cuda only supports fp32
    a_dtype = getattr(a, "dtype", None)

nit: use isinstance(var, CSEVariable) and var.dtype in (torch.float16, torch.bfloat16) to be more explicit
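
A hedged sketch of the explicit form the reviewer suggests; CSEVariable comes from inductor's codegen, but the surrounding helper name is illustrative:

# Sketch only: the helper is hypothetical, the check mirrors the suggestion.
import torch
from torch._inductor.codegen.common import CSEVariable

def needs_fp32_upcast(var) -> bool:
    # Explicit: only CSEVariables carrying reduced-precision floats are upcast.
    return isinstance(var, CSEVariable) and var.dtype in (torch.float16, torch.bfloat16)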

@staticmethod
@maybe_upcast_float32()
def _pow_impl(a, b):
    if config.eager_numerics.pow_precision:

Why don't we just check that the dtypes are float32 here? Could they also be integers?
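
One way that guard could be made explicit, as a rough sketch (the helper name is illustrative, not code from the PR): gate the precise-pow path on both dtypes actually being float32, so integer or fp64 inputs never reach the fp32-only powf helper.

# Illustrative only.
import torch

def _use_precise_powf(a, b) -> bool:
    a_dtype = getattr(a, "dtype", None)
    b_dtype = getattr(b, "dtype", None)
    return a_dtype == torch.float32 and b_dtype == torch.float32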


mlazos commented Feb 26, 2026

closing - no longer needed

@mlazos mlazos closed this Feb 26, 2026
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
The powf_cuda inline PTX helper only supports fp32 inputs. For fp64,
fall back to libdevice.pow which already matches eager exactly.

Also adds test_pow_precision_fp64 to verify fp64 pow matches eager.

Co-authored-by: Claude <noreply@anthropic.com>
ghstack-source-id: 7af7efc
Pull-Request: pytorch/pytorch#175268
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
The powf_cuda inline PTX helper only supports fp32 inputs. For fp64,
fall back to libdevice.pow which already matches eager exactly.

Also adds test_pow_precision_fp64 to verify fp64 pow matches eager.

Co-authored-by: Claude <noreply@anthropic.com>
ghstack-source-id: 98c9b3d
Pull-Request: pytorch/pytorch#175268
@github-actions github-actions Bot deleted the gh/mlazos/105/head branch March 29, 2026 02:23