[inductor] Fix reciprocal to use float32 for division_rounding by mlazos · Pull Request #174751 · pytorch/pytorch

mlazos · 2026-02-11T03:22:45Z

Stack from ghstack (oldest at bottom):

Use float32 constant instead of int for reciprocal to ensure proper
floating-point division when emulating eager division rounding.

Test: test_div_precision_rounding in test_cuda_repro.py

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo

[ghstack-poisoned]

pytorch-bot · 2026-02-11T03:22:49Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/174751

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 119e900 with merge base 197c376:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

trunk / linux-jammy-rocm-py3.10 / test (default, 5, 6, linux.rocm.gpu.gfx950.1) (gh) (similar failure)
test/inductor/test_combo_kernels.py::ComboKernelTests::test_combo_kernel_dynamic_shapes_grid_changes

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

inductor / inductor-cpu-test / test (cpu_inductor_torchbench, 2, 2, linux.2xlarge.amx, unstable) (gh) (#174929)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. Test: test_div_precision_rounding in test_cuda_repro.py ghstack-source-id: 87aec4b Pull-Request: #174751

[ghstack-poisoned]

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. Test: test_div_precision_rounding in test_cuda_repro.py ghstack-source-id: dd633c7 Pull-Request: #174751

[ghstack-poisoned]

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. Test: test_div_precision_rounding in test_cuda_repro.py ghstack-source-id: 49967f5 Pull-Request: #174751

[ghstack-poisoned]

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. Test: test_div_precision_rounding in test_cuda_repro.py ghstack-source-id: 0edd804 Pull-Request: #174751

[ghstack-poisoned]

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding.

The OpDecompositions.reciprocal method decomposes reciprocal(x) to 1 / x. When eager_numerics.division_rounding is enabled, truediv checks if both operands are float32 before using div_rn. With an int32 constant, this check fails and regular division is used, causing ~13% of values to differ from eager.

Test: test_reciprocal_precision_rounding in test_cuda_repro.py

ghstack-source-id: a658442
Pull-Request: #174751
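The dtype check described in this commit message can be modeled with a small toy sketch. It is not the Inductor implementation: `Constant`, `choose_division`, `reciprocal_before_fix`, and `reciprocal_after_fix` are hypothetical names invented for illustration, and the strings `"div_rn"` and `"regular_div"` only label which path would be taken. The one thing it assumes from the description above is that the rounded path is selected only when both operands are float32.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constant:
    """Hypothetical stand-in for a constant operand in the lowered graph."""
    value: float
    dtype: str  # e.g. "int32" or "float32"


def choose_division(lhs_dtype: str, rhs_dtype: str, division_rounding: bool = True) -> str:
    """Toy model of the check: take the rounded path only for float32 / float32."""
    if division_rounding and lhs_dtype == "float32" and rhs_dtype == "float32":
        return "div_rn"
    return "regular_div"


def reciprocal_before_fix(x_dtype: str) -> str:
    # Assumed pre-fix behavior: the numerator 1 is an int32 constant,
    # so the float32-only check fails even when x is float32.
    one = Constant(1, "int32")
    return choose_division(one.dtype, x_dtype)


def reciprocal_after_fix(x_dtype: str) -> str:
    # Assumed post-fix behavior: the numerator is a float32 constant,
    # so a float32 x satisfies the check.
    one = Constant(1.0, "float32")
    return choose_division(one.dtype, x_dtype)


if __name__ == "__main__":
    print(reciprocal_before_fix("float32"))
    print(reciprocal_after_fix("float32"))
```

Run as a script, this prints `regular_div` and then `div_rn`. In this model, only the constant's dtype changes between the two cases, which is the whole of the fix as the commit message describes it.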

pytorchmergebot · 2026-02-26T22:09:32Z

Merge started

Your change will be merged while ignoring the following 3 checks: inductor / unit-test / inductor-test / test (inductor, 2, 2, linux.g5.4xlarge.nvidia.gpu), inductor / unit-test / inductor-test / test (inductor, 1, 2, linux.g5.4xlarge.nvidia.gpu), trunk / win-vs2022-cuda12.8-py3 / build

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2026-02-27T00:03:18Z

Merge failed

Reason: 1 jobs have failed, first few of them are: linux-aarch64 / linux-jammy-aarch64-py3.10 / test (openreg, 1, 1, lf.linux.arm64.m8g.4xlarge)

Details for Dev Infra team

Raised by workflow job

mlazos · 2026-02-27T00:45:11Z

@pytorchbot merge -f "unrelated failures"

pytorchmergebot · 2026-02-27T00:47:05Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

#174751)" This reverts commit 1b9046a. Reverted #174751 on behalf of https://github.com/jeanschmidt due to Need to revert in order to revert #175555 - see D94699526 ([comment](#174933 (comment)))

pytorchmergebot · 2026-02-27T23:22:53Z

@mlazos your PR has been reverted as part of the stack under #174933.

mlazos · 2026-02-28T02:08:02Z

@pytorchbot merge

pytorchmergebot · 2026-02-28T02:10:19Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

PR #174751 fixed reciprocal to use float32 for division_rounding, which makes these xfails stale under the inductor_numerics backend. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. The OpDecompositions.reciprocal method decomposes reciprocal(x) to 1 / x. When eager_numerics.division_rounding is enabled, truediv checks if both operands are float32 before using div_rn. With an int32 constant, this check fails and regular division is used, causing ~13% of values to differ from eager. Test: test_reciprocal_precision_rounding in test_cuda_repro.py ghstack-source-id: 67e2eb4 Pull-Request: pytorch/pytorch#174751

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. The OpDecompositions.reciprocal method decomposes reciprocal(x) to 1 / x. When eager_numerics.division_rounding is enabled, truediv checks if both operands are float32 before using div_rn. With an int32 constant, this check fails and regular division is used, causing ~13% of values to differ from eager. Test: test_reciprocal_precision_rounding in test_cuda_repro.py ghstack-source-id: 9bf7138 Pull-Request: pytorch/pytorch#174751

Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. The OpDecompositions.reciprocal method decomposes reciprocal(x) to 1 / x. When eager_numerics.division_rounding is enabled, truediv checks if both operands are float32 before using div_rn. With an int32 constant, this check fails and regular division is used, causing ~13% of values to differ from eager. Test: test_reciprocal_precision_rounding in test_cuda_repro.py ghstack-source-id: 4f8ac82 Pull-Request: pytorch/pytorch#174751

…ch#174751) Use float32 constant instead of int for reciprocal to ensure proper floating-point division when emulating eager division rounding. Test: test_div_precision_rounding in test_cuda_repro.py Pull Request resolved: pytorch#174751 Approved by: https://github.com/v0i0 ghstack dependencies: pytorch#174749, pytorch#174933

pytorch#174751)" This reverts commit 1b9046a. Reverted pytorch#174751 on behalf of https://github.com/jeanschmidt due to Need to revert in order to revert pytorch#175555 - see D94699526 ([comment](pytorch#174933 (comment)))

Update

33784ad

[ghstack-poisoned]

pytorch-bot added the ciflow/inductor and module: inductor labels Feb 11, 2026

This was referenced Feb 11, 2026

[inductor] Add FMA-based lerp lowering for CUDA parity #174749

Closed

[inductor] Add pow_precision config for eager numerics #174750

Closed

Update

6da1830

[ghstack-poisoned]

mlazos added the release notes: inductor label Feb 11, 2026

Update

445c8c8

[ghstack-poisoned]

mlazos mentioned this pull request Feb 11, 2026

[inductor] Add compiled Adam bitwise test #174815

Closed

mlazos requested a review from v0i0 February 11, 2026 21:45

v0i0 approved these changes Feb 12, 2026

ngimel reviewed Feb 12, 2026

Comment thread test/inductor/test_cuda_repro.py Outdated

This was referenced Feb 12, 2026

[inductor] Skip addcdiv decomposition for AdamW bitwise precision #174910

Closed

[inductor] Add bitwise tests for compiled Adam/AdamW #174911

Closed

[inductor] Add FMA-based addcdiv lowering for CUDA parity #174912

Closed

mlazos added 2 commits February 12, 2026 15:01

Update

72e80ef

[ghstack-poisoned]

Update

5a5317d

[ghstack-poisoned]

mlazos mentioned this pull request Feb 13, 2026

[inductor] Use CUDA toolkit libdevice for Triton #174933

Closed

Update

0998a01

[ghstack-poisoned]

Update

dca2e80

[ghstack-poisoned]

mlazos mentioned this pull request Feb 18, 2026

[inductor] Add inline PTX pow for bitwise CUDA parity #175227

Closed

pytorchmergebot removed the merging label Feb 27, 2026

pytorchmergebot added the merging label Feb 27, 2026

pytorchmergebot added the Merged label Feb 27, 2026

pytorchmergebot closed this in 1b9046a Feb 27, 2026

pytorchmergebot removed the merging label Feb 27, 2026

pytorchmergebot added the Reverted and ci-no-td (Do not run TD on this PR) labels Feb 27, 2026

pytorchmergebot reopened this Feb 27, 2026

pytorchmergebot added the merging label Feb 28, 2026

pytorchmergebot closed this in a88bb12 Feb 28, 2026

pytorchmergebot removed the merging label Feb 28, 2026

v0i0 added a commit that referenced this pull request Mar 3, 2026

Remove reciprocal from inductor_numerics xfails

cd88366

PR #174751 fixed reciprocal to use float32 for division_rounding, which makes these xfails stale under the inductor_numerics backend.
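For readers unfamiliar with stale xfails, the general mechanism can be sketched with Python's standard `unittest` module. This is only an analogy: how the inductor_numerics backend records its xfails is not shown on this page, and `run_with_expected_failure` and the numbers in it are hypothetical. The sketch shows that a test marked as an expected failure is reported as an unexpected success once the underlying bug is fixed, which is why the marker has to be removed.

```python
import unittest


def run_with_expected_failure(fixed: bool) -> unittest.TestResult:
    """Run one test marked expectedFailure, before or after a hypothetical fix."""

    class ReciprocalCase(unittest.TestCase):
        @unittest.expectedFailure
        def test_matches_reference(self):
            # Hypothetical values: the "unfixed" result is slightly off.
            computed = 0.25 if fixed else 0.2500001
            self.assertEqual(computed, 1.0 / 4.0)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReciprocalCase)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    before = run_with_expected_failure(fixed=False)
    after = run_with_expected_failure(fixed=True)
    # Before the fix the failure is expected; after it, the marker is stale.
    print(len(before.expectedFailures), len(before.unexpectedSuccesses))
    print(len(after.expectedFailures), len(after.unexpectedSuccesses))
```

With `fixed=False` the run records one expected failure; with `fixed=True` it records one unexpected success, and `wasSuccessful()` returns `False` for that run.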

github-actions deleted the gh/mlazos/97/head branch March 31, 2026 02:23

mlazos wants to merge 29 commits into gh/mlazos/97/base from gh/mlazos/97/head

4 participants
