[inductor] fix allocation with deterministic guard #174718
kshitij12345 wants to merge 21 commits into pytorch:main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/174718
Note: Links to docs will display an error until the docs builds have been completed.
❌ 8 New Failures, 4 Unrelated Failures as of commit 0437834 with merge base ee154ef.
NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures.
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/_inductor/codegen/wrapper.py (outdated)
@@ -3240,6 +3240,23 @@ def make_allocation(
        f"device='{device.type}', "
        f"name='{name}')"
    )
    # With torch.utils.deterministic.fill_uninitialized_memory, we fill the buffer with NaN or MAX_INT
    elif (
Could you add a condition `elif device.type in ("cpu", "cuda", "xpu", "mtia"):` here instead, so we fall back to the default `empty_strided`? Presumably that would get filled with NaN.
This works, thanks!

@kshitij12345 , I think the changes look good.
elif device.type == "cpu" and is_pinned and not is_deterministic:
    out = (
        f"{name} = empty_strided_cpu_pinned("
        f"{codegen_allocation_shape_tuple}, "
        f"{codegen_stride_tuple}, "
        f"{dtype})"
we shouldn't be skipping pinned.
We only skip the pin_memory fast path if deterministic mode is active (otherwise we still take the fast path). Without this, the following snippet fails.

import torch
from torch.testing._internal.common_utils import DeterministicGuard

def fn():
    return torch.empty(
        4, 4, device="cpu", dtype=torch.float32, pin_memory=True
    )

cfunc = torch.compile(fn, fullgraph=True)

with DeterministicGuard(True, fill_uninitialized_memory=True):
    eager = fn()
    compiled = cfunc()
    torch.testing.assert_close(eager, compiled, equal_nan=True)
@pytorchbot rebase -b viable/strict

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

Successfully rebased and force-pushed from 6c679f2 to 26a6731.
Fixes pytorch#174386

Pull Request resolved: pytorch#174718
Approved by: https://github.com/eellison, https://github.com/mlazos
@pytorchbot revert -c "no signal"

❌ 🤖 pytorchbot command failed: Try

@pytorchbot revert -c "nosignal"

❌ 🤖 pytorchbot command failed: Try

@pytorchbot revert -m "inadvertently causes fill_ on inductor allocations, not just user empty_strided" -c nosignal

@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit 6d12a21. Reverted #174718 on behalf of https://github.com/eellison due to inadvertently causes fill_ on inductor allocations, not just user empty_strided ([comment](#174718 (comment)))

@kshitij12345 your PR has been successfully reverted.

Sorry for the revert. This is inadvertently extending empty NaN filling to inductor internal allocations, not just user `empty_strided`.

Thanks for the repro, will have a look.
EDIT: Not sure about the correctness of approach 1, will check.

Claude suggested 2 approaches. I have used the first as it is more targeted (and similar to the current fix). The second approach seems more correct, but might require changes to the IR, which could have a larger blast radius. Wanted to know your thoughts. Thank you!

# Skipping deterministic fill for op output buffers
## Problem
When `torch.use_deterministic_algorithms(True)` is enabled, inductor fills every
allocated buffer with NaN/MAX_INT — including output buffers of operations like
`a + b` that are immediately overwritten by their kernel. This adds an unnecessary
memset kernel per compiled op.
## Approach 1: Tag at codegen time via scheduler node type
The `AllocateLine` in `wrapper.py` already has access to
`V.graph.scheduler.current_node`. We check whether the node is a
`NopKernelSchedulerNode` (which corresponds to `torch.empty` and friends) and
store `is_uninitialized` on the `AllocateLine`. This flag is threaded through
`make_buffer_allocation` → `make_allocation` to gate the deterministic fill.
The change is entirely in codegen — four files touched (`wrapper.py`,
`cpp_wrapper_cpu.py`, `cpp_wrapper_cpu_array_ref.py`, and the test file).
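A toy sketch of the gating described above (the class and helper names mimic the description; this is a hypothetical stand-in, not inductor's actual code):

```python
# Toy model of Approach 1: at codegen time, AllocateLine checks whether the
# current scheduler node is a nop kernel (torch.empty and friends). Only
# those buffers stay uninitialized and need the deterministic fill.

class NopKernelSchedulerNode:   # stand-in for inductor's class
    pass

class ComputeSchedulerNode:     # e.g. the node produced for `a + b`
    pass

def make_allocation(name, is_uninitialized, deterministic_fill):
    line = f"{name} = empty_strided(...)"
    # Op output buffers are fully overwritten by their kernel, so the
    # fill is skipped for them even in deterministic mode.
    if deterministic_fill and is_uninitialized:
        line += f"; {name}.fill_(float('nan'))"
    return line

def allocate_line_codegen(current_node, name, deterministic_fill=True):
    is_uninitialized = isinstance(current_node, NopKernelSchedulerNode)
    return make_allocation(name, is_uninitialized, deterministic_fill)

print(allocate_line_codegen(NopKernelSchedulerNode(), "buf0"))  # fill emitted
print(allocate_line_codegen(ComputeSchedulerNode(), "buf1"))    # no fill
```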
## Approach 2: Add `is_uninitialized` to IR `ComputedBuffer`
An alternative is to set `is_uninitialized` on the IR node itself (e.g.,
`ComputedBuffer` or `Buffer`). The flag would be set during lowering —
`torch.empty` would produce a buffer with `is_uninitialized=True`, while ops
like `add` would default to `False`. The flag then propagates through scheduling
and into codegen where `AllocateLine` reads it from the buffer node.
This pushes the semantic distinction earlier in the pipeline (lowering/IR rather
than codegen) and makes the property available to any pass that inspects the IR,
not just the allocation codegen path.
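Approach 2 could look roughly like this (a hypothetical toy IR, not inductor's real `Buffer`/`ComputedBuffer` classes):

```python
from dataclasses import dataclass

# Toy IR: the flag lives on the buffer node itself, set during lowering,
# so scheduling, codegen, and any other pass can read it later.
@dataclass
class Buffer:
    name: str
    is_uninitialized: bool = False

def lower_empty(name):
    # torch.empty produces a buffer whose uninitialized contents are
    # user-visible, so it must honor the deterministic fill.
    return Buffer(name, is_uninitialized=True)

def lower_add(name):
    # An op output is fully overwritten by its kernel; defaults to False.
    return Buffer(name)

def codegen_alloc(buf, deterministic_fill=True):
    line = f"{buf.name} = empty_strided(...)"
    if deterministic_fill and buf.is_uninitialized:
        line += f"; {buf.name}.fill_(float('nan'))"
    return line
```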
…regression

The previous default of True caused make_allocation callers that don't go through AllocateLine (e.g. memory_planning pool allocations) to incorrectly trigger the cpp_wrapper deterministic error. Only AllocateLine.codegen should opt in with True for NopKernelSchedulerNode buffers.
@pytorchbot rebase -b viable/strict

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

Rebase failed due to Command Raised by https://github.com/pytorch/pytorch/actions/runs/23200777095
Fixes #174386
Co-authored with claude
cc @mruberry @kurtamohler @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo