
[DTensor] Track per-output placements for multi-output ops in strategy validation#175893

Closed
pianpwk wants to merge 4 commits into gh/pianpwk/105/base from gh/pianpwk/105/head

Conversation

@pianpwk
Contributor

@pianpwk pianpwk commented Feb 26, 2026

Stack from ghstack (oldest at bottom):

For multi-output ops like aten.min.dim (returns values + indices), the
tool now tracks each output's placement separately instead of using a
single output placement for all outputs. This makes the display explicit:
`S(1) -> (P(min), P(min))` shows both outputs get P(min).

ComboKey changes from (inputs, single_output_str) to (inputs,
output_strs_tuple). PlacementCombination is simplified to a plain type
alias. normalize_combo_key normalizes each output against its own shape.
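
The ComboKey change described above can be sketched as follows. This is a minimal illustration built only from this description; `old_key`, `new_key`, and `format_combo` are hypothetical names, not the actual DTensor tool code:

```python
# Hypothetical sketch of the ComboKey change; names here (old_key, new_key,
# format_combo) are illustrative, not taken from the DTensor source.

# Old key shape: one output placement string shared by all outputs.
old_key = (("S(1)",), "P(min)")

# New key shape: a tuple of per-output placement strings, one per output.
# For aten.min.dim (values + indices), both outputs carry P(min):
new_key = (("S(1)",), ("P(min)", "P(min)"))

def format_combo(key):
    """Render a combo key like `S(1) -> (P(min), P(min))`."""
    inputs, outputs = key
    out = outputs if isinstance(outputs, tuple) else (outputs,)
    return f"{', '.join(inputs)} -> ({', '.join(out)})"

print(format_combo(new_key))  # S(1) -> (P(min), P(min))
```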

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Feb 26, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175893

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (7 Unrelated Failures)

As of commit 4395c75 with merge base fbbcc93:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pianpwk added a commit that referenced this pull request Feb 26, 2026
@pianpwk pianpwk changed the title [DTensor] Track per-output placements for multi-output ops in strategy_validation [DTensor] Track per-output placements for multi-output ops in strategy validation Feb 26, 2026
Contributor

@wconstab wconstab left a comment


add a test?

pianpwk added a commit that referenced this pull request Feb 27, 2026
@pianpwk pianpwk marked this pull request as ready for review February 27, 2026 20:50
@pianpwk
Contributor Author

pianpwk commented Feb 28, 2026

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 28, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.
@pytorch-auto-revert

@pytorchbot revert -m "Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable" -c autorevert

This PR is attributed as having caused a regression in:

Please investigate and fix the issues.
@claude Can you please read this revert comment, follow the links and read the errors, and then give a brief diagnosis of the cause of the error? If you judge the error to be a legitimate reason for a revert, please provide brief guidance on how the author could fix it.

@claude

claude Bot commented Feb 28, 2026

Claude finished @pytorch-auto-revert[bot]'s task in 1m 12s (View job)


Investigating Revert of PR #175893

  • Read the CI failure logs to understand the errors
  • Read the changed files in the PR to understand what was modified
  • Diagnose the root cause
  • Provide guidance on fixing the issue

Diagnosis

The revert is legitimate. The root cause is straightforward: a type mismatch between the production code and the test code.

The PR changed PlacementCombination from a @dataclass (with named fields input_placements and output_placement) to a plain type alias:

PlacementCombination = tuple[tuple[Placement, ...], tuple[Placement, ...]]

However, the tests still construct PlacementCombination using the old dataclass-style keyword arguments and positional arguments:

# Old dataclass style - keyword args (no longer works):
combo = PlacementCombination(
    input_placements=(Shard(0), Replicate()), output_placement=Shard(0)
)

# Old dataclass style - positional args (no longer works):
combo = PlacementCombination(input_plcs, p_out)

Since PlacementCombination is now just tuple, Python raises:

  • TypeError: tuple() takes no keyword arguments — for the keyword-arg calls
  • TypeError: tuple expected at most 1 argument, got 2 — for the 2-positional-arg calls

All 8 failing tests hit the same error type.
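
The mismatch is easy to reproduce outside DTensor. In this standalone sketch the alias is reduced to a bare `tuple`, and `try_construct` is a hypothetical helper, not part of the test suite:

```python
# After the PR, PlacementCombination is a plain type alias, so calling it
# with the old dataclass constructor syntax just calls tuple().
PlacementCombination = tuple

def try_construct(*args, **kwargs):
    """Return the TypeError message raised by the old constructor syntax."""
    try:
        PlacementCombination(*args, **kwargs)
    except TypeError as e:
        return str(e)
    return None

# Old keyword-arg style fails: tuple() takes no keyword arguments.
kw_error = try_construct(input_placements=("S(0)",), output_placement="S(0)")

# Old two-positional-arg style fails: tuple expected at most 1 argument, got 2.
pos_error = try_construct(("S(0)",), "S(0)")
```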

How to Fix

The tests need to construct PlacementCombination as a plain tuple instead of using the old constructor syntax. There are also two additional issues to address:

  1. Keyword-arg constructors (most tests): Change from PlacementCombination(input_placements=(...), output_placement=X) to a plain tuple. Note that the old field was output_placement (singular) but the new type expects a tuple of placements for multi-output support:

    # Before (broken):
    combo = PlacementCombination(
        input_placements=(Shard(0), Replicate()), output_placement=Shard(0)
    )
    
    # After (fixed):
    combo = ((Shard(0), Replicate()), (Shard(0),))
  2. Positional-arg constructors (test_exhaustive_binary_op_rules): Change from PlacementCombination(input_plcs, p_out) to a tuple with the output wrapped:

    # Before (broken):
    combo = PlacementCombination(input_plcs, p_out)
    
    # After (fixed):
    combo = (input_plcs, (p_out,))
  3. The output_placement -> output_placements (tuple) semantic change: The old output_placement was a single Placement. The new type expects tuple[Placement, ...] (a tuple of output placements, one per output tensor). For single-output ops, this means wrapping in a 1-tuple: (placement,).
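
The three fixes above can be folded into one small helper. This is a hedged sketch, not the actual test code; `make_combo` is a hypothetical name:

```python
def make_combo(input_plcs, *output_plcs):
    """Build a plain-tuple PlacementCombination: (inputs, per-output tuple).

    Single-output ops pass one placement, which gets wrapped in a 1-tuple
    to match the new multi-output shape.
    """
    return (tuple(input_plcs), tuple(output_plcs))

# Single-output op: the lone output placement lands in a 1-tuple.
single = make_combo(("S(0)", "R"), "S(0)")

# Multi-output op like aten.min.dim: one placement per output tensor.
multi = make_combo(("S(1)",), "P(min)", "P(min)")
```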


@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Feb 28, 2026
… strategy validation (#175893)"

This reverts commit ea4d199.

Reverted #175893 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](#175893 (comment)))
@pytorchmergebot
Collaborator

@pianpwk your PR has been successfully reverted.

@pytorchmergebot pytorchmergebot added Reverted ci-no-td Do not run TD on this PR labels Feb 28, 2026
@wconstab
Contributor

Are the strategy validation tests not running on PRs that touch the strategy validation files? We should fix this: labeler.yml (or similar) can be set to tag these files with the right label, or else we should move the tests to a more popular runner config.
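
One way this could look is an actions/labeler-style entry; the label name and file globs below are assumptions for illustration, not the actual pytorch labeler config:

```yaml
# Hypothetical labeler.yml entry: tag PRs touching the strategy
# validation files so the relevant ciflow jobs run. The label name
# and paths here are illustrative assumptions.
"ciflow/trunk":
  - changed-files:
      - any-glob-to-any-file:
          - "torch/distributed/tensor/**"
```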

@pianpwk
Contributor Author

pianpwk commented Mar 8, 2026

@pytorchbot merge -i

pytorchmergebot pushed a commit that referenced this pull request Mar 8, 2026
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
@github-actions github-actions Bot deleted the gh/pianpwk/105/head branch April 8, 2026 02:23

Labels

ci-no-td (Do not run TD on this PR), ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), Merged, release notes: distributed (dtensor), Reverted

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants