
[DTensor] Track per-output placements for multi-output ops in strategy validation#175893

Closed
pianpwk wants to merge 4 commits into gh/pianpwk/105/base from gh/pianpwk/105/head

Conversation

@pianpwk
Contributor

@pianpwk pianpwk commented Feb 26, 2026

Stack from ghstack (oldest at bottom):

For multi-output ops like aten.min.dim (returns values + indices), the
tool now tracks each output's placement separately instead of using a
single output placement for all outputs. This makes the display explicit:
`S(1) -> (P(min), P(min))` shows both outputs get P(min).

ComboKey changes from (inputs, single_output_str) to (inputs,
output_strs_tuple). PlacementCombination is simplified to a plain type
alias. normalize_combo_key normalizes each output against its own shape.
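
The ComboKey change described above can be sketched as follows. This is a minimal illustration built only from this description; `old_key`, `new_key`, and `format_combo` are hypothetical names, not the actual DTensor tool code:

```python
# Hypothetical sketch of the ComboKey change; names here (old_key, new_key,
# format_combo) are illustrative, not taken from the DTensor source.

# Old key shape: one output placement string shared by all outputs.
old_key = (("S(1)",), "P(min)")

# New key shape: a tuple of per-output placement strings, one per output.
# For aten.min.dim (values + indices), both outputs carry P(min):
new_key = (("S(1)",), ("P(min)", "P(min)"))

def format_combo(key):
    """Render a combo key like `S(1) -> (P(min), P(min))`."""
    inputs, outputs = key
    out = outputs if isinstance(outputs, tuple) else (outputs,)
    return f"{', '.join(inputs)} -> ({', '.join(out)})"

print(format_combo(new_key))  # S(1) -> (P(min), P(min))
```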

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Feb 26, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175893

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (7 Unrelated Failures)

As of commit 4395c75 with merge base fbbcc93:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pianpwk added a commit that referenced this pull request Feb 26, 2026
@pianpwk pianpwk changed the title [DTensor] Track per-output placements for multi-output ops in strategy_validation [DTensor] Track per-output placements for multi-output ops in strategy validation Feb 26, 2026
Contributor

@wconstab wconstab left a comment


add a test?

pianpwk added a commit that referenced this pull request Feb 27, 2026
@pianpwk pianpwk marked this pull request as ready for review February 27, 2026 20:50
@pianpwk
Contributor Author

pianpwk commented Feb 28, 2026

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 28, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.
@pytorch-auto-revert

@pytorchbot revert -m "Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable" -c autorevert

This PR is attributed as having caused a regression in:

Please investigate and fix the issues.
@claude Can you please read this revert comment, follow the links and read the errors, and then give a brief diagnosis of the cause of the error? If you judge the error to be a legitimate reason for a revert, please provide brief guidance on how the author could fix it.

@claude

claude Bot commented Feb 28, 2026

Claude finished @pytorch-auto-revert[bot]'s task in 1m 12s (View job)


Investigating Revert of PR #175893

  • Read the CI failure logs to understand the errors
  • Read the changed files in the PR to understand what was modified
  • Diagnose the root cause
  • Provide guidance on fixing the issue

Diagnosis

The revert is legitimate. The root cause is straightforward: a type mismatch between the production code and the test code.

The PR changed PlacementCombination from a @dataclass (with named fields input_placements and output_placement) to a plain type alias:

PlacementCombination = tuple[tuple[Placement, ...], tuple[Placement, ...]]

However, the tests still construct PlacementCombination using the old dataclass-style keyword arguments and positional arguments:

# Old dataclass style - keyword args (no longer works):
combo = PlacementCombination(
    input_placements=(Shard(0), Replicate()), output_placement=Shard(0)
)

# Old dataclass style - positional args (no longer works):
combo = PlacementCombination(input_plcs, p_out)

Since PlacementCombination is now just tuple, Python raises:

  • TypeError: tuple() takes no keyword arguments — for the keyword-arg calls
  • TypeError: tuple expected at most 1 argument, got 2 — for the 2-positional-arg calls

All 8 failing tests hit the same error type.
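
The mismatch is easy to reproduce outside DTensor. In this standalone sketch the alias is reduced to a bare `tuple`, and `try_construct` is a hypothetical helper, not part of the test suite:

```python
# After the PR, PlacementCombination is a plain type alias, so calling it
# with the old dataclass constructor syntax just calls tuple().
PlacementCombination = tuple

def try_construct(*args, **kwargs):
    """Return the TypeError message raised by the old constructor syntax."""
    try:
        PlacementCombination(*args, **kwargs)
    except TypeError as e:
        return str(e)
    return None

# Old keyword-arg style fails: tuple() takes no keyword arguments.
kw_error = try_construct(input_placements=("S(0)",), output_placement="S(0)")

# Old two-positional-arg style fails: tuple expected at most 1 argument, got 2.
pos_error = try_construct(("S(0)",), "S(0)")
```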

How to Fix

The tests need to construct PlacementCombination as a plain tuple instead of using the old constructor syntax. There are also two additional issues to address:

  1. Keyword-arg constructors (most tests): Change from PlacementCombination(input_placements=(...), output_placement=X) to a plain tuple. Note that the old field was output_placement (singular) but the new type expects a tuple of placements for multi-output support:

    # Before (broken):
    combo = PlacementCombination(
        input_placements=(Shard(0), Replicate()), output_placement=Shard(0)
    )
    
    # After (fixed):
    combo = ((Shard(0), Replicate()), (Shard(0),))
  2. Positional-arg constructors (test_exhaustive_binary_op_rules): Change from PlacementCombination(input_plcs, p_out) to a tuple with the output wrapped:

    # Before (broken):
    combo = PlacementCombination(input_plcs, p_out)
    
    # After (fixed):
    combo = (input_plcs, (p_out,))
  3. The output_placement -> output_placements (tuple) semantic change: The old output_placement was a single Placement. The new type expects tuple[Placement, ...] (a tuple of output placements, one per output tensor). For single-output ops, this means wrapping in a 1-tuple: (placement,).
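
The three fixes above can be folded into one small helper. This is a hedged sketch, not the actual test code; `make_combo` is a hypothetical name:

```python
def make_combo(input_plcs, *output_plcs):
    """Build a plain-tuple PlacementCombination: (inputs, per-output tuple).

    Single-output ops pass one placement, which gets wrapped in a 1-tuple
    to match the new multi-output shape.
    """
    return (tuple(input_plcs), tuple(output_plcs))

# Single-output op: the lone output placement lands in a 1-tuple.
single = make_combo(("S(0)", "R"), "S(0)")

# Multi-output op like aten.min.dim: one placement per output tensor.
multi = make_combo(("S(1)",), "P(min)", "P(min)")
```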


@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Feb 28, 2026
… strategy validation (#175893)"

This reverts commit ea4d199.

Reverted #175893 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](#175893 (comment)))
@pytorchmergebot
Collaborator

@pianpwk your PR has been successfully reverted.

@pytorchmergebot pytorchmergebot added Reverted ci-no-td Do not run TD on this PR labels Feb 28, 2026
@wconstab
Contributor

Are the strategy validation tests not running on PRs that touch the strategy validation files? We should fix this: labeler.yml (or similar) can be set to tag these files with the right label, or else we should move the tests to a more popular runner config.
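
One way this could look is an actions/labeler-style entry; the label name and file globs below are assumptions for illustration, not the actual pytorch labeler config:

```yaml
# Hypothetical labeler.yml entry: tag PRs touching the strategy
# validation files so the relevant ciflow jobs run. The label name
# and paths here are illustrative assumptions.
"ciflow/trunk":
  - changed-files:
      - any-glob-to-any-file:
          - "torch/distributed/tensor/**"
```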

@pianpwk
Contributor Author

pianpwk commented Mar 8, 2026

@pytorchbot merge -i

pytorchmergebot pushed a commit that referenced this pull request Mar 8, 2026
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
@github-actions github-actions Bot deleted the gh/pianpwk/105/head branch April 8, 2026 02:23

Labels

ci-no-td (Do not run TD on this PR), ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), Merged, release notes: distributed (dtensor), Reverted

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants