[DTensor] single-dim strategy validation infra#172990

Closed
pianpwk wants to merge 2 commits into main from pianpwk/single_dim_ops_validate

@pianpwk pianpwk commented Jan 21, 2026

For any proposed triple of (op, sample inputs, sharding prop rule, i.e. input -> output placements), you can rule out an invalid sharding prop rule as follows:

  1. Run the op on full_tensor inputs, to get the reference full_tensor output
  2. Get the local_tensor inputs, according to the input placements
  3. Run the local op
  4. Redistribute local outputs to Replicate, according to the output placements
  5. Check full_tensor shapes & numerics against reference
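
The five steps above can be sketched in a single process, without any real distributed setup. This is only an illustrative toy, not the PR's implementation: placements are modelled as plain tuples rather than DTensor placement objects, tensors are nested Python lists, and every helper name (`to_local`, `to_replicate`, `rule_holds`) is hypothetical. Treating a failing local op as "rule ruled out" is also an assumption of the sketch.

```python
# Toy single-process simulation of the five steps on a 1-D "mesh" of n ranks.
# Placements are plain tuples here, not real DTensor placement objects.
R = ("R",)  # Replicate
P = ("P",)  # Partial (sum)


def S(dim):
    return ("S", dim)  # Shard along `dim`


def matmul(a, b):
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def to_local(full, placement, n):
    """Step 2: per-rank local tensors for `full` under `placement`."""
    if placement == R:
        return [full] * n
    if placement[0] != "S":
        raise NotImplementedError("only Replicate/Shard inputs are modelled")
    if placement[1] == 0:  # split rows
        step = len(full) // n
        return [full[r * step:(r + 1) * step] for r in range(n)]
    step = len(full[0]) // n  # split columns
    return [[row[r * step:(r + 1) * step] for row in full] for r in range(n)]


def to_replicate(local_outs, placement):
    """Step 4: rebuild the full output from the per-rank outputs."""
    if placement == R:
        return local_outs[0]
    if placement == P:  # element-wise sum across ranks
        return [[sum(vals) for vals in zip(*rows)] for rows in zip(*local_outs)]
    if placement[1] == 0:  # concatenate row blocks
        return [row for out in local_outs for row in out]
    # concatenate column blocks
    return [sum((out[i] for out in local_outs), []) for i in range(len(local_outs[0]))]


def rule_holds(op, full_inputs, in_placements, out_placement, n=2):
    reference = op(*full_inputs)  # step 1
    locals_ = [to_local(t, p, n) for t, p in zip(full_inputs, in_placements)]  # step 2
    try:
        local_outs = [op(*(x[r] for x in locals_)) for r in range(n)]  # step 3
    except ValueError:
        return False  # assumption: a failing local op also rules the rule out
    full_out = to_replicate(local_outs, out_placement)  # step 4
    return full_out == reference  # step 5: list equality covers shape and values


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]

ok_row = rule_holds(matmul, [a, b], [S(0), R], S(0))       # row-sharded lhs
ok_partial = rule_holds(matmul, [a, b], [S(1), S(0)], P)   # contraction dim sharded
bad_replicate = rule_holds(matmul, [a, b], [S(0), R], R)   # wrong output placement
bad_mismatch = rule_holds(matmul, [a, b], [S(0), S(0)], S(0))  # local op fails
```

Integer inputs keep the comparison exact; a floating-point version would need a tolerance-based check in step 5 instead of `==`.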

This PR runs that validation against the OpInfo entries for aten ops with registered single-dim rules, enumerating all of their strategies and replacing each ShardPlaceholder with a concrete Shard.
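
As a rough picture of the placeholder substitution, the sketch below uses stand-in dataclasses rather than the real DTensor types. The class names, the `concretize` helper, the `[output, lhs, rhs]` ordering, and the two example strategies are all assumptions made for illustration; the actual strategy format in PyTorch may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replicate:  # stand-in, not torch.distributed.tensor.Replicate
    pass


@dataclass(frozen=True)
class Shard:  # stand-in, not torch.distributed.tensor.Shard
    dim: int


@dataclass(frozen=True)
class ShardPlaceholder:  # hypothetical stand-in for the placeholder type
    dim: int


def concretize(strategy):
    """Replace every placeholder with a concrete Shard on the same tensor dim."""
    return [Shard(p.dim) if isinstance(p, ShardPlaceholder) else p for p in strategy]


# Hypothetical single-dim strategies for a 2-D matmul, written as [output, lhs, rhs].
mm_strategies = [
    [ShardPlaceholder(0), ShardPlaceholder(0), Replicate()],
    [ShardPlaceholder(1), Replicate(), ShardPlaceholder(1)],
]

concrete = [concretize(s) for s in mm_strategies]
```

Each concretized strategy can then be fed, one at a time, into a validation loop of the kind described in the numbered steps.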

A possible future extension: by exhaustively enumerating all single-dim strategies over R/S/P (Replicate/Shard/Partial), we could check whether any prop rules are missing. However, this can produce false positives (e.g. Partial with zero tensors), so it likely shouldn't be a hard error, and maybe shouldn't be a "test" at all.
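
One way such a false positive might arise, sketched under assumptions: suppose a Partial (sum) input is materialised as "full value on one rank, zeros on the others". The generally incorrect rule `mul(Partial, Partial) -> Partial` then passes the numeric check, because the zero ranks contribute nothing. The decomposition choices and helper names below are hypothetical and only one reading of the "Partial with zero tensors" remark.

```python
def mul(a, b):
    return [x * y for x, y in zip(a, b)]


def reduce_sum(per_rank):
    """Sum-reduce a list of per-rank vectors element-wise."""
    return [sum(vals) for vals in zip(*per_rank)]


def zero_fill(full, n):
    # Assumed decomposition: rank 0 holds everything, other ranks hold zeros.
    return [list(full)] + [[0] * len(full) for _ in range(n - 1)]


def spread(full, n):
    # Alternative decomposition: ranks 1..n-1 hold ones, rank 0 holds the remainder.
    return [[x - (n - 1) for x in full]] + [[1] * len(full) for _ in range(n - 1)]


def mul_partial_rule_passes(decompose, a, b, n=2):
    """Numerically test the (incorrect) rule mul(Partial, Partial) -> Partial."""
    a_loc, b_loc = decompose(a, n), decompose(b, n)
    out = reduce_sum([mul(a_loc[r], b_loc[r]) for r in range(n)])
    return out == mul(a, b)


a, b = [1, 2, 3], [4, 5, 6]
passes_with_zeros = mul_partial_rule_passes(zero_fill, a, b)   # accidental pass
passes_with_spread = mul_partial_rule_passes(spread, a, b)     # rule is exposed
```

The same rule fails as soon as every rank holds a non-zero piece, which is why an exhaustive "missing rule" scan built on this kind of check would be better treated as a hint than as a hard failure.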

Checked by turning on single-dim rules in mm, pointwise ops, and cat.

pytorch-bot Bot commented Jan 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/172990

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (5 Unrelated Failures)

As of commit f5bc08e with merge base 0f84569 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pianpwk pianpwk changed the title from "single-dim validation infra" to "[DTensor] single-dim strategy validation infra" Jan 22, 2026
@pianpwk pianpwk marked this pull request as ready for review January 22, 2026 00:20
@pianpwk pianpwk requested a review from anshul-si January 23, 2026 01:10

@wconstab wconstab left a comment

Generally SGTM. I also started working on a design for this, including the full enumeration and testing of the existing non-single-dim rules. Let's discuss next week; we could land this first and then extend it.

pianpwk commented Jan 24, 2026

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Jan 24, 2026
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

riccardofelluga pushed a commit to riccardofelluga/pytorch that referenced this pull request Jan 27, 2026
Checked by turning on single-dim rules in:
- mm: https://github.com/pytorch/pytorch/blob/578744826f3011dcb14c0e437e709d7110559367/torch/distributed/tensor/_ops/_matrix_ops.py#L353
- pointwise ops: https://github.com/pytorch/pytorch/blob/578744826f3011dcb14c0e437e709d7110559367/torch/distributed/tensor/_ops/_pointwise_ops.py#L837-L839
- cat: https://github.com/pytorch/pytorch/blob/578744826f3011dcb14c0e437e709d7110559367/torch/distributed/tensor/_ops/_tensor_ops.py#L828
Pull Request resolved: pytorch#172990
Approved by: https://github.com/wconstab
@github-actions github-actions Bot deleted the pianpwk/single_dim_ops_validate branch February 24, 2026 02:22

Labels

- ciflow/inductor
- ciflow/trunk (Trigger trunk jobs on your pull request)
- Merged
- release notes: distributed (dtensor) (release notes category)
