[DTensor] Strategy Validation (4/4): Multi-output ops by wconstab · Pull Request #174995 · pytorch/pytorch

wconstab · 2026-02-13T21:32:09Z

Stack from ghstack (oldest at bottom):

Support multi-output ops like split, unbind, topk, sort.

Tested for these ops and things look reasonable (not an exhaustive test
of all multi-output ops):

- unbind: 0 true positives because its strategy unshards the unbind dimension, so all non-trivial rules involve Replicate inputs → skipped. This is correct behavior (the validator only tests non-fully-replicated combos).
- topk: 14 true positives, 0 false positives
- sort: 102 true positives, 0 false positives
- split_with_sizes: 24 true positives, 0 false positives
- chunk: 18 true positives, 0 false positives
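
To make the unbind result concrete, here is a minimal sketch of the "skip fully replicated combos" filter described above. It is not the validator's code: the string placements and the helper names `is_fully_replicated` and `combos_to_test` are hypothetical stand-ins chosen for illustration.

```python
from itertools import product

# Hypothetical string stand-ins for DTensor placements: "R" = Replicate,
# "S(0)" / "S(1)" = Shard on dim 0 / dim 1. Not the validator's real types.
PLACEMENTS = ["R", "S(0)", "S(1)"]


def is_fully_replicated(input_placements):
    """True when every input is Replicate, i.e. nothing is sharded."""
    return all(p == "R" for p in input_placements)


def combos_to_test(num_inputs, allowed=PLACEMENTS):
    """Enumerate input placement combos, skipping the fully replicated one."""
    return [
        combo
        for combo in product(allowed, repeat=num_inputs)
        if not is_fully_replicated(combo)
    ]


# If the only placement left for an input is Replicate, nothing remains to test.
replicate_only = combos_to_test(num_inputs=1, allowed=["R"])
# With two inputs and three placements, only ("R", "R") is dropped.
two_inputs = combos_to_test(num_inputs=2)
```

Under these assumptions, an op whose rules only ever use Replicate inputs produces an empty test list, which is consistent with a count of 0 true positives and 0 false positives.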

No unexpected issues with any of the multi-output operators. The implementation handles all of them correctly — single-output and
multi-output ops with varying tuple sizes (unbind's dynamic N outputs, topk/sort's 2-element tuples, split's variable chunks).
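
One way to handle single-output and multi-output ops uniformly is to normalize every result to a tuple and compare position by position. The sketch below assumes that approach; `as_output_tuple` and `compare_outputs` are hypothetical names, and plain floats stand in for tensors so the example has no dependencies.

```python
def as_output_tuple(result):
    """Wrap a single output in a 1-tuple; pass tuples and lists through as tuples."""
    if isinstance(result, (tuple, list)):
        return tuple(result)
    return (result,)


def compare_outputs(actual, expected, tol=1e-5):
    """Compare every output position and return (ok, message)."""
    actual, expected = as_output_tuple(actual), as_output_tuple(expected)
    if len(actual) != len(expected):
        return False, f"expected {len(expected)} outputs, got {len(actual)}"
    for i, (a, e) in enumerate(zip(actual, expected)):
        # Floats stand in for tensors; a real check would compare tensors.
        if abs(a - e) > tol:
            return False, f"output[{i}] mismatch"
    return True, ""


single = compare_outputs(1.0, 1.0)                      # single-output op
pair = compare_outputs((1.0, 2.0), (1.0, 2.5))          # e.g. a 2-element tuple
arity = compare_outputs(1.0, (1.0, 2.0))                # wrong number of outputs
```

The same loop covers a 2-element result (as with topk or sort) and a variable number of outputs (as with unbind or split), because the tuple length is read from the result rather than fixed in advance.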

pytorch-bot · 2026-02-13T21:32:13Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/174995

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 6e797cf with merge base 003e05b:

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

inductor / inductor-test / test (inductor_torchbench, 2, 2, linux.g5.4xlarge.nvidia.gpu, unstable) (gh)
pytorch_CycleGAN_and_pix2pix
inductor / inductor-test-cuda13 / test (inductor_torchbench, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (#174930)
pytorch_CycleGAN_and_pix2pix

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zpcore · 2026-02-17T18:45:38Z

+            if isinstance(combination.output_placement, Replicate):
+                local_values = [local_out._local_tensors[r] for r in range(world_size)]
+                all_same = all(
+                    torch.allclose(local_values[0], lv, atol=1e-5, rtol=1e-5)
+                    for lv in local_values[1:]
+                )
+                if not all_same:
+                    return (
+                        False,
+                        f"Replicate output[{i}] but local values differ across ranks",
+                    )


Do we need this part? I think we can use the same logic below regardless of combination.output_placement.

This looks like an additional check that forces local values to be the same across ranks before redistribute, rather than just checking that the rank 0 value is correct after redistributing to full_tensor. It may well be a useful check, but I don't see how it relates specifically to the multi-output support. I'm asking Claude to explain it.

Oh, it was an existing check, just moved inside the for loop over ground truths. I think this is OK.

I think if we remove this part, everything should still work.
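
The difference between the two checks discussed in this thread can be sketched as follows. This assumes, as the reply above suggests, that the later check only compares the rank 0 value against the ground truth. Lists of floats stand in for per-rank local tensors, `allclose` reimplements the tolerance rule documented for torch.allclose (|a - b| <= atol + rtol * |b|), and `replicas_agree` and `rank0_matches` are hypothetical helper names.

```python
def allclose(a, b, atol=1e-5, rtol=1e-5):
    """Elementwise |a - b| <= atol + rtol * |b| over two equal-length lists."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


def replicas_agree(local_values):
    """Cross-rank check: every rank's local value must match rank 0's."""
    first = local_values[0]
    return all(allclose(first, lv) for lv in local_values[1:])


def rank0_matches(local_values, ground_truth):
    """Rank-0-only check: compare just the first rank against the reference."""
    return allclose(local_values[0], ground_truth)


ground_truth = [1.0, 2.0, 3.0]
# Hypothetical per-rank values for an output claimed to be Replicate:
# rank 0 is correct, rank 1 differs in its last element.
local_values = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.5]]

rank0_ok = rank0_matches(local_values, ground_truth)
cross_rank_ok = replicas_agree(local_values)
```

In this constructed case the rank-0-only check passes while the cross-rank check fails. Whether such a case can occur in the validator, and therefore whether the extra check is needed, is the open question in the thread.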

zpcore · 2026-02-17T18:49:48Z

-        output_plc = spec.output_spec.placements[0]
+        if isinstance(spec.output_specs, tuple):
+            first_output_spec = spec.output_specs[0]
+            if first_output_spec is None:


In which case is first_output_spec None? I thought you already assume that if the output is a tuple, it should be tuple[Tensor].

Yeah, I am going to harden _is_tensor_output so it asserts if there is a mixture of tensor and non-tensor types in a tuple. This will make this check unnecessary for now, but if we later want to support an op that may return None (e.g. SDPA), we will need to change this logic.
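
A hardened check along the lines described in the reply might look like the sketch below. This is not the actual _is_tensor_output implementation: `StubTensor` is a hypothetical stand-in for torch.Tensor, and the exact return values and error message are assumptions.

```python
class StubTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch has no dependencies."""


def is_tensor_output(result):
    """Return True for a tensor or a non-empty tuple/list made only of tensors.

    Raises AssertionError when a tuple/list mixes tensor and non-tensor
    elements, so callers never see a partially-tensor output.
    """
    if isinstance(result, StubTensor):
        return True
    if isinstance(result, (tuple, list)) and result:
        flags = [isinstance(r, StubTensor) for r in result]
        if any(flags) and not all(flags):
            raise AssertionError("mixed tensor/non-tensor outputs are not supported")
        return all(flags)
    return False


all_tensors = is_tensor_output((StubTensor(), StubTensor()))
no_tensors = is_tensor_output((1, 2))

try:
    is_tensor_output((StubTensor(), None))  # e.g. an op returning an optional output
    mixed_rejected = False
except AssertionError:
    mixed_rejected = True
```

With a guard like this in place, a per-output None check becomes redundant, since any tuple that reaches the later code contains only tensors. Supporting ops with optional outputs would mean relaxing the guard and restoring that check.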

wconstab · 2026-02-18T00:16:54Z

@pytorchbot merge

pytorchmergebot · 2026-02-18T00:19:13Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Pull Request resolved: #174995
Approved by: https://github.com/pianpwk, https://github.com/zpcore
ghstack dependencies: #174799, #174800

This was referenced Feb 13, 2026

[DTensor] Strategy Validation (2/3): partial input creation and validation engine #174799

Closed

[DTensor] Strategy Validation (3/3): strategy querying, orchestrator, and CLI #174800

Closed

pytorch-bot Bot added the ciflow/inductor and release notes: distributed (dtensor) labels Feb 13, 2026

wconstab requested review from pianpwk and zpcore February 17, 2026 17:57

pianpwk approved these changes Feb 17, 2026

zpcore reviewed Feb 17, 2026

zpcore approved these changes Feb 17, 2026

pytorch-bot Bot added the ciflow/trunk label Feb 18, 2026

pytorchmergebot added the merging label Feb 18, 2026

pytorchmergebot added the Merged label Feb 18, 2026

pytorchmergebot closed this in fb9623f Feb 18, 2026

pytorchmergebot removed the merging label Feb 18, 2026

github-actions Bot deleted the gh/wconstab/533/head branch March 20, 2026 02:23

wconstab wants to merge 5 commits into gh/wconstab/533/base from gh/wconstab/533/head
