[DTensor] Make RedistributionPlanner handle all partials #172479
wconstab wants to merge 10 commits into gh/wconstab/501/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/172479
Note: Links to docs will display an error until the doc builds have completed. ✅ You can merge normally! (2 unrelated failures.) As of commit 6339cf2 with merge base 7754b55. FLAKY: the following jobs failed but were likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Previously, the planner hardcoded psum and ignored other partials. This meant that redistributing to pavg or pmax would fail.
Changes Made
1. torch/distributed/tensor/_redistribute.py
Added partial_reduce_ops_in_target field (line 313):
- Added a new instance variable partial_reduce_ops_in_target: set[str] = set() to track which Partial reduce ops are present in the src/dst placements.
Modified reduce op collection (lines 749-754):
- Added code to collect Partial reduce ops from both src and dst placements when planning the redistribution. This ensures only relevant reduce ops are considered.
Updated R->P transition generation (lines 536-552):
- Changed the hardcoded ("sum", "avg") to use self.partial_reduce_ops_in_target, which dynamically considers only the reduce ops present in the src/dst placements.
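Taken together, the three changes above amount to roughly the following sketch. The stand-in classes and the `Planner`/`r_to_p_candidates` names are illustrative, not the actual planner internals; `partial_reduce_ops_in_target` and `Partial(reduce_op)` mirror the PR.

```python
import itertools
from dataclasses import dataclass

# Stand-ins for torch.distributed.tensor placement types.
@dataclass(frozen=True)
class Replicate:
    pass

@dataclass(frozen=True)
class Partial:
    reduce_op: str = "sum"

class Planner:
    def __init__(self, src_placements, dst_placements):
        # New field: collect only the reduce ops that actually appear
        # in the src/dst placements (previously hardcoded ("sum", "avg")).
        self.partial_reduce_ops_in_target: set[str] = set()
        for placement in itertools.chain(src_placements, dst_placements):
            if isinstance(placement, Partial):
                self.partial_reduce_ops_in_target.add(placement.reduce_op)

    def r_to_p_candidates(self, placements, mesh_dim):
        # R->P transition generation: one candidate state per relevant reduce op.
        for reduce_op in self.partial_reduce_ops_in_target:
            new_placements = list(placements)
            new_placements[mesh_dim] = Partial(reduce_op)
            yield tuple(new_placements)

planner = Planner([Replicate()], [Partial("max")])
print(planner.partial_reduce_ops_in_target)  # {'max'}
print(list(planner.r_to_p_candidates((Replicate(),), 0)))
```

With only `Partial("max")` in the target, the planner generates exactly one R->P candidate instead of expanding over every reduce op type.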
2. test/distributed/tensor/test_redistribute.py
Added test_replicate_to_partial_different_reduce_ops (lines 903-950):
- Tests that R->P transitions work correctly for all reduce op types (sum, avg, min, max).
- Verifies the local tensor content is correct based on the reduce_op semantics.
Added test_replicate_to_partial_planner_reduce_op_collection (lines 952-1054):
- Tests that the planner correctly collects reduce ops from src/dst placements.
- Verifies the optimization that avoids naively expanding the graph to include all reduce op types.
- Tests three scenarios: R->P("min"), P("max")->R, and multi-dimensional meshes with multiple Partial types.
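The semantic property the first test checks can be stated without DTensor at all: whatever per-rank local values an R->P transition produces, re-reducing them with reduce_op must reproduce the original replicated value. A toy check of that round-trip invariant follows; the partition scheme here is hypothetical, not the one the Partial placement actually uses.

```python
import math

def partition(value: float, reduce_op: str, world_size: int) -> list[float]:
    # Hypothetical per-rank shards whose reduction recovers `value`.
    if reduce_op == "sum":
        return [value / world_size] * world_size
    if reduce_op in ("avg", "min", "max"):
        # Identical copies: their avg/min/max is the value itself.
        return [value] * world_size
    raise ValueError(f"unsupported reduce_op: {reduce_op}")

def reduce_shards(shards: list[float], reduce_op: str) -> float:
    if reduce_op == "sum":
        return sum(shards)
    if reduce_op == "avg":
        return sum(shards) / len(shards)
    if reduce_op == "min":
        return min(shards)
    if reduce_op == "max":
        return max(shards)
    raise ValueError(f"unsupported reduce_op: {reduce_op}")

for op in ("sum", "avg", "min", "max"):
    shards = partition(10.0, op, world_size=4)
    assert math.isclose(reduce_shards(shards, op), 10.0)
print("round-trip invariant holds for sum/avg/min/max")
```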
Key Benefits
1. Dynamic reduce op handling: The planner now considers only reduce ops present in the actual redistribution request, rather than hardcoding specific reduce ops.
2. No unnecessary graph expansion: By only considering relevant reduce ops, the graph-based search avoids exploring paths that aren't needed.
3. Full reduce op support: All reduce op types (sum, avg, min, max, etc.) are now supported for R->P transitions, not just sum and avg.
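As a usage sketch, the user-visible effect is that redistributing to a non-sum partial no longer fails in the planner. The snippet below uses the public DTensor API and must run under torchrun with a process group available, so it is guarded; the function body is an assumption about typical usage, not taken from the PR's tests.

```python
import os

def demo():
    # Requires `torchrun --nproc-per-node=N this_script.py`.
    import torch
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Partial, Replicate, distribute_tensor

    world_size = int(os.environ["WORLD_SIZE"])
    mesh = init_device_mesh("cpu", (world_size,))
    x = distribute_tensor(torch.arange(8.0), mesh, [Replicate()])
    # Before this PR the planner only generated psum/pavg transitions,
    # so targeting Partial("max") would fail; now it is planned normally.
    y = x.redistribute(mesh, [Partial("max")])
    return y.placements

if os.environ.get("RANK") is not None:
    print(demo())
```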
```python
# present in the redistribution, avoiding unnecessary graph expansion.
for placement in itertools.chain(src_placements, dst_placements):
    if isinstance(placement, Partial):
        self.partial_reduce_ops_in_target.add(placement.reduce_op)
```
Shall we only support sum and avg for now? Otherwise the order will matter and generate wrong results. For now, we can error out if there is a Partial min, max, etc. in the src/dst placements.
I think in order to support ordering for Partials, it should be roughly something like below, where we map Partial to [(Partial type, mesh dim), ...]:
Partial: [(sum, 1), (max, 0), (sum, 2)]
and apply the push/pop rule. This can be future work.
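The bookkeeping sketched in this comment might look roughly like the following; this is purely illustrative and nothing like it is implemented in the PR.

```python
# Ordered record of applied partials as (reduce_op, mesh_dim) pairs.
# Because e.g. max-then-sum differs from sum-then-max, non-commuting
# reductions must be undone in reverse (LIFO) order.
partial_stack: list[tuple[str, int]] = []

def push_partial(reduce_op: str, mesh_dim: int) -> None:
    partial_stack.append((reduce_op, mesh_dim))

def pop_partial() -> tuple[str, int]:
    return partial_stack.pop()

# The example ordering from the comment: [(sum, 1), (max, 0), (sum, 2)].
push_partial("sum", 1)
push_partial("max", 0)
push_partial("sum", 2)
assert pop_partial() == ("sum", 2)
assert pop_partial() == ("max", 0)
```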
I'm confused. You remember that we agreed to ban mixed partials, right? #172609
Maybe your point is still valid: if we have Pmax in src and Psum in dst, I guess that is not strictly a 'mixed partial' situation. Will this situation require careful ordering in the graph search? Let's figure out a good test case to use for this.
Yes, "Pmax in src and Psum in dst" will cause a mixed Partial case during redistribution, even though the src and dst themselves don't have mixed Partials.
> Will this situation require careful ordering in the graph search?
I think so. Maybe the easiest way is to support only sum and avg in src/dst placements for now.
🤔 But don't we already support redistributing e.g. R->PMax in redistribute_local_tensor via the greedy path? Then this leaves an implementation gap in the planner?
I think there is a gap. In the greedily generated path, we will never have mixed Partials during redistribution, even if the src contains Partial(sum) and the dst contains Partial(max).
OK, I have a simple way to bridge the gap for the graph-based solution: forbid entering a state whose placements contain mixed Partials (https://github.com/pytorch/pytorch/pull/172479/changes#r2729869743). Then we can support Partial(max) in the dst placement! And we can always find a path without mixed Partials, as long as the src and dst don't contain mixed Partials.
```python
)
for reduce_op in self.partial_reduce_ops_in_target:
    new_placements = list(placements)
    new_placements[mesh_dim] = Partial(reduce_op)
```
We can prevent mixed Partials here:
```python
if len(set(p for p in new_placements if p.is_partial())) > 1:
    continue
```
In this way we can have different Partials in src and dst.
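The suggested guard can be exercised with toy placement classes; these are stand-ins for the real Placement types, and only `is_partial` and value equality matter here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Replicate:
    def is_partial(self) -> bool:
        return False

@dataclass(frozen=True)
class Partial:
    reduce_op: str = "sum"

    def is_partial(self) -> bool:
        return True

def has_mixed_partial(new_placements) -> bool:
    # Two distinct Partial values (different reduce ops) => mixed partials,
    # so the planner would `continue` past this candidate state.
    return len({p for p in new_placements if p.is_partial()}) > 1

assert not has_mixed_partial([Replicate(), Partial("sum")])
assert has_mixed_partial([Partial("sum"), Partial("max")])
assert not has_mixed_partial([Partial("max"), Partial("max")])
print("guard behaves as expected")
```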
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
Inplace ops for DTensor have a restriction: you're not allowed to redistribute the 'inplace' tensor. This means in some cases sharding propagation has to fail because the inplace input is not compatible with any of the possible sharding strategies. This PR makes sure this case raises the expected informative error rather than a confusing error about selecting min cost over an empty sharding strategies list. Pull Request resolved: #173572 Approved by: https://github.com/pianpwk ghstack dependencies: #172479
Pull Request resolved: #173567 Approved by: https://github.com/tianyu-l, https://github.com/fegin ghstack dependencies: #172479, #173572