
[DTensor] Add single-dim registration infra#170359

Closed
wconstab wants to merge 8 commits into gh/wconstab/475/base from gh/wconstab/475/head

Conversation


@wconstab wconstab commented Dec 13, 2025

Stack from ghstack (oldest at bottom):

This PR adds the register_single_dim_strategy util and hooks it up to sharding_propagator. It also adds tests for the registration.

Notes:

  • I haven't yet decided how multiple registrations should be handled. I plan to make it an error to register twice for the same op, whether via single_dim or regular strategies.
  • I took the cleanest integration path in sharding_prop for now, reusing as much code as possible from the existing 'op_strategy' case. This may need to change later when integrating find_min_cost.

@pytorch-bot

pytorch-bot bot commented Dec 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/170359

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 7212664 with merge base 1984725:


This comment was automatically generated by Dr. CI and updates every 15 minutes.

wconstab added a commit that referenced this pull request Dec 17, 2025
ghstack-source-id: a25f465
Pull Request resolved: #170359
wconstab added a commit that referenced this pull request Dec 18, 2025
ghstack-source-id: ded0a40
Pull Request resolved: #170359
@wconstab wconstab added the release notes: distributed (dtensor) release notes category label Dec 18, 2025

wconstab added a commit that referenced this pull request Dec 18, 2025
ghstack-source-id: a35cfe9
Pull Request resolved: #170359

@wconstab
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 18, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

majing921201 pushed a commit to majing921201/pytorch that referenced this pull request Dec 19, 2025
Pull Request resolved: pytorch#170359
Approved by: https://github.com/weifengpy
ghstack dependencies: pytorch#170615, pytorch#167677

Co-authored-by: Pian Pawakapan <pianpwk@meta.com>
pytorchmergebot pushed a commit that referenced this pull request Dec 19, 2025
Enforce tensor_meta is not None for new single-dim rules.

Allow tensor_meta to remain None for existing rules for now. We should
consider requiring tensor_meta in DTensorSpec in the future, but for now
we just limit the bleeding.
Pull Request resolved: #170827
Approved by: https://github.com/dolpm
ghstack dependencies: #170615, #167677, #170359
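A minimal sketch of the enforcement described in #170827, using placeholder types: TensorMeta here is a stand-in, not the real DTensor class, and check_tensor_meta is a hypothetical helper. New single-dim rules require tensor_meta, while legacy rules may still pass None.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TensorMeta:  # placeholder for DTensor's tensor metadata
    shape: Tuple[int, ...]

def check_tensor_meta(
    tensor_meta: Optional[TensorMeta], *, single_dim_rule: bool
) -> Optional[TensorMeta]:
    # New single-dim rules must carry tensor_meta; legacy rules may omit it
    # for now, per the commit message above.
    if single_dim_rule and tensor_meta is None:
        raise AssertionError("tensor_meta is required for single-dim rules")
    return tensor_meta

print(check_tensor_meta(None, single_dim_rule=False))  # None: legacy path allowed
```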
xgz2 pushed a commit that referenced this pull request Dec 22, 2025
xgz2 pushed a commit that referenced this pull request Dec 22, 2025
This reverts commit 32d0782.

Reverted #170359 on behalf of https://github.com/jeanschmidt: required to revert #167677, which is required to revert #170615, which is required to revert #170030 (see the comment on #170359).
xgz2 pushed a commit that referenced this pull request Dec 22, 2025
xgz2 pushed a commit that referenced this pull request Dec 22, 2025
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
weifengpy added a commit that referenced this pull request Jan 13, 2026
gen_einsum_strategies inserts the replicate strategy first:
https://github.com/pytorch/pytorch/blob/74b6a0efa359722def4b585d9d91fbc3a4bfa530/torch/distributed/tensor/_ops/_einsum_strategy.py#L121-L122

_select_min_cost_strategy chooses Replicate at equal cost.

This PR ensures consistent matmul results after switching to the single-dim strategy (#170359).

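The interaction can be illustrated with a toy selector (hypothetical code, not the actual _select_min_cost_strategy): Python's min() returns the first element among equal-cost candidates, so list order is the tie-breaker, and inserting Replicate first makes it win ties deterministically.

```python
def select_min_cost(strategies):
    # strategies: list of (name, estimated_cost) pairs.
    # min() keeps the first element among equal-cost candidates,
    # so list order decides ties.
    return min(strategies, key=lambda s: s[1])

candidates = [("Replicate", 4.0), ("Shard(0)", 4.0), ("Shard(1)", 7.0)]
print(select_min_cost(candidates)[0])  # Replicate wins the 4.0 tie
```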
pytorchmergebot pushed a commit that referenced this pull request Jan 14, 2026
Pull Request resolved: #172150
Approved by: https://github.com/wconstab
mattteochen pushed a commit to mattteochen/pytorch that referenced this pull request Jan 15, 2026
@github-actions github-actions bot deleted the gh/wconstab/475/head branch January 18, 2026 02:21
SergeyTyshkevich pushed a commit to SergeyTyshkevich/chart2 that referenced this pull request Jan 19, 2026

Labels

ci-no-td (Do not run TD on this PR), ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), Merged, release notes: distributed (dtensor), Reverted

5 participants