
[DeviceMesh] Use _flatten_rank_map to replace _flatten_mesh_list so that we don't need to compare root mesh (#166003) #166264

Closed
fduwjj wants to merge 1 commit into pytorch:main from fduwjj:export-D85526705

Conversation

@fduwjj
Contributor

@fduwjj fduwjj commented Oct 26, 2025

Summary:

Since we already share a flattened tensor `_rank_map` across all meshes derived from the same root mesh, we can use a flattened list of it to replace the comparison of root_mesh and flattened_mesh_list (with the same `_rank_map` and layout, the mesh tensor is guaranteed to be the same). This also wins back the CPU overhead added in #164510 and further simplifies the code.

We have a more ambitious universe-based change in #165680, but it needs more discussion and would be BC-breaking. We might eventually merge that PR, but probably not now; this change is not BC-breaking and will help concatenate and the 2D integration with concatenate.
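A minimal sketch of the idea, with invented names (`Layout`, `Mesh`) — the real DeviceMesh internals differ. Because every mesh derived from one root mesh shares the same `_rank_map`, equality can key off a cached flattened copy of that rank map plus the mesh's layout, with no need to walk back to and compare the root mesh:

```python
# Hypothetical sketch of the equality change described above; names and
# structure are invented, not the actual torch.distributed.device_mesh API.

class Layout:
    """Sizes + strides describing one mesh's view into the shared rank map."""
    def __init__(self, sizes, strides):
        self.sizes, self.strides = tuple(sizes), tuple(strides)

    def __eq__(self, other):
        return (self.sizes, self.strides) == (other.sizes, other.strides)


class Mesh:
    def __init__(self, rank_map, layout):
        # `rank_map` is shared by every mesh derived from one root mesh.
        self.layout = layout
        # Flatten once and cache: with the same rank map and the same layout,
        # the mesh tensor is guaranteed identical, so equality never needs to
        # rebuild the mesh tensor or compare root meshes.
        self._flatten_rank_map = tuple(rank_map)

    def __eq__(self, other):
        return (self._flatten_rank_map == other._flatten_rank_map
                and self.layout == other.layout)


root_ranks = [0, 1, 2, 3, 4, 5, 6, 7]
a = Mesh(root_ranks, Layout((2, 4), (4, 1)))
b = Mesh(list(root_ranks), Layout((2, 4), (4, 1)))
print(a == b)  # True: same flattened rank map + layout, no root-mesh check
```

Two meshes built from copies of the same rank map with the same layout compare equal in O(world size) tuple comparisons, which is the overhead this PR gives back relative to #164510.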


imported-using-ghimport

Test Plan: Imported from OSS

Differential Revision: D85526705

Pulled By: fduwjj

cc @H-Huang @awgu @wanchaol @fegin @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci

@pytorch-bot

pytorch-bot bot commented Oct 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166264

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 71f14a1 with merge base a2b6afe:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Oct 26, 2025
@meta-codesync

meta-codesync bot commented Oct 26, 2025

@fduwjj has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85526705.

@fduwjj fduwjj requested review from fegin and lw October 26, 2025 19:02
@fduwjj fduwjj added ciflow/trunk Trigger trunk jobs on your pull request release notes: DeviceMesh labels Oct 26, 2025
@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

yiming0416 pushed a commit that referenced this pull request Oct 27, 2025
…hat we don't need to compare root mesh (#166003) (#166264)

Summary:

Since we already share a flattened tensor `_rank_map` across all meshes derived from the same root mesh, we can use a flattened list of it to replace the comparison of root_mesh and flattened_mesh_list (with the same `_rank_map` and layout, the mesh tensor is guaranteed to be the same). This also wins back the CPU overhead added in #164510 and further simplifies the code.

We have a more ambitious universe-based change in #165680, but it needs more discussion and would be BC-breaking. We might eventually merge that PR, but probably not now; this change is not BC-breaking and will help concatenate and the 2D integration with concatenate.

cc H-Huang awgu wanchaol fegin wz337 wconstab d4l3k pragupta msaroufim dcci

imported-using-ghimport

Test Plan: Imported from OSS

Differential Revision: D85526705

Pulled By: fduwjj

Pull Request resolved: #166264
Approved by: https://github.com/XilunWu
tianrengao pushed a commit that referenced this pull request Oct 30, 2025
…hat we don't need to compare root mesh (#166003) (#166264)

drizzlezyk pushed a commit to Ascend/pytorch that referenced this pull request Jan 7, 2026
Co-authored-by: dilililiwhy <why.wuhuanyu@huawei.com>



# message auto-generated for no-merge-commit merge:
!28630 merge main_sync_20251202 into master

TORCH MAIN SYNC : strategy/rule registration refactoring (DTensor)

Created-by: dilililiwhy
Commit-by: dilililiwhy
Merged-by: ascend-robot
Description:

**What does this PR do / why do we need it**:
2.10.0.dev20251124


**Special notes for your reviewers**:
pytorch/pytorch#166264
pytorch/pytorch#167782
pytorch/pytorch#168221



See merge request: Ascend/pytorch!28630

Labels: ciflow/trunk, fb-exported, Merged, meta-exported, oncall: distributed, release notes: DeviceMesh
