[DTensor] no-op redistribution shouldn't create _TransformInfo#172924
[DTensor] no-op redistribution shouldn't create _TransformInfo#172924wconstab wants to merge 7 commits intogh/wconstab/505/basefrom
Conversation
Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/172924
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 1b71100 with merge base 4ac0624 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
…Info" Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. [ghstack-poisoned]
…Info" Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. [ghstack-poisoned]
| return local_tensor | ||
|
|
||
| # Short-circuit if placements are already equal (no-op redistribution) | ||
| if current_spec.placements == target_spec.placements: |
There was a problem hiding this comment.
@zpcore I confirmed that if I comment out these 2 lines, the tests all pass. With this short circuit, I get a hang in python test/distributed/tensor/test_redistribute.py DistributeWithDeviceOrderTest.test_ordered_redistribute, and I didn't debug it yet but I wonder if you can think of any reason why this could happen.
There was a problem hiding this comment.
This is because even though current_spec.placements == target_spec.placements, current_spec.shard_order and target_spec.shard_order can still be different, which requires a redistribution. We can update this line to:
if current_spec.placements == target_spec.placements and current_spec.shard_order == target_spec.shard_order:
There was a problem hiding this comment.
thanks, that makes sense!!
…Info" Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. [ghstack-poisoned]
…Info" Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. [ghstack-poisoned]
…Info" Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. [ghstack-poisoned]
…Info" Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. [ghstack-poisoned]
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. ghstack-source-id: b83774e Pull Request resolved: pytorch/pytorch#172924
Pull Request resolved: #172609 Approved by: https://github.com/zpcore, https://github.com/tianyu-l ghstack dependencies: #172924
…ch#172924) Previously, generate_greedy_transform_infos was happy to generate a TransformInfo with src placements == dst placements, which was pointless. Graph-based algo was safe, avoiding this. Pull Request resolved: pytorch#172924 Approved by: https://github.com/zpcore
) Pull Request resolved: pytorch#172609 Approved by: https://github.com/zpcore, https://github.com/tianyu-l ghstack dependencies: pytorch#172924
Stack from ghstack (oldest at bottom):
Previously, generate_greedy_transform_infos was happy to generate a
TransformInfo with src placements == dst placements, which was
pointless. Graph-based algo was safe, avoiding this.