
[DTensor] Prevent squeeze from redistributing with strict_view #175798

Closed
stmcgovern wants to merge 3 commits into pytorch:main from stmcgovern:fix-squeeze-inplace-local-tensor


Conversation

@stmcgovern
Collaborator

@stmcgovern stmcgovern commented Feb 25, 2026

Fixes #174136
When squeeze_.dim triggers redistribution (e.g. squeezing a globally-singleton sharded dim forces Shard(0) → Replicate), redistribute_local_args creates a new local tensor. The dispatch path updated _spec but never assigned the new tensor back to _local_tensor, leaving the DTensor in an inconsistent state where the spec claims shape [4] but the local tensor is still [1, 4].
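
A minimal repro sketch of the inconsistency described above (the 2-rank CPU mesh is an illustrative assumption; the linked issue has the original reproducer):

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard

# Assumes a 2-rank mesh; run under torchrun or a multiprocess test harness.
mesh = init_device_mesh("cpu", (2,))
x = torch.randn(1, 4)
dt = distribute_tensor(x, mesh, [Shard(0)])  # the globally-singleton dim 0 is sharded

dt.squeeze_(0)
# Before the fix: the op triggers Shard(0) -> Replicate, dt._spec then reports
# shape [4], but dt._local_tensor still holds the pre-redistribution local shard.
```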

Update: Prevent squeeze, as a view op, from doing a redistribution by using the strict_view keyword.

This PR syncs _spec and _local_tensor. It checks the inplace_view tag rather than special-casing squeeze_, so other inplace view ops (unsqueeze_, t_, transpose_) won't hit this bug once strategies are registered for them.

The C++ fast path in dispatchDTensorOp explicitly skips inplace ops, so no C++ change is needed.

@pytorch-bot

pytorch-bot Bot commented Feb 25, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175798

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ce19418 with merge base 9daaaa2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added the release notes: distributed (dtensor) label on Feb 25, 2026
@mikaylagawarecki mikaylagawarecki added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Mar 5, 2026
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch 2 times, most recently from e2dfacb to dd7b23b Compare March 10, 2026 14:30
@wconstab
Contributor

@claude review this PR

@claude

claude Bot commented Mar 10, 2026

Claude finished @wconstab's task in 4m 38s —— View job


PR Review: #175798

Summary

This PR fixes a real bug where squeeze_.dim (and squeeze_.dims) could leave a DTensor in an inconsistent state: _spec would be updated after redistribution but _local_tensor would still hold the pre-redistribution tensor. The fix generalizes the special case from squeeze_.dim-only to all inplace_view-tagged ops, which is the right approach.

Code Quality

The change is well-scoped and the generalization from a single op check (op_call == aten.squeeze_.dim) to a tag-based check (torch.Tag.inplace_view in op_call.tags) is a clear improvement. The comments are concise and useful.
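
For reference, a minimal sketch of the two checks being compared (op_call is the OpOverload being dispatched; this is illustrative, not the PR's exact code):

```python
import torch

# Example OpOverload; in dispatch this would be the op currently being executed.
op_call = torch.ops.aten.squeeze_.dim

# Old special case: only aten.squeeze_.dim got the post-op sync.
print(op_call == torch.ops.aten.squeeze_.dim)   # True

# New tag-based check: any op tagged inplace_view (squeeze_.dim, squeeze_.dims,
# and future inplace view ops once they gain strategies) takes the same path.
print(torch.Tag.inplace_view in op_call.tags)   # True
```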

One observation worth discussing:

  • torch/distributed/tensor/_dispatch.py:485-486 — The assertion isinstance(local_results, torch.Tensor) is good defensive code, but consider whether local_results could ever be None here. When participating is False, local_results remains None (set at _dispatch.py:286) and flows into _dispatch_fast_path_python_tail. However, this code path requires output_sharding.output_spec is not None (line 472) and is_inplace_op (line 470), and for a non-participating rank the local_results would be set to an empty tensor at line 418-420 (since output_spec is not None). So the assertion should hold. This is fine as-is.

  • Comment nit (_dispatch.py:480-483): The comment says "redistribution may produce a new local tensor" — this is accurate but could be slightly more precise. The key insight is that redistribute_local_args modifies op_info.local_args (creating a new tensor), and then the local op executes on that new tensor, so local_results points to a different tensor than args[0]._local_tensor. This is minor and not blocking.

Testing

  • Tests exist for the fix (test_squeeze_inplace_local_tensor)
  • Tests cover the redistribution case (squeeze singleton sharded dim) and the no-op case (squeeze non-singleton dim)
  • Tests verify both shape and _local_tensor.shape consistency, plus correctness via full_tensor()

The tests are well-structured and directly target the bug. Two suggestions:

  1. squeeze_.dims coverage: The dispatch fix now covers all inplace_view ops, and squeeze_.dims is a registered strategy (_view_ops.py:813-815). Consider adding a test case for squeeze_.dims since it can trigger the same redistribution path. For example: dt.squeeze_((0,)) on the same [1, 4] sharded tensor.

  2. Future inplace_view ops: The PR description notes that unsqueeze_, t_, transpose_ don't have registered strategies yet, so they won't hit this path. When strategies are added for those ops, they'll need tests too, but that's future work and out of scope here.

Security

No security concerns identified.

Thread Safety

No thread safety concerns — this code path is per-op dispatch, not shared mutable state.

Backward Compatibility

No BC-breaking changes. This is purely a bug fix:

  • Before: squeeze_ with redistribution left DTensor in an inconsistent state (bug)
  • After: squeeze_ with redistribution correctly syncs both _spec and _local_tensor

The generalization from aten.squeeze_.dim to torch.Tag.inplace_view is forward-compatible — any future inplace view op that gets a strategy will automatically benefit from this fix without additional dispatch changes.

Performance

No performance concerns. The torch.Tag.inplace_view in op_call.tags check is a set membership test on a small set and replaces an equality check — negligible difference. The additional _local_tensor assignment is a simple attribute write.

Recommendation

Approve

This is a correct, well-targeted bug fix with good test coverage. The generalization from a single-op special case to tag-based dispatch is the right design choice. The only minor suggestion is adding a squeeze_.dims test case for completeness, but it's not blocking.


Comment thread torch/distributed/tensor/_dispatch.py Outdated
# use return_and_correct_aliasing to match the outer and the inner
# aliasing. See https://github.com/pytorch/pytorch/pull/158954
if not isinstance(local_results, torch.Tensor):
    raise AssertionError
Contributor

nit: add text about what type the local result is

Comment thread torch/distributed/tensor/_dispatch.py Outdated
# update the spec to handle tensor meta changes
if torch.Tag.inplace_view in op_call.tags:
    # Inplace view ops may change tensor meta. Update both
    # spec and local tensor -- redistribution may produce a
Contributor

nit: the comment says 'redistribution may produce..' but I think in this path a redistribution is actually not happening; it's purely a metadata change, right?

dt = distribute_tensor(x, mesh, [Shard(0)])
dt.squeeze_(0)
self.assertEqual(dt.shape, torch.Size([4]))
self.assertEqual(dt._local_tensor.shape, torch.Size([4]))
Contributor

now I'm wondering: is this squeeze actually doing a redistribution, or just changing placements (a.k.a. a view operation)? Let's add a comm counter to disambiguate this in the test (and update the comment above if needed).

stmcgovern added a commit to stmcgovern/pytorch that referenced this pull request Mar 11, 2026
Respond to wconstab's review comments on PR pytorch#175798:

- Clarify comment about when/why local tensor writeback is needed
- Add error message to AssertionError for debuggability
- Add CommDebugMode assertions to verify allgather happens for
  Shard->Replicate squeeze and zero comms for no-op squeeze
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from dd7b23b to 071ed37 Compare March 11, 2026 14:18
stmcgovern added a commit to stmcgovern/pytorch that referenced this pull request Mar 11, 2026
Respond to wconstab's review comments on PR pytorch#175798:

- Clarify comment about when/why local tensor writeback is needed
- Add error message to AssertionError for debuggability
- Add CommDebugMode assertions to verify allgather happens for
  Shard->Replicate squeeze and zero comms for no-op squeeze
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from 071ed37 to 7ac3e4c Compare March 11, 2026 14:36
@stmcgovern stmcgovern requested a review from wconstab March 11, 2026 15:10
stmcgovern added a commit to stmcgovern/pytorch that referenced this pull request Mar 17, 2026
Respond to wconstab's review comments on PR pytorch#175798:

- Clarify comment about when/why local tensor writeback is needed
- Add error message to AssertionError for debuggability
- Add CommDebugMode assertions to verify allgather happens for
  Shard->Replicate squeeze and zero comms for no-op squeeze
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch 2 times, most recently from d88c851 to 67adb02 Compare March 17, 2026 18:39
@stmcgovern
Collaborator Author

reordered the checks to address the benchmark regression CI failure.

@stmcgovern
Collaborator Author

@wconstab this one is ready for review when you have the chance

dt = distribute_tensor(x, mesh, [Shard(0)])
comm_mode = CommDebugMode()
with comm_mode:
    dt.squeeze_(0)
Contributor

related to my original comment, it seems like this PR is going against the design for dtensor views. Views are supposed to be guaranteed fast and to share the same memory as the original tensor, so doing a redistribution implicitly during a view operation is semantics-breaking even if it's convenient in some cases. We prefer to error as unsupported in such cases and require the user to intervene: do the redistribution manually first, then perform a true view operation.
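
A hedged sketch of the flow preferred here (mesh setup is illustrative): redistribute explicitly first, then perform a communication-free view.

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard, Replicate

mesh = init_device_mesh("cpu", (2,))
dt = distribute_tensor(torch.randn(1, 4), mesh, [Shard(0)])

# Rather than letting squeeze_ allgather implicitly, the user redistributes
# explicitly (the communication is now visible), then performs a true view.
dt = dt.redistribute(mesh, [Replicate()])
dt = dt.squeeze(0)
```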

Collaborator Author

I see. Yes this PR is cleaning up after a redistribution. But that redistribution should be banned. I'll use strict_view=True for squeeze, following #149764

@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from 856a96d to 85dc8d8 Compare March 27, 2026 20:30
@pytorch-bot pytorch-bot Bot added the ciflow/dtensor, ciflow/inductor, and ciflow/torchtitan labels on Mar 27, 2026
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from 0885f50 to 364f8d0 Compare March 30, 2026 02:47
@zpcore zpcore self-requested a review April 2, 2026 18:26
@aditvenk aditvenk requested a review from pianpwk April 6, 2026 17:00
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from 12736e2 to ff4769b Compare April 10, 2026 15:28
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from ff4769b to 8582689 Compare April 13, 2026 15:45
@stmcgovern stmcgovern requested a review from weifengpy as a code owner April 13, 2026 15:45
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from 8582689 to 9b58744 Compare April 14, 2026 13:28
@stmcgovern
Collaborator Author

these failures seem unrelated. going to try to merge

@stmcgovern
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk label on Apr 14, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: 2 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team: raised by workflow job.

Failing merge rule: Core Maintainers

@stmcgovern stmcgovern changed the title from "[DTensor] Sync local tensor for inplace view ops after redistribution" to "[DTensor] Prevent squeeze from redistributing with strict_view" on Apr 14, 2026
@pytorch-bot pytorch-bot Bot added the ciflow/dtensor, ciflow/inductor, and ciflow/torchtitan labels on Apr 14, 2026
… allgather

Squeeze ops are views and must be communication-free. Enable
strict_view=True for all squeeze registrations so squeezing a sharded
singleton dim raises instead of silently allgathering.

Generalize the inplace dispatch handler to check the redistribute
schema for actual placement changes rather than special-casing
squeeze_ ops by name.

Fixes pytorch#174136
Use placement tuple comparison instead of DTensorSpec.__eq__ on the
inplace fast path. DTensorSpec.__eq__ compares mesh, placements,
shard_order, shape, stride, and dtype — unnecessary for the fast-path
decision where only placements matter. This restores the instruction
count to match main (56530 expected vs 66916 with __eq__).
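
A hedged sketch of the cheaper comparison this commit describes (spec variable names are illustrative, not the actual dispatch code):

```python
def placements_unchanged(arg_spec, output_spec) -> bool:
    # Compare only the placements tuples; DTensorSpec.__eq__ would also compare
    # mesh, shard_order, shape, stride, and dtype, which the fast path doesn't need.
    return arg_spec.placements == output_spec.placements
```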
…guard

Remove the needs_redistribute check for inplace ops in dispatch.
With strict_view=True on squeeze ops, sharding propagation already
errors when an inplace view would require redistribution of arg 0.

The remaining code handles two cases:
1. Placements unchanged (add_, mul_, etc.) — return self
2. Placements reindexed (squeeze_ removing non-sharded dim:
   Shard(1) → Shard(0)) — update spec, no communication
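
A hedged illustration of case 2 above (mesh size and shapes are assumptions): squeezing a non-sharded dim only reindexes the placement, with no communication.

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard

mesh = init_device_mesh("cpu", (2,))
x = torch.randn(1, 8)
dt = distribute_tensor(x, mesh, [Shard(1)])  # shard the non-singleton dim

dt.squeeze_(0)
# Removing dim 0 reindexes the sharded dim: Shard(1) -> Shard(0). Only the
# spec's placements and tensor meta change; the local shards are untouched.
```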
@stmcgovern stmcgovern force-pushed the fix-squeeze-inplace-local-tensor branch from 9b58744 to ce19418 Compare April 21, 2026 01:18
@stmcgovern
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.


Labels

ciflow/dtensor (Run DTensor specific tests)
ciflow/inductor
ciflow/torchtitan (Run TorchTitan integration tests)
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
open source
release notes: distributed (dtensor) (release notes category)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[DTensor] squeeze_.dim updates spec but doesn't squeeze local tensor

6 participants