[DTensor] layernorm output meta#175652
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175652
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit 1733897 with merge base e81980e. FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…_layer_norm, and native_layer_norm_backward

These three rules were carried as local overrides in autoparallel while upstream PyTorch lacked proper handling:

- constant_pad_nd: non-replicate strategy filtering on padded dims (upstreamed in pytorch/pytorch#175656)
- native_layer_norm forward: correct per-output shapes and contiguous strides (upstreamed in pytorch/pytorch#175652)
- native_layer_norm backward: contiguous stride handling for grad_input (upstreamed in a companion PR to pytorch/pytorch)

With all three fixes now in upstream PyTorch, the overrides can be removed and autoparallel defers to the upstream register_op_strategy implementations.

Authored with Claude.
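As a hedged eager-mode check, separate from either repository's code, the sketch below illustrates the backward property this commit refers to: grad_input returned by aten.native_layer_norm_backward has the input's shape and comes back contiguous. The tensor sizes are illustrative assumptions, not taken from the PR.

```python
import torch

# Toy input normalized over the last dim of size 8 (shapes chosen for illustration).
x = torch.randn(4, 16, 8)
weight, bias = torch.randn(8), torch.randn(8)
out, mean, rstd = torch.ops.aten.native_layer_norm(x, (8,), weight, bias, 1e-5)

grad_out = torch.randn_like(out)
grad_input, grad_weight, grad_bias = torch.ops.aten.native_layer_norm_backward(
    grad_out, x, (8,), mean, rstd, weight, bias, [True, True, True]
)

# grad_input matches the input's shape; the commit's concern is that sharding
# propagation report it with contiguous strides, which is what eager produces.
print(grad_input.shape, grad_input.is_contiguous())
```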
Stack from ghstack (oldest at bottom):
autoparallel has its own layernorm fwd/bwd registrations because PyTorch's version reports the wrong tensor meta for (out, mean, rstd), copying out's meta to all three. This PR creates separate metas for each output.
The other issue is that layernorm always produces contiguous outputs, which the propagated meta did not reflect; this is fixed here as well.
https://github.com/meta-pytorch/autoparallel/blob/454780d2a27456a380c0d8e997c8fc2cf82ef5d8/autoparallel/shardings/propagation_rules.py#L460-L611
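A minimal eager-mode sketch, not taken from the PR, of the behavior the fix encodes: native_layer_norm's (out, mean, rstd) do not share one shape, and out is contiguous even when the input is not. The tensor sizes below are illustrative assumptions.

```python
import torch

# Non-contiguous input of shape (4, 16, 8), normalized over the last dim of size 8.
x = torch.randn(4, 8, 16).transpose(1, 2)
weight, bias = torch.randn(8), torch.randn(8)

out, mean, rstd = torch.ops.aten.native_layer_norm(x, (8,), weight, bias, 1e-5)

# out matches the input shape and is contiguous regardless of x's strides.
print(out.shape, out.is_contiguous())
# mean/rstd are reduced over the normalized dims, so their metas must differ from out's.
print(mean.shape, rstd.shape)
```

Copying out's meta onto mean and rstd, as the old rule did, would report the wrong shapes for the two statistics, which is why the fix computes a separate meta per output.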