fix _clone_meta stride computation for torch.preserve_format #161400
morrison-turnansky wants to merge 4 commits into pytorch:main
Conversation
See artifacts and rendered test results at hud.pytorch.org/pr/161400
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 New Failures as of commit ba184c0 with merge base 69a25f6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "topic: not user facing"
I was directed to include you for review. Thank you in advance, @zou3519
zou3519 left a comment
Thank you for the PR. I think the semantics of preserve_format are a bit more complicated, though. Could you take a look please?
Force-pushed 28a91d8 to 4b0a80a
Force-pushed 4b0a80a to 70ca3c0
@zou3519 I updated the behavior. Following the docs, I separated the dense and non-dense cases and added a test for each. Please let me know if you would like any additional changes.
```python
if torch._prims_common.is_non_overlapping_and_dense(input):
    strides = input.stride()
else:
    strides = input.contiguous().stride()
```
I'm not completely sure this is correct. Also, assuming it is correct, we should update utils.compute_elementwise_output_strides -- other operators run into the same problem.
This is the function we use to compute strides when the input is not "non_overlapping_and_dense": https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/TensorIterator.cpp#L1276.
If we think it is exactly input.contiguous().stride(), then we should add some unit tests to check that.
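For intuition, the stride rule the linked TensorIterator code applies can be sketched in plain Python. This is an illustrative simplification only (it ignores broadcasting, size-1 dims, and stride ties; the function name is made up, not torch's API):

```python
def elementwise_output_strides(shape, in_strides):
    # Sketch: order dims from smallest to largest input stride,
    # then lay the output out densely in that order, so the
    # output preserves the input's memory order without its gaps.
    ndim = len(shape)
    perm = sorted(range(ndim), key=lambda d: in_strides[d])
    out = [0] * ndim
    acc = 1
    for d in perm:
        out[d] = acc
        acc *= shape[d]
    return tuple(out)

# Transposed-and-sliced input of shape (4, 5) with strides (2, 16):
print(elementwise_output_strides((4, 5), (2, 16)))  # (1, 4)
# Already-contiguous input keeps row-major strides:
print(elementwise_output_strides((4, 5), (5, 1)))   # (5, 1)
```

Note that for the non-dense input the result, (1, 4), differs from plain row-major (5, 1): the dimension with the smaller input stride stays the fastest-varying one.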
EDIT: .contiguous() calls exactly that function, so it is actually the correct thing. So my ask is that we should update utils.compute_elementwise_output_strides to call

```python
if torch._prims_common.is_non_overlapping_and_dense(input):
    strides = input.stride()
else:
    strides = input.contiguous().stride()
```
@zou3519 Thank you for checking the correct behavior. I updated utils.compute_elementwise_output_strides to call

```python
if torch._prims_common.is_non_overlapping_and_dense(input):
    strides = input.stride()
else:
    strides = input.contiguous().stride()
```

exactly when one tensor is given.
For the case of multiple tensors, it was not clear how to adapt this without more substantial changes: there is quite a bit of logic in compute_elementwise_output_logical_to_physical_perm. Let me know if you would like me to update the behavior there as well.
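As a toy illustration of why the multi-tensor case is harder (hypothetical values, not taken from the PR): two operands can disagree on memory order, so no single output permutation preserves both layouts, and the perm logic has to break the tie:

```python
# Two same-shaped (4, 5) operands with conflicting layouts:
a_strides = (5, 1)   # row-major
b_strides = (1, 4)   # column-major

# Ordering dims from fastest- to slowest-varying gives opposite
# permutations for the two operands:
perm_a = sorted(range(2), key=lambda d: a_strides[d])
perm_b = sorted(range(2), key=lambda d: b_strides[d])
print(perm_a, perm_b)  # [1, 0] [0, 1]
```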
No, let's fix the one tensor case first.
After we get this merged, I can look into the multiple tensor case if you'd like.
…tensor when exactly one tensor is given
Force-pushed 5d36a87 to ba184c0
```python
if len(tensors) == 1:
    if torch._prims_common.is_non_overlapping_and_dense(tensors[0]):
        return tensors[0].stride()
    else:
        return tensors[0].contiguous().stride()
```
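The is_non_overlapping_and_dense gate used here can be approximated in plain Python for intuition. This is a sketch under simplified assumptions (it skips the size-0 and overlap edge cases torch's real check handles):

```python
def is_non_overlapping_and_dense(shape, strides):
    # Sketch: sorted from fastest- to slowest-varying dim, each
    # stride must equal the product of the sizes of all
    # faster-varying dims, i.e. the layout has no gaps.
    dims = sorted(range(len(shape)), key=lambda d: strides[d])
    expected = 1
    for d in dims:
        if shape[d] == 1:
            continue  # size-1 dims place no constraint
        if strides[d] != expected:
            return False
        expected *= shape[d]
    return True

print(is_non_overlapping_and_dense((8, 10), (1, 8)))  # True  (transpose)
print(is_non_overlapping_and_dense((4, 5), (2, 16)))  # False (strided slice)
```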
@laithsakka @bobrenjc93 any dynamic shapes issues around implementing this like this? Otherwise, if the tests pass, I will assume we can ship this.
I took a look at the failing tests and am still figuring it out. It is an issue introduced by the change that shows up with inductor.
```python
def test_clone_not_memory_dense():
    def foo():
        x = torch.randn(10, 8).t()[::2, ::2]
        y = x.clone()
        return y

    y = foo()
    assert y.stride() == (1, 4)
    print("uncompiled")
    y = torch.compile(foo, backend="eager")()
    print("eager")
    assert y.stride() == (1, 4)
    y = torch.compile(foo, backend="aot_eager")()
    print("aot_eager")
    assert y.stride() == (1, 4)
    y = torch.compile(foo, backend="inductor")()
    print("inductor")
    print(y.stride())
    assert y.stride() == (1, 4)
```
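To see where the (1, 4) expectation comes from, the view chain's shape/stride bookkeeping can be traced by hand in plain Python (a sketch; the helper names are illustrative, not torch API):

```python
def t(shape, strides):
    # Transposing a 2-D view swaps both shape and strides.
    return shape[::-1], strides[::-1]

def every_other(shape, strides):
    # x[::2, ::2] halves each dim (rounding up) and doubles each stride.
    return (tuple((s + 1) // 2 for s in shape),
            tuple(st * 2 for st in strides))

shape, strides = (10, 8), (8, 1)          # torch.randn(10, 8) is row-major
shape, strides = t(shape, strides)        # (8, 10), (1, 8)
shape, strides = every_other(shape, strides)
print(shape, strides)  # (4, 5) (2, 16)
```

The resulting strides (2, 16) make dim 0 the fastest-varying one, so a clone that preserves the input's stride order produces (1, 4), whereas plain row-major contiguous strides for shape (4, 5) would be (5, 1); that gap is what the failing inductor case exposes.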
Hi @morrison-turnansky, thank you for the contribution! Since this issue is UBN we would like to act on it ASAP, so I have worked on top of your PR to generalize it and try to address this failure -- see #163017. Let's monitor signals there to see if that generalization works or has other unexpected failures, and push this across the finish line :)
@Lucaskabela Looks like the CI is passing on your new PR. Thanks for the help on this.
Of course, and thank you for all the work on this! I have your commits included in that PR for attribution, and it was extremely valuable since you had already implemented the matching semantics for the one-tensor case here :)
```python
return ()

if len(tensors) == 1:
    if torch._prims_common.is_non_overlapping_and_dense(tensors[0]):
```
We should probably rename is_non_overlapping_and_dense to is_non_overlapping_and_dense_or_false. The current behavior is fine, it's a reasonable definition of the unbacked semantics, but it would be good to ensure inductor behaves the same way with unbacked inputs.
@laithsakka Thank you for the note. These commits were merged in PR #163017; however, that PR did not fully capture all the correct behavior. When the follow-up work is implemented, I will make a note of this naming suggestion.
Continuation of work from #161400 and #163017. Updating stride semantics for `clone_meta` and the underlying function, `compute_elementwise_output_strides`. Pull Request resolved: #164252. Approved by: https://github.com/Lucaskabela
Fixes #161010
Fixed the stride issue for cloning a meta tensor.