Skip to content

Update compute elementwise output strides semantics#164252

Closed
morrison-turnansky wants to merge 6 commits intopytorch:mainfrom
morrison-turnansky:update-compute_elementwise_output_strides-semantics
Closed

Update compute elementwise output strides semantics#164252
morrison-turnansky wants to merge 6 commits intopytorch:mainfrom
morrison-turnansky:update-compute_elementwise_output_strides-semantics

Conversation

@morrison-turnansky
Copy link
Collaborator

@morrison-turnansky morrison-turnansky commented Sep 30, 2025

Continuation of work from #161400 and #163017.

Updating stride semantics for clone_meta and underlying function, compute_elementwise_output_strides.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @chenyang78

@pytorch-bot
Copy link

pytorch-bot bot commented Sep 30, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/164252

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 57f8206 with merge base 8d53968 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@morrison-turnansky
Copy link
Collaborator Author

@Lucaskabela I thought it would be cleaner to start on new PR.

The above fixes the case when one tensor is given. The reason this works is that torch.empty_like uses the c++ implementation of infer_dense_strides under the hood, but this requires some extra allocations.

As an alternative fix to avoid this extra cost, there is a python implementation of the above function, namely torch._inductor.kernel.flex.infer_dense_strides. In order to use it, I think we should/would have to move it out of _inductor. Let me know what you think.

@morrison-turnansky
Copy link
Collaborator Author

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Sep 30, 2025
@mikaylagawarecki mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 2, 2025
# If not, eager clone creates contiguous strides
computed_stride = None
if utils.is_non_overlapping_and_dense(input):
computed_stride = input.stride()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems removing this line is causing the unit test for this bug to fail :/

Copy link
Collaborator Author

@morrison-turnansky morrison-turnansky Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lucaskabela which failing test are you referring to?

The only failing test I saw in the CI was
test/test_dynamic_shapes.py: test_prims_non_overlapping_and_dense.

I had missed renaming one of the function calls. Updated and pushed change.

Copy link
Contributor

@Lucaskabela Lucaskabela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this change still has some error to work through

@Lucaskabela
Copy link
Contributor

I took a look at this function - I think we can refactor it out to leverage it here

@morrison-turnansky morrison-turnansky force-pushed the update-compute_elementwise_output_strides-semantics branch 2 times, most recently from d473557 to 17f9b97 Compare October 3, 2025 14:06
@morrison-turnansky morrison-turnansky force-pushed the update-compute_elementwise_output_strides-semantics branch from 2eaf53d to 7acf73c Compare November 4, 2025 16:53
@morrison-turnansky
Copy link
Collaborator Author

morrison-turnansky commented Nov 4, 2025

@Lucaskabela Circling back on this. I had fixed the failing test in the CI pass. If there are other changes that you want can you elaborate? Unclear what the intent of this comment was. which function?

I took a look at this function - I think we can refactor it out to leverage it here

@Lucaskabela
Copy link
Contributor

Hey @morrison-turnansky I don't recall what i was referring to here unfortunately - going to cc @zou3519 here since I will be on PTO next two weeks

@Lucaskabela Lucaskabela requested a review from zou3519 November 7, 2025 17:57
Copy link
Contributor

@Lucaskabela Lucaskabela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! I think we should add test cases that this change is covering our previous change is not, otherwise LGTM

@@ -897,9 +897,9 @@ def test_non_overlapping_and_dense_unbacked(self):
)
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @zou3519 had some extra test cases for us to consider here; if not, I still think it would be worth adding an additional test case or two to make sure we are doing the right thing here

Copy link
Collaborator Author

@morrison-turnansky morrison-turnansky Dec 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lucaskabela Following up. Any addtional action?

@Lucaskabela Lucaskabela self-requested a review November 7, 2025 17:59
@morrison-turnansky
Copy link
Collaborator Author

@Lucaskabela Enjoy the PTO!

@morrison-turnansky morrison-turnansky force-pushed the update-compute_elementwise_output_strides-semantics branch from 7acf73c to 57f8206 Compare December 9, 2025 16:59
@morrison-turnansky
Copy link
Collaborator Author

CI is passing. Please let me know if you want any additional changes, and/or resolve requested changes if you found my updates were sufficient. Thank you. Trying to tie this up before I go on PTO for end of year holidays.

Copy link
Contributor

@Lucaskabela Lucaskabela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change, I think this is generally good now :)

@morrison-turnansky
Copy link
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 5, 2026
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
Continuation of work from pytorch#161400 and pytorch#163017.

Updating stride semantics for ```clone_meta``` and underlying function, ```compute_elementwise_output_strides```.

Pull Request resolved: pytorch#164252
Approved by: https://github.com/Lucaskabela
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged module: inductor open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants