[Reland] dont decompose baddbmm by eellison · Pull Request #141045 · pytorch/pytorch

eellison · 2024-11-19T17:12:48Z

Stack from ghstack (oldest at bottom):

-> [Reland] dont decompose baddbmm #141045

Previously, the decomposition would upcast inputs to fp32. This led to a slowdown compared to eager, which runs in fp16. We also tried keeping the bmm in fp16 and upcasting only for the epilogue, but that led to worse numerics, because the bmm in eager does the epilogue entirely in fp32, without downcasting the bmm accumulator.
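
To illustrate the numerics point, here is a minimal scalar sketch, not the inductor or eager implementation. It assumes the fused kernel keeps the dot-product accumulator in fp32 through the `beta * input + alpha * acc` epilogue and downcasts once at the end, while the split variant rounds the bmm result to fp16 before the epilogue. The helper names (`fused_baddbmm`, `split_baddbmm`, `dot_fp32_acc`) are hypothetical, and precision is emulated with the standard-library `struct` module rather than with PyTorch.

```python
import struct


def _round(fmt, x):
    """Round a Python float to the precision of a struct float format."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def fp16(x):
    return _round("e", x)  # IEEE half precision


def fp32(x):
    return _round("f", x)  # IEEE single precision


def dot_fp32_acc(a, b):
    """Dot product of fp16-representable inputs with an fp32 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = fp32(acc + fp32(x * y))
    return acc


def fused_baddbmm(c, a, b, alpha=1.0, beta=1.0):
    """Assumed fused behavior: epilogue in fp32, one downcast at the end."""
    acc = dot_fp32_acc(a, b)
    return fp16(fp32(fp32(beta * c) + fp32(alpha * acc)))


def split_baddbmm(c, a, b, alpha=1.0, beta=1.0):
    """Assumed split behavior: bmm output rounded to fp16 before the epilogue."""
    prod = fp16(dot_fp32_acc(a, b))
    return fp16(fp32(fp32(beta * c) + fp32(alpha * prod)))


# One output element: a single row of batch1, column of batch2, and bias value.
a = [32.0, 0.5]
b = [64.0, 1.0]
c = -2048.0

fused = fused_baddbmm(c, a, b)  # accumulator stays 2048.5 until the final cast
split = split_baddbmm(c, a, b)  # 2048.5 is rounded to 2048.0 in fp16 first
print(fused, split)  # 0.5 0.0
```

The inputs are chosen so that the product, 2048.5, falls between two adjacent fp16 values (2048 and 2050). Rounding it before adding the bias discards the 0.5 that the fp32 epilogue preserves.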

Fix for #137897

Reland of #137904

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

[ghstack-poisoned]

pytorch-bot · 2024-11-19T17:12:51Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141045

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

[DomainsOnly] Jobs fail with GLIBC version not found

✅ No Failures

As of commit 19ed170 with merge base 3d61801:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: 403a9f7 Pull Request resolved: #141045

eellison · 2024-11-19T18:48:50Z

@pytorchbot merge

pytorchmergebot · 2024-11-19T18:50:39Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

eellison · 2024-11-19T19:13:32Z

@pytorchbot merge

pytorchmergebot · 2024-11-19T19:13:51Z

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job waited more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

pytorchmergebot · 2024-11-19T19:15:45Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Previously, the decomposition would upcast inputs to fp32. This led to a slowdown compared to eager, which runs in fp16. We also tried keeping the bmm in fp16 and upcasting only for the epilogue, but that led to worse numerics, because the bmm in eager does the epilogue entirely in fp32, without downcasting the bmm accumulator.

Fix for pytorch#137897
Reland of pytorch#137904
Pull Request resolved: pytorch#141045
Approved by: https://github.com/BoyuanFeng

[Reland] dont decompose baddbmm

19ed170

[ghstack-poisoned]

pytorch-bot Bot added ciflow/inductor module: inductor labels Nov 19, 2024

eellison added a commit that referenced this pull request Nov 19, 2024

[Reland] dont decompose baddbmm

3592055

ghstack-source-id: 403a9f7 Pull Request resolved: #141045

eellison added the topic: not user facing topic category label Nov 19, 2024

eellison requested a review from desertfire November 19, 2024 17:22

eellison mentioned this pull request Nov 19, 2024

Dont decompose aten.baddmm in inductor #137904

Closed

BoyuanFeng approved these changes Nov 19, 2024

View reviewed changes

pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 19, 2024

pytorchmergebot added the merging label Nov 19, 2024

pytorchmergebot closed this in 34e4205 Nov 19, 2024

pytorchmergebot added Merged and removed merging labels Nov 19, 2024

This was referenced Nov 21, 2024

[Break XPU] xpu: build fails for XPU backend due to outdated aoti_torch/generated/c_shim_xpu.h #141268

Closed

xpu: fix build after updates of torchgen/aoti/fallback_ops.py #141269

Closed

github-actions Bot deleted the gh/eellison/730/head branch December 20, 2024 02:05

eellison wants to merge 1 commit into gh/eellison/730/base from gh/eellison/730/head