[Reland] dont decompose baddbmm #141045

Closed
eellison wants to merge 1 commit into gh/eellison/730/base from gh/eellison/730/head

@eellison eellison commented Nov 19, 2024

Contributor

Stack from ghstack (oldest at bottom):

Previously the decomposition would upcast inputs to fp32. This led to a slowdown compared to eager, which would run in fp16. We also tried keeping the bmm in fp16 and upcasting only for the epilogue, but that led to worse numerics, because the bmm in eager would do the epilogue all in fp32 without a downcast in the bmm accumulator.

Fix for #137897

Reland of #137904
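
To illustrate the numerics point above, here is a minimal pure-Python sketch. It is not the Inductor or eager kernel code path. It assumes the semantics `out = beta * input + alpha * (batch1 @ batch2)` for a single output element and simulates fp16/fp32 rounding with the standard `struct` module. The helper names (`to_fp16`, `to_fp32`, `baddbmm_element`) and the `downcast_bmm` flag are hypothetical, and the input values were chosen to make the effect visible.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def baddbmm_element(inp, row, col, beta=1.0, alpha=1.0, downcast_bmm=False):
    """Compute one element of beta * inp + alpha * dot(row, col).

    All inputs are assumed to be fp16-representable. The dot product is
    accumulated in simulated fp32. If downcast_bmm is True, the accumulator
    is rounded to fp16 before the epilogue (the "split" variant); otherwise
    the epilogue runs on the fp32 accumulator (the "fused" variant).
    """
    acc = 0.0
    for a, b in zip(row, col):
        acc = to_fp32(acc + to_fp32(a * b))
    if downcast_bmm:
        acc = to_fp16(acc)  # intermediate rounding of the bmm result
    out = to_fp32(to_fp32(beta * inp) + to_fp32(alpha * acc))
    return to_fp16(out)  # final result is stored in fp16 either way


# Hypothetical inputs: the dot product is 32*64 + 0.5*1 = 2048.5, which is
# exact in fp32 but not representable in fp16 (spacing is 2 near 2048).
row, col, inp = [32.0, 0.5], [64.0, 1.0], -2048.0

fused = baddbmm_element(inp, row, col)
split = baddbmm_element(inp, row, col, downcast_bmm=True)
print(fused, split)  # 0.5 0.0
```

With these inputs, rounding the bmm result to fp16 before the epilogue drops the 0.5 that the fp32 accumulator still holds, which is the kind of discrepancy the description attributes to the split approach.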

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

pytorch-bot Bot commented Nov 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141045

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 19ed170 with merge base 3d61801:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

eellison added a commit that referenced this pull request Nov 19, 2024
ghstack-source-id: 403a9f7
Pull Request resolved: #141045
@eellison eellison added the topic: not user facing topic category label Nov 19, 2024
@eellison eellison requested a review from desertfire November 19, 2024 17:22
@eellison

Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 19, 2024
@pytorchmergebot

Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@eellison

Contributor Author

@pytorchbot merge

@pytorchmergebot

Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@pytorchmergebot

Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

pytorchmergebot pushed a commit to jakeharmon8/pytorch that referenced this pull request Nov 20, 2024
Previously the decomposition would upcast inputs to fp32. This led to a slowdown compared to eager, which would run in fp16. We also tried keeping the bmm in fp16 and upcasting only for the epilogue, but that led to worse numerics, because the bmm in eager would do the epilogue all in fp32 without a downcast in the bmm accumulator.

Fix for pytorch#137897

Reland of pytorch#137904

Pull Request resolved: pytorch#141045
Approved by: https://github.com/BoyuanFeng
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
@github-actions github-actions Bot deleted the gh/eellison/730/head branch December 20, 2024 02:05
3 participants