[PyTorch] Port ExecuTorch bfdot improvement back to ATen BlasKernel, Try #2 #137377
swolchok wants to merge 7 commits into gh/swolchok/649/base from
…Try #2 ExecuTorch's fork of BlasKernel.cpp grew bfdot support, complete with demonstration that it helps. Port it back to PyTorch. First attempt was #136331 . Differential Revision: [D63923166](https://our.internmc.facebook.com/intern/diff/D63923166/) [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/137377
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit fcb5288 with merge base de4c2a3.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D63923166
…Try #2 ExecuTorch's fork of BlasKernel.cpp grew bfdot support, complete with demonstration that it helps. Port it back to PyTorch. First attempt was #136331 . Differential Revision: [D63923166](https://our.internmc.facebook.com/intern/diff/D63923166/) ghstack-source-id: 246411194 Pull Request resolved: #137377
@pytorchbot label "ciflow/linux-aarch64"
…Try #2 Pull Request resolved: #137377 ExecuTorch's fork of BlasKernel.cpp grew bfdot support, complete with demonstration that it helps. Port it back to PyTorch. First attempt was #136331 . ghstack-source-id: 246616406 Differential Revision: [D63923166](https://our.internmc.facebook.com/intern/diff/D63923166/)
What's the difference with the previous attempt?
aten/src/ATen/native/BlasKernel.cpp (outdated)
      return reduce(sum);
    }

    // NOTE: The first attempt at landing BFDOT support with
Oh, thanks for the pointer!
Should we #undef DOT_WITH_FP32_ARITH_TAIL_AFTER_MAIN_LOOP_BODY?
This is a .cpp file, so the macro isn't going to leak anywhere and the #undef isn't strictly necessary, but sure, I can do that.
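For readers following the exchange about the macro, here is a minimal sketch of the pattern under discussion: a helper macro defined and used inside one .cpp file, then removed with #undef once it is no longer needed. The macro `ACCUMULATE_TAIL` and the function `dot_with_tail` are hypothetical names chosen for illustration; they are not the code in BlasKernel.cpp.

```cpp
#include <cstddef>

// Hypothetical helper macro (illustration only): accumulates the leftover
// elements that the unrolled main loop did not cover.
#define ACCUMULATE_TAIL(sum, a, b, begin, end)        \
  for (std::size_t i = (begin); i < (end); ++i) {     \
    (sum) += (a)[i] * (b)[i];                         \
  }

// Hypothetical function: dot product with a 4-way unrolled main loop
// followed by a scalar tail.
float dot_with_tail(const float* a, const float* b, std::size_t n) {
  float sum = 0.0f;
  const std::size_t main_end = n - (n % 4);
  for (std::size_t i = 0; i < main_end; i += 4) {
    sum += a[i] * b[i] + a[i + 1] * b[i + 1] +
           a[i + 2] * b[i + 2] + a[i + 3] * b[i + 3];
  }
  ACCUMULATE_TAIL(sum, a, b, main_end, n)
  return sum;
}

// The macro is only visible in this translation unit, so this is tidiness
// rather than a correctness fix; it prevents accidental reuse further down
// the same file.
#undef ACCUMULATE_TAIL
```

Because macros are scoped to the translation unit being preprocessed, the #undef matters mainly if the definition lives in a header or if later code in the same file might pick up the name by accident.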
…Try #2 Pull Request resolved: #137377 ExecuTorch's fork of BlasKernel.cpp grew bfdot support, complete with demonstration that it helps. Port it back to PyTorch. First attempt was #136331 . ghstack-source-id: 246956192 Differential Revision: [D63923166](https://our.internmc.facebook.com/intern/diff/D63923166/)
CI is 100% green; please review
If CI is green, then sure, LGTM
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours).
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team
Stack from ghstack (oldest at bottom):
defined(__aarch64__) && !defined(CPU_CAPABILITY_SVE256) instead of defined(CPU_CAPABILITY_NEON) #137722

ExecuTorch's fork of BlasKernel.cpp grew bfdot support, complete with demonstration that it helps. Port it back to PyTorch. First attempt was #136331.
Differential Revision: D63923166
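The description does not spell out what a bfloat16 dot product computes, so here is a rough scalar reference, assuming inputs are stored as raw 16-bit bfloat16 bit patterns and accumulated in fp32. The names `bf16_bits_to_float` and `bf16_dot_reference` are hypothetical; this is not the NEON kernel from this PR, and it does not reproduce the rounding behavior of the hardware BFDOT instruction.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: bfloat16 is the upper 16 bits of an IEEE-754
// float32, so widening is a 16-bit left shift of the bit pattern.
inline float bf16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &widened, sizeof(out));
  return out;
}

// Hypothetical scalar reference: multiply element pairs after widening to
// fp32 and accumulate the products in an fp32 sum.
inline float bf16_dot_reference(const std::uint16_t* a,
                                const std::uint16_t* b,
                                std::size_t n) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    sum += bf16_bits_to_float(a[i]) * bf16_bits_to_float(b[i]);
  }
  return sum;
}
```

A reference like this is mostly useful as a baseline when checking a vectorized kernel; results may differ slightly from a hardware path because the order and rounding of the accumulation are not the same.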