[PyTorch] Hook up fp16_gemv_trans to gemv fast path for non-aarch64 architectures#138005
[PyTorch] Hook up fp16_gemv_trans to gemv fast path for non-aarch64 architectures#138005swolchok wants to merge 12 commits intogh/swolchok/667/basefrom
Conversation
…rchitectures Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138005
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 0bd8d1f with merge base 86602a6 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
…rchitectures Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) ghstack-source-id: 248140851 Pull Request resolved: #138005
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
…rchitectures Pull Request resolved: #138005 Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. ghstack-source-id: 249911241 @exported-using-ghexport Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/)
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
malfet
left a comment
There was a problem hiding this comment.
Sure, but we should have half_one and half_zero constants somewhere to avoid doing casts all the time
| alpha.x == fp16_ieee_from_fp32_value(1.0f) && | ||
| beta.x == fp16_ieee_from_fp32_value(0.0f); |
There was a problem hiding this comment.
fp16_ieee_from_fp32_value calls are very slow, but we have one and zero constants, don't we?
There was a problem hiding this comment.
fp16_ieee_from_fp32_value calls are very slow
above comment demonstrates it's constant-foldable
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
…n-aarch64 architectures" Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) [ghstack-poisoned]
|
This pull request was exported from Phabricator. Differential Revision: D64351092 |
No real reason to have the zero-beta restriction, so let's lift it. Testing: intentionally broke new paths locally to verify test coverage existed Differential Revision: [D64407752](https://our.internmc.facebook.com/intern/diff/D64407752/) Pull Request resolved: #138275 Approved by: https://github.com/malfet ghstack dependencies: #139082, #139083, #137918, #138005
…rchitectures (pytorch#138005) Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv. Differential Revision: [D64351092](https://our.internmc.facebook.com/intern/diff/D64351092/) Pull Request resolved: pytorch#138005 Approved by: https://github.com/malfet ghstack dependencies: pytorch#139082, pytorch#139083, pytorch#137918
No real reason to have the zero-beta restriction, so let's lift it. Testing: intentionally broke new paths locally to verify test coverage existed Differential Revision: [D64407752](https://our.internmc.facebook.com/intern/diff/D64407752/) Pull Request resolved: pytorch#138275 Approved by: https://github.com/malfet ghstack dependencies: pytorch#139082, pytorch#139083, pytorch#137918, pytorch#138005
Stack from ghstack (oldest at bottom):
Following up on previous rev to use fp16_gemv_trans in gemv, not just gemm-used-for-gemv.
Differential Revision: D64351092