[BLAS] Avoid downcasts for fp16×fp16→fp32 BLAS #161999

malfet wants to merge 1 commit into gh/malfet/504/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161999

Note: links to docs will display an error until the docs builds have completed.

⏳ No Failures, 3 Pending as of commit c3df264 with merge base 6737e2c. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Starting merge as part of PR stack under #162001
Follow-up after #154012. Since the introduction of `gemm_no_downcast_stub`, it is no longer necessary to allocate a temporary array and then manually implement the `beta` logic in the codebase.

Pull Request resolved: #162001
Approved by: https://github.com/drisspg
ghstack dependencies: #161999
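The distinction can be illustrated with a minimal NumPy sketch (the function names and shapes here are illustrative, not the PyTorch internals): the old path materialized the fp16×fp16 product in a temporary fp16 array before applying the BLAS `beta` update `C := alpha*A@B + beta*C` by hand, so sums that exceed fp16's maximum (~65504) overflow to inf; accumulating in fp32 and applying `beta` in one pass avoids that.

```python
import numpy as np

def gemm_with_downcast(alpha, a, b, beta, c):
    # Old path (sketch): the fp16*fp16 product is stored in a temporary
    # fp16 array, and the beta logic is applied manually afterwards.
    tmp = (a.astype(np.float32) @ b.astype(np.float32)).astype(np.float16)
    return alpha * tmp.astype(np.float32) + beta * c

def gemm_no_downcast(alpha, a, b, beta, c):
    # New path (sketch): accumulate in fp32 and apply beta in one pass,
    # with no fp16 intermediate.
    acc = a.astype(np.float32) @ b.astype(np.float32)
    return alpha * acc + beta * c

rng = np.random.default_rng(0)
# Each dot product sums 256 terms of roughly 2500-3600, i.e. well above
# fp16's maximum finite value (~65504), so the fp16 temporary overflows.
a = rng.uniform(50, 60, (4, 256)).astype(np.float16)
b = rng.uniform(50, 60, (256, 4)).astype(np.float16)
c = np.zeros((4, 4), dtype=np.float32)

print(np.isinf(gemm_with_downcast(1.0, a, b, 1.0, c)).any())  # True
print(np.isinf(gemm_no_downcast(1.0, a, b, 1.0, c)).any())    # False
```

With `gemm_no_downcast` the output is fp32 end to end, so no temporary array is needed and `beta * C` folds into the same expression.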
@pytorchbot revert -m "break a few internal tests" -c ghfirst
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit b40d943. Reverted #162001 on behalf of https://github.com/jeanschmidt due to "break a few internal tests" (comment on #161999).

This reverts commit 02c83f1. Reverted #161999 on behalf of https://github.com/jeanschmidt due to "break a few internal tests" (comment on #161999).
@malfet your PR has been successfully reverted.
@pytorchbot merge -f "Not sure why it was reverted in the 1st place..."
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Follow-up after pytorch#154012. Fixes the CPU part of pytorch#160841.

Pull Request resolved: pytorch#161999
Approved by: https://github.com/drisspg
Discovered while debugging pytorch#160841, where sdpa returned NaNs because intermediate values were cast back to fp16 before normalization (fixed by pytorch#161999).

Pull Request resolved: pytorch#162401
Approved by: https://github.com/Skylion007, https://github.com/drisspg
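The NaN mechanism can be reproduced with a small NumPy sketch (illustrative only, not the sdpa implementation): if the attention scores are rounded to fp16 before the softmax normalization, scores beyond fp16 range become inf, and the stable-softmax shift `x - max(x)` then computes `inf - inf = NaN`; keeping the scores in fp32 until after normalization stays finite.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax along the last axis.
    shifted = x - x.max(axis=-1, keepdims=True)  # inf - inf -> NaN
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

d = 256
rng = np.random.default_rng(1)
# Every q.k dot product is at least 70*70*256/16 = 78400 > 65504,
# so an fp16 intermediate is guaranteed to overflow to inf.
q = rng.uniform(70, 80, (1, d)).astype(np.float16)
k = rng.uniform(70, 80, (8, d)).astype(np.float16)

scores = q.astype(np.float32) @ k.T.astype(np.float32) / np.sqrt(d)

# Old behaviour (sketch): scores round-tripped through fp16 overflow,
# and the softmax shift turns inf into NaN.
scores_fp16 = scores.astype(np.float16)
print(np.isnan(softmax(scores_fp16.astype(np.float32))).any())  # True

# Fixed behaviour: scores kept in fp32 until after normalization.
scores_fp32 = scores
print(np.isnan(softmax(scores_fp32)).any())                     # False
```

This is the same root cause as the GEMM fix above the fold: the downcast itself is what destroys the information the normalization needs.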
Stack from ghstack (oldest at bottom):
- gemm #162001

Follow-up after #154012

Fixes CPU part of #160841
cc @jianyuh @nikitaved @mruberry @walterddr @xwang233 @lezcano