[ROCm] Use ieee precision for fp32 in flex attention #135702
jataylo wants to merge 1 commit into pytorch:main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135702
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 Cancelled Jobs, 5 Unrelated Failures
As of commit eddebed with merge base 6700175:
CANCELLED JOBS - The following jobs were cancelled. Please retry:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Hmm, failures are probably not related. I'll rebase and see if they are green
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here
Successfully rebased; the branch was force-pushed from 748e495 to eddebed.
@pytorchbot merge -f "Fix ROCm CI failures in inductor/test_flex_encoding.py"
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use `-f` as a last resort and instead consider `-i/--ignore-current` to continue the merge, ignoring current failures. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
pytorch@3bebc09
A recent change to flex_attention allowed TF32 precision; TF32 largely lacks support on the ROCm side, so we should use ieee.
Pull Request resolved: pytorch#135702
Approved by: https://github.com/jeffdaily, https://github.com/drisspg
The cherry-pick PR is at #136557
* [ROCm] skip test_fp8_cast_and_t on non-MI300 machines (#135917)
  Fixes #ISSUE_NUMBER
  Pull Request resolved: #135917
  Approved by: https://github.com/malfet
  (cherry picked from commit 6cdc70b)
* Skip pointwise associative scan tests due to regression (changes based on PR #135995)
* Cherry-pick fix from #135702

Co-authored-by: Prachi Gupta <prachi.gupta@amd.com>
Co-authored-by: Jithun Nair <jithun.nair@amd.com>
3bebc09
A recent change to flex_attention allowed TF32 precision; TF32 largely lacks support on the ROCm side, so we should use ieee.
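For context, the substance of the one-commit change is to stop requesting TF32 for fp32 inputs in the flex attention Triton kernel on ROCm and request full IEEE fp32 instead. Below is a minimal sketch of that platform gate, not the PR's actual diff: the helper name `flex_attention_fp32_precision` is hypothetical, the "tf32"/"ieee" strings match the values recent Triton versions accept for `tl.dot(..., input_precision=...)`, and the real PR wires an equivalent choice into the inductor kernel template rather than a standalone function.

```python
import torch

def flex_attention_fp32_precision() -> str:
    """Pick the dot-product precision to request for fp32 inputs.

    Hypothetical helper for illustration only.
    """
    # torch.version.hip is a version string on ROCm builds and None on
    # CUDA/CPU builds. TF32 is an NVIDIA TensorFloat-32 format that largely
    # lacks ROCm support, so request full IEEE fp32 there instead.
    if torch.version.hip is not None:
        return "ieee"
    # On CUDA, honor the global TF32 matmul toggle.
    return "tf32" if torch.backends.cuda.matmul.allow_tf32 else "ieee"
```

Requesting "ieee" trades TF32 throughput for full fp32 accuracy, which is the sensible default on ROCm where TF32 kernels are largely unavailable.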
cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @hongxiayang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang