[aoti] refactor aoti_torch__scaled_mm and skip aoti fp8 test for some cases#130868
henrylhtsang wants to merge 1 commit into pytorch:main
This pull request was exported from Phabricator. Differential Revision: D59833074
chenyang78 left a comment
On second thought, we may want to add a new API with a "_v2" suffix instead, since changing the arguments breaks backward compatibility.
cc @desertfire
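The compatibility concern can be illustrated with a toy sketch, written in Python for brevity rather than in the shim's own C interface. Everything here is hypothetical: `scaled_mm_v1` and `scaled_mm_v2` are stand-in names, not the real shim functions, and the old two-value return (result plus a max-abs value) is an assumption made only to show a signature change. The point is that the old entry point keeps its shape and can delegate to the new one, so existing callers are not broken.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def scaled_mm_v2(a: Matrix, b: Matrix, scale_a: float, scale_b: float) -> Matrix:
    """Hypothetical new entry point: returns a single result."""
    inner, cols = len(b), len(b[0])
    scale = scale_a * scale_b
    return [
        [scale * sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def scaled_mm_v1(
    a: Matrix, b: Matrix, scale_a: float, scale_b: float
) -> Tuple[Matrix, float]:
    """Hypothetical old entry point: same two-value return as before.

    It delegates to v2 and derives the second value itself, so callers
    written against the old signature keep working unchanged.
    """
    out = scaled_mm_v2(a, b, scale_a, scale_b)
    amax = max(abs(x) for row in out for x in row)  # assumed meaning of the 2nd value
    return out, amax
```

Callers of the old function are untouched, while new callers can adopt the single-return form at their own pace.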
Force-pushed from 9bcea20 to 183edd4
Summary: Pull Request resolved: #130868 Differential Revision: D59833074
@chenyang78 I added the v2 functions. Should I modify the implementation of v1 to use that of v2 as well? Also, does AOT Inductor know to use v2 automatically? EDIT: or should I do something like https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp_wrapper_cpu.py#L1177
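One way the code generator could pick the versioned shim explicitly is sketched below. This is not taken from the linked `cpp_wrapper_cpu.py` line; the helper names `shim_function_name` and `emit_shim_call` are hypothetical, and the `_v2` spelling simply follows the suffix suggested earlier in this thread.

```python
from typing import Sequence


def shim_function_name(base_name: str, use_v2: bool) -> str:
    """Return the shim symbol to call (hypothetical helper)."""
    return f"{base_name}_v2" if use_v2 else base_name


def emit_shim_call(base_name: str, args: Sequence[str], use_v2: bool) -> str:
    """Render one line of generated C++ that calls the chosen shim symbol."""
    name = shim_function_name(base_name, use_v2)
    return f"{name}({', '.join(args)});"


# Example: the same argument list routed to either version.
print(emit_shim_call("aoti_torch__scaled_mm", ["a", "b", "&out"], use_v2=True))
```

With a switch like this, the choice between v1 and v2 is made in one place in the wrapper code generator instead of being implied by the shim's signature.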
Force-pushed from 183edd4 to 6a3d664
Verified with internal testing that the tests pass. For those interested, I tested with D59833073.
Force-pushed from 6a3d664 to d773a7c
@pytorchbot merge
Merge failed. Reason: this PR is missing a required label.
@pytorchbot label "topic: not user facing"
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job waited more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).
… cases (pytorch#130868)
Continuing pytorch#128683 and pytorch#130582.
The API of `_scaled_mm` has changed; for example, it now has only one return value. This change updates the AOTI API to match.
The fp8 tests were also run offline. `test_fp8_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface` fails with `error: use of undeclared identifier 'float8_e4m3fn'` and `error: use of undeclared identifier 'half'`, so those cases are skipped for now. This probably went unnoticed earlier because CI does not run on H100.
Pull Request resolved: pytorch#130868
Approved by: https://github.com/drisspg, https://github.com/chenyang78, https://github.com/desertfire
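Skipping a known-failing case while leaving the rest of a suite running can be sketched with the standard `unittest` module. This is a generic illustration, not the mechanism this PR uses in the PyTorch test suite; the class name, the test names, and the `FP8_WRAPPER_COMPILE_BROKEN` flag are all hypothetical.

```python
import unittest

# Hypothetical flag standing in for "this configuration cannot compile the fp8 code".
FP8_WRAPPER_COMPILE_BROKEN = True


class Fp8SketchTests(unittest.TestCase):
    @unittest.skipIf(
        FP8_WRAPPER_COMPILE_BROKEN,
        "use of undeclared identifier 'float8_e4m3fn' / 'half'",
    )
    def test_fp8_minimal_arrayref(self):
        # Never runs while the flag is set; the skip is recorded with its reason.
        self.fail("would fail to compile in this configuration")

    def test_unaffected_case(self):
        self.assertEqual(1 + 1, 2)


def run_suite() -> unittest.TestResult:
    """Run the sketch suite and return the collected result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Fp8SketchTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Recording the compiler error in the skip reason keeps the cause visible in test reports until the underlying problem is fixed.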
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang