
[aoti] refactor aoti_torch__scaled_mm and skip aoti fp8 test for some cases#130868

Closed
henrylhtsang wants to merge 1 commit into pytorch:main from henrylhtsang:export-D59833074

Conversation

@henrylhtsang
Contributor

henrylhtsang commented Jul 16, 2024

Continuing #128683 and #130582.

The API of _scaled_mm has changed; for example, it now returns a single value. This change updates the AOTI API to match.

I also ran the fp8 tests offline. test_fp8_abi_compatible_cpu_with_stack_allocation_and_minimal_arrayref_interface fails with `error: use of undeclared identifier 'float8_e4m3fn'` and `error: use of undeclared identifier 'half'`, so those cases are skipped for now.

The reason this wasn't caught earlier is probably that CI doesn't run on H100 GPUs.
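For readers following along, the shape of the signature change can be sketched with hypothetical pure-Python stand-ins (these only illustrate the calling convention; the real `_scaled_mm` is an fp8 matmul kernel on GPU, and the names below are invented for illustration):

```python
# Hypothetical stand-ins illustrating the return-signature change.
# Older convention: the op returned a tuple (output, amax).
# Newer convention: the op returns only the output.

def scaled_mm_old(a, b, scale_a, scale_b):
    """Old-style convention: returns (output, amax)."""
    out = [[sum(x * y for x, y in zip(row, col)) * scale_a * scale_b
            for col in zip(*b)] for row in a]
    amax = max(abs(v) for row in out for v in row)
    return out, amax

def scaled_mm_new(a, b, scale_a, scale_b):
    """New-style convention: single return value."""
    out, _ = scaled_mm_old(a, b, scale_a, scale_b)
    return out

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the product equals a
result = scaled_mm_new(a, b, 1.0, 1.0)  # one return value, no amax
```

Any AOTI shim that still unpacks two values from the op would break against the new convention, which is why the shim signature had to change with it.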

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

@pytorch-bot

pytorch-bot bot commented Jul 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/130868

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit d773a7c with merge base e880cb2:

FLAKY: The following job failed but was likely due to flakiness present on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D59833074

@henrylhtsang henrylhtsang changed the title [aoti] refactor _scaled_mm and skip fp8 test for some cases [aoti] refactor aoti_torch__scaled_mm and skip fp8 test for some cases Jul 16, 2024
@henrylhtsang henrylhtsang changed the title [aoti] refactor aoti_torch__scaled_mm and skip fp8 test for some cases [aoti] refactor aoti_torch__scaled_mm and skip aoti fp8 test for some cases Jul 16, 2024
chenyang78 left a comment

On second thought, we may actually want to add a new API with a "_v2" suffix, since changing the arguments breaks backward compatibility.

cc @desertfire
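The "_v2" pattern being suggested can be sketched as follows. This is a hypothetical pure-Python illustration (the real shim entry points are C-ABI functions returning an error status); the function names and signatures below are invented for the sketch, not the actual shim API:

```python
# Hypothetical sketch of the _v2 versioning pattern: the v1 entry point
# keeps its original signature so previously compiled artifacts still
# link against it, while new code targets the _v2 entry point.

def aoti_scaled_mm_v2(a, b, scale_a, scale_b, out_buf):
    """v2: updated signature with a single output buffer."""
    for i, row in enumerate(a):
        for j, col in enumerate(zip(*b)):
            out_buf[i][j] = sum(x * y for x, y in zip(row, col)) * scale_a * scale_b
    return 0  # status code, in the spirit of AOTITorchError

def aoti_scaled_mm(a, b, scale_a, scale_b, out_buf, amax_buf):
    """v1: original signature (output + amax), kept for backward
    compatibility and implemented on top of v2."""
    status = aoti_scaled_mm_v2(a, b, scale_a, scale_b, out_buf)
    amax_buf[0] = max(abs(v) for row in out_buf for v in row)
    return status
```

Implementing v1 as a thin wrapper over v2 (as asked about below) keeps a single source of truth for the kernel call while preserving the old symbol.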

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D59833074

pytorch-bot bot pushed a commit that referenced this pull request Jul 17, 2024
Summary: Pull Request resolved: #130868

Differential Revision: D59833074
@henrylhtsang
Contributor Author

henrylhtsang commented Jul 17, 2024

@chenyang78 I added the v2 functions. Should I modify the implementation of v1 to use that of v2 as well?

By the way, does AOT Inductor know to use v2 automatically?

EDIT: or should I do something like https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/cpp_wrapper_cpu.py#L1177
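One way a codegen layer could pick the newer entry point automatically is a simple feature check, similar in spirit to the dispatch logic linked above. This is a hypothetical sketch with invented names, not the actual cpp_wrapper_cpu.py code:

```python
# Hypothetical sketch: prefer the _v2 entry point when the shim
# provides it, otherwise fall back to the original symbol.

def resolve_scaled_mm(shim_module):
    """Return the best available scaled_mm entry point on the shim."""
    fn = getattr(shim_module, "aoti_torch_scaled_mm_v2", None)
    if fn is not None:
        return fn  # newer signature available
    return shim_module.aoti_torch_scaled_mm  # legacy fallback
```

The alternative (as done in the linked codegen) is to hardcode the choice at code-generation time once the minimum supported shim version is known.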

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D59833074

@henrylhtsang
Contributor Author

henrylhtsang commented Jul 18, 2024

Verified with internal testing that the tests are passing.

For those interested, I tested with D59833073.

@henrylhtsang henrylhtsang requested a review from chenyang78 July 18, 2024 22:09
…130868)

Summary: Pull Request resolved: pytorch#130868

Differential Revision: D59833074
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D59833074

chenyang78 left a comment

LGTM. Thanks!

@henrylhtsang
Contributor Author

@pytorchbot merge

pytorch-bot bot added the ciflow/trunk label Jul 19, 2024
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team: raised by workflow job.

@henrylhtsang
Contributor Author

@pytorchbot label "topic: not user facing"

pytorch-bot bot added the topic: not user facing label Jul 19, 2024
@henrylhtsang
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job waited more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@henrylhtsang
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced debugging: check the merge workflow status here.

xuhancn pushed a commit to xuhancn/pytorch that referenced this pull request Jul 25, 2024
… cases (pytorch#130868)

Pull Request resolved: pytorch#130868
Approved by: https://github.com/drisspg, https://github.com/chenyang78, https://github.com/desertfire