Phi3 MoE cuda kernel #21819

Merged
wangyems merged 10 commits into main from wangye/phi3_moe on Aug 27, 2024

@wangyems
Contributor

Description

Motivation and Context

Comment thread onnxruntime/test/python/transformers/test_parity_phi3_moe.py Fixed
Comment thread onnxruntime/test/python/transformers/test_parity_phi3_moe.py Fixed
Comment thread onnxruntime/test/python/transformers/test_parity_phi3_moe.py Fixed
Comment thread onnxruntime/test/python/transformers/test_parity_phi3_moe.py Fixed
Comment thread onnxruntime/test/python/transformers/test_parity_phi3_moe.py Fixed
Comment thread docs/ContribOperators.md Outdated
Comment thread docs/ContribOperators.md
@wangyems wangyems marked this pull request as draft August 22, 2024 20:10
@wangyems wangyems marked this pull request as ready for review August 22, 2024 20:26
@wangyems wangyems requested a review from tianleiwu August 22, 2024 21:00
Comment thread onnxruntime/test/python/transformers/test_parity_phi3_moe.py Outdated
original gemm size causes out-of-SMEM for grouped gemm with Windows GPU pipeline
@tianleiwu
Contributor

tianleiwu commented Aug 23, 2024

Test failed in A10:
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1472103&view=logs&j=6df8fe70-7b8f-505a-8ef0-8bf93da2bac7&t=4f6ef737-111d-50d1-a46b-5f86d9a970bc&l=27022
ort_fastertransformer::generic_moe_gemm_kernelLauncher occupancy > 0 was false. GPU lacks the shared memory resources to run GroupedGEMM kernel

Maybe tune the GroupedGEMM parameters for different devices?
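
To illustrate the idea of per-device tuning, here is a minimal sketch, not the code in this PR. It assumes a simplified cost model in which each pipeline stage holds one A tile and one B tile in shared memory, and it picks the largest candidate tile configuration that fits a given shared-memory budget. `TileConfig`, `pick_config`, `CANDIDATES`, and the tile sizes are hypothetical names and values chosen for illustration; the real kernel's footprint and the actual per-block limit should be taken from the kernel and the device properties.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TileConfig:
    """Hypothetical GEMM tile shape; not an onnxruntime type."""
    tile_m: int
    tile_n: int
    tile_k: int
    stages: int

    def smem_bytes(self, bytes_per_element: int = 2) -> int:
        # Assumed model: one A tile (tile_m x tile_k) plus one B tile
        # (tile_k x tile_n) is staged in shared memory per pipeline stage.
        per_stage = (self.tile_m * self.tile_k + self.tile_k * self.tile_n) * bytes_per_element
        return per_stage * self.stages


def pick_config(
    candidates: Iterable[TileConfig],
    smem_limit_bytes: int,
    bytes_per_element: int = 2,
) -> Optional[TileConfig]:
    """Return the fitting candidate with the largest footprint, or None if none fit."""
    fitting = [c for c in candidates if c.smem_bytes(bytes_per_element) <= smem_limit_bytes]
    return max(fitting, key=lambda c: c.smem_bytes(bytes_per_element), default=None)


# Illustrative candidates only (fp16, 2 bytes per element).
CANDIDATES = [
    TileConfig(128, 128, 64, 4),  # 131072 bytes
    TileConfig(128, 128, 64, 3),  # 98304 bytes
    TileConfig(64, 128, 64, 3),   # 73728 bytes
    TileConfig(64, 64, 64, 2),    # 32768 bytes
]

if __name__ == "__main__":
    # The limits below are example budgets passed in by the caller,
    # not values queried from a real device.
    for limit_kib in (48, 99, 163):
        print(limit_kib, pick_config(CANDIDATES, limit_kib * 1024))
```

With a budget of 99 KiB the sketch falls back from the 4-stage to the 3-stage 128x128x64 configuration, which is the kind of adjustment the comment above suggests. A `None` result corresponds to the "occupancy > 0 was false" situation: no candidate fits, so the caller would need smaller tiles or a different code path.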

Comment on lines +1001 to +1002
# if platform.system() == "Windows":
# pytest.skip("Skip on Windows")

Check notice

Code scanning / CodeQL

Commented-out code

This comment appears to contain commented-out code.
@wangyems wangyems requested a review from tianleiwu August 26, 2024 17:00
@wangyems wangyems merged commit 1d059b8 into main Aug 27, 2024
@wangyems wangyems deleted the wangye/phi3_moe branch August 27, 2024 16:21
prathikr pushed a commit that referenced this pull request Aug 27, 2024
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

---------

Co-authored-by: Your Name <you@example.com>
3 participants