@qjia7
Contributor

@qjia7 qjia7 commented Oct 17, 2024

Description

Motivation and Context

@qjia7 qjia7 changed the title [WIP][webgpu-native] opt matmulnbits [webgpu-native] opt matmulnbits Oct 17, 2024
@qjia7 qjia7 marked this pull request as ready for review October 17, 2024 10:57
@qjia7
Contributor Author

qjia7 commented Oct 17, 2024

All native cases pass now. @guschmue @fs-eire Please take a look, thanks.

@qjia7 qjia7 requested a review from fs-eire October 21, 2024 01:50
@guschmue
Contributor

Also tested and did some perf comparison. It looks good on Xe: Phi3 on tlk went from 8.5 to 13.1 tokens/sec.

@fs-eire fs-eire merged commit d312b38 into microsoft:fs-eire/webgpu-ep Oct 21, 2024