Optimize w8a8 quantized matmul kernel #9412

Merged
yaochengji merged 7 commits into master from
xiowei/update_quantized_matmul_kernel
Jul 1, 2025

Conversation

@vanbasten23 (Collaborator) commented Jun 26, 2025

This PR:

  • Updates the block table.
  • Falls back to the XLA w8a8 quantized matmul if the block sizes are not found in the table.
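
The fallback behavior above can be sketched as a shape-keyed lookup with a reference path when no tuned block sizes exist. This is an illustrative sketch only: the table contents and the names `TUNED_BLOCK_SIZES`, `quantized_matmul`, and `quantized_matmul_reference` are hypothetical, not the actual torch_xla API.

```python
import numpy as np

# Illustrative tuning table: (m, k, n) -> (block_m, block_k, block_n).
# Entries here are made up for the sketch.
TUNED_BLOCK_SIZES = {
    (256, 512, 1024): (128, 256, 256),
}


def quantized_matmul_reference(x, w_int8, scale):
    # Dequantize-then-matmul reference path; stands in for the XLA fallback.
    return x @ (w_int8.astype(np.float32) * scale)


def quantized_matmul(x, w_int8, scale):
    key = (x.shape[0], x.shape[1], w_int8.shape[1])
    blocks = TUNED_BLOCK_SIZES.get(key)
    if blocks is None:
        # Block sizes not found: fall back to the reference implementation
        # instead of launching the Pallas kernel.
        return quantized_matmul_reference(x, w_int8, scale)
    # In the real kernel, `blocks` would parameterize the Pallas grid; the
    # sketch reuses the same math so it stays runnable.
    return quantized_matmul_reference(x, w_int8, scale)


x = np.ones((4, 8), dtype=np.float32)
w = np.full((8, 3), 2, dtype=np.int8)
out = quantized_matmul(x, w, 0.5)  # (4, 8, 3) is not in the table -> fallback
```
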

Test plan:

  • pytest pytorch/xla/test/test_quantized_matmul_pallas_kernel.py -s
  • python pytorch/xla/test/test_pallas.py -k test_quantized_matmul_int8

@vanbasten23 vanbasten23 marked this pull request as ready for review June 27, 2025 23:20

@yaochengji yaochengji left a comment


Thanks Xiongfei for your contribution!

Comment thread: test/test_pallas.py (Outdated)
Comment thread: torch_xla/experimental/custom_kernel.py
@vanbasten23 vanbasten23 requested a review from yaochengji June 28, 2025 02:05

@yaochengji yaochengji left a comment


LGTM, thanks for the contribution!

@vanbasten23 (Collaborator, Author)

The CI has been failing since before this PR (e.g. #9415), so the failures appear unrelated to this change.

@vanbasten23 (Collaborator, Author)

I also created an empty change (#9430), and the same CI check fails there as well.

@yaochengji yaochengji merged commit 4101ea5 into master Jul 1, 2025
40 of 42 checks passed