
Remove the clamp op when we do symmetric quantization on a tensor #9465

Merged
vanbasten23 merged 3 commits into master from xiowei/optimize_quantized_matmul on Jul 10, 2025

Conversation

@vanbasten23 (Collaborator) commented Jul 9, 2025

Originally, when we do symmetric quantization, we do:

scale = max(abs(x)) / 127
q_x = clamp(round(x / scale), -128, 127)

Expanding the division:

x / scale = x / (max(abs(x)) / 127) = (x / max(abs(x))) * 127

Mathematically, x / max(abs(x)) always lies in the range [-1, 1], so x / scale always lies in the range [-127, 127]. Therefore, the torch.clamp can be skipped.
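The reasoning above can be sketched in a few lines of PyTorch. This is an illustrative sketch of clamp-free symmetric int8 quantization, not the PR's actual implementation; the function name and shapes are made up for the example.

```python
import torch

def symmetric_quantize(x: torch.Tensor):
    # Hypothetical sketch: symmetric per-tensor int8 quantization.
    scale = x.abs().max() / 127
    # x / scale is guaranteed to lie in [-127, 127], so rounding stays
    # within the int8 range and no torch.clamp is needed.
    q_x = torch.round(x / scale).to(torch.int8)
    return q_x, scale

x = torch.randn(4, 8)
q_x, scale = symmetric_quantize(x)
```

Note that the element with the largest magnitude maps to exactly +/-127 before rounding, so the result never reaches -128; the clamp in the original code was strictly redundant.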

Idea credit to Kyuyeun Kim @kyuyeunk

Test plan:

  • python pytorch/xla/test/quantized_ops/test_quantized_matmul.py
  • pytest pytorch/xla/test/test_quantized_matmul_pallas_kernel.py -s

@vanbasten23 vanbasten23 requested a review from lsy323 July 9, 2025 22:20
@vanbasten23 vanbasten23 force-pushed the xiowei/optimize_quantized_matmul branch from 373f791 to 69fd267 Compare July 9, 2025 22:22
A review comment thread was opened on test/quantized_ops/test_quantized_matmul.py
@vanbasten23 vanbasten23 marked this pull request as ready for review July 9, 2025 22:25
@lsy323 (Collaborator) commented Jul 9, 2025

Looks like the tolerance of the numerical error is relaxed in a few tests. Does this imply the new impl generates a bigger error?

@vanbasten23 (Collaborator, Author) replied to the question above:

Good question. I explained here: #9465 (comment). I first ran the tests without any changes on v5e, and the test test/quantized_ops/test_quantized_matmul.py failed. So I made the necessary changes to make the test pass. Then I made the non-test code changes.

@vanbasten23 vanbasten23 force-pushed the xiowei/optimize_quantized_matmul branch from 69fd267 to 268b19b Compare July 10, 2025 00:01
@vanbasten23 (Collaborator, Author) commented:

Thanks for the review!

@vanbasten23 vanbasten23 merged commit cf156c6 into master Jul 10, 2025
23 of 24 checks passed
