
Remove the clamp op when we do symmetric quantization on a tensor #9465

Merged
vanbasten23 merged 3 commits into master from xiowei/optimize_quantized_matmul on Jul 10, 2025

Conversation

@vanbasten23 (Collaborator) commented Jul 9, 2025

Originally, when we do symmetric quantization, we do:

scale = max(abs(x)) / 127
q_x = clamp(round(x / scale), -128, 127)

Expanding the division:

x / scale = x / (max(abs(x)) / 127) = (x / max(abs(x))) * 127

Mathematically, x / max(abs(x)) always lies in the range [-1, 1], so x / scale always lies in the range [-127, 127]. Therefore, the torch.clamp can be skipped.
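The reasoning above can be sketched in a few lines of PyTorch. This is an illustrative sketch of clamp-free symmetric int8 quantization, not the PR's actual implementation; the function name and shapes are made up for the example.

```python
import torch

def symmetric_quantize(x: torch.Tensor):
    # Hypothetical sketch: symmetric per-tensor int8 quantization.
    scale = x.abs().max() / 127
    # x / scale is guaranteed to lie in [-127, 127], so rounding stays
    # within the int8 range and no torch.clamp is needed.
    q_x = torch.round(x / scale).to(torch.int8)
    return q_x, scale

x = torch.randn(4, 8)
q_x, scale = symmetric_quantize(x)
```

Note that the element with the largest magnitude maps to exactly +/-127 before rounding, so the result never reaches -128; the clamp in the original code was strictly redundant.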

Idea credit to Kyuyeun Kim @kyuyeunk

Test plan:

  • python pytorch/xla/test/quantized_ops/test_quantized_matmul.py
  • pytest pytorch/xla/test/test_quantized_matmul_pallas_kernel.py -s

@vanbasten23 vanbasten23 requested a review from lsy323 July 9, 2025 22:20
@vanbasten23 vanbasten23 force-pushed the xiowei/optimize_quantized_matmul branch from 373f791 to 69fd267 Compare July 9, 2025 22:22
A review comment thread was opened on test/quantized_ops/test_quantized_matmul.py
@vanbasten23 vanbasten23 marked this pull request as ready for review July 9, 2025 22:25
@lsy323 (Collaborator) commented Jul 9, 2025

Looks like the tolerance of the numerical error is relaxed in a few tests. Does this imply the new impl generates a bigger error?

@vanbasten23 (Collaborator, Author) replied to the question above:

Good question. I explained here: #9465 (comment). I first ran the tests without any changes on v5e, and the test test/quantized_ops/test_quantized_matmul.py failed. So I made the necessary changes to make the test pass. Then I made the non-test code changes.

@vanbasten23 vanbasten23 force-pushed the xiowei/optimize_quantized_matmul branch from 69fd267 to 268b19b Compare July 10, 2025 00:01
@vanbasten23 (Collaborator, Author) commented:

Thanks for the review!

@vanbasten23 vanbasten23 merged commit cf156c6 into master Jul 10, 2025
23 of 24 checks passed
