enable smoothquant for int8 static tensor by jcaip · Pull Request #3468 · pytorch/ao

jcaip · 2025-12-08T21:21:17Z

This PR hooks up the static quant workflow added in #3442 to the prototype smoothquant API.

You can use the new flow as follows:

from torchao.quantization.quant_api import (
    Int8StaticActivationInt8WeightConfig,
)
from torchao.prototype.smoothquant import (
    SmoothQuantConfig,
)

# NOTE: quantize_, PerRow, and SmoothQuantStep must also be imported;
# their import lines are not shown in this description.

config = SmoothQuantConfig(
    base_config=Int8StaticActivationInt8WeightConfig(granularity=PerRow()),
    step=SmoothQuantStep.PREPARE,
    alpha=0.5,
)

# PREPARE step: set the model up for calibration.
quantize_(model, config)

# Perform calibration with test data.
model(*x)

# CONVERT step: reuse the same config with the step switched.
config.step = SmoothQuantStep.CONVERT
quantize_(model, config)

# The model is now statically quantized, using the inputs seen by the
# SmoothQuant observer during calibration.
model(*x)
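For readers unfamiliar with the `alpha` argument, here is a minimal sketch of what it controls, assuming the standard SmoothQuant formulation (per-channel factor `s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)`). The helper names and the toy statistics are illustrative assumptions, not torchao code, and torchao's implementation may differ in detail.

```python
def smoothing_factors(act_absmax, weight_absmax, alpha=0.5, eps=1e-5):
    """Per-input-channel factors: s_j = a_j**alpha / w_j**(1 - alpha)."""
    return [
        max(a, eps) ** alpha / max(w, eps) ** (1.0 - alpha)
        for a, w in zip(act_absmax, weight_absmax)
    ]


def int8_scale(absmax):
    """Symmetric int8 scale mapping [-absmax, absmax] onto [-127, 127]."""
    return absmax / 127.0


# Toy calibration statistics for 3 input channels (illustrative values only).
act_absmax = [16.0, 1.0, 4.0]   # channel 0 is an activation outlier
weight_absmax = [1.0, 1.0, 1.0]

s = smoothing_factors(act_absmax, weight_absmax, alpha=0.5)

# Activations are divided by s and weights multiplied by s, so the
# matmul result is unchanged while the activation outlier shrinks.
smoothed_act = [a / f for a, f in zip(act_absmax, s)]
smoothed_weight = [w * f for w, f in zip(weight_absmax, s)]

# A static activation scale is fixed once from calibration statistics
# instead of being recomputed for every input at inference time.
static_act_scale = int8_scale(max(smoothed_act))
```

With `alpha=0.5` the quantization difficulty is split evenly between activations and weights; values closer to 1 move more of it onto the weights.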

Test Plan:

pytest test/quantization/quantize_/workflows/int8/test_int8_tensor.py
pytest test/prototype/test_smoothquant.py

Summary: This PR creates a new Int8Tensor and updates the configs to use the new Int8Tensor flow.

Test Plan:

To ensure BC:
```
pytest test/quantization/test_quant_api.py
```

To test new Int8Tensor:
```
pytest test/quantization/quantize_/workflows/int8/test_int8_tensor.py
```

jerryzh168 · 2025-12-18T00:02:15Z

    else:
        raise ValueError(f"Unexpected step: {step}")

+    if isinstance(base_config, Int8StaticActivationInt8WeightConfig):


I don't think we should check for a specific config here. Maybe change this to a protocol for configs, similar to SupportsActivationPreScaling?

I think figuring out how to do this generally will need a bit more design: we'd need to figure out how to map to the appropriate QuantizeTensorToInt/FloatXKwargs object. I agree we should be able to do this, but can I address it in a later PR?

I think we can get this info from a callback, like base_config.get_activation_quant_kwargs(). I feel you can iterate from that, but not from the current state.
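The callback idea from this thread could look roughly like the sketch below. Only the method name `get_activation_quant_kwargs` comes from the comment above; every class here (`ToyActQuantKwargs`, `ProvidesActivationQuantKwargs`, `ToyStaticInt8Config`, `ToyWeightOnlyConfig`) is a hypothetical stand-in, not a torchao type.

```python
from dataclasses import dataclass
from typing import Any, Optional, Protocol, runtime_checkable


@dataclass
class ToyActQuantKwargs:
    """Hypothetical stand-in for a QuantizeTensorTo...Kwargs object."""
    granularity: str
    static_scale: Optional[float] = None


@runtime_checkable
class ProvidesActivationQuantKwargs(Protocol):
    """Hypothetical protocol: configs that describe their activation quantization."""

    def get_activation_quant_kwargs(self) -> Any: ...


@dataclass
class ToyStaticInt8Config:
    """Hypothetical config that implements the callback."""
    granularity: str = "per_row"

    def get_activation_quant_kwargs(self) -> ToyActQuantKwargs:
        return ToyActQuantKwargs(granularity=self.granularity)


@dataclass
class ToyWeightOnlyConfig:
    """Hypothetical config with no activation quantization, so no callback."""
    group_size: int = 32


def activation_kwargs_for(base_config):
    # Dispatch on the capability rather than on a concrete config class.
    if isinstance(base_config, ProvidesActivationQuantKwargs):
        return base_config.get_activation_quant_kwargs()
    return None
```

The caller then never names a concrete config type, which is the property the reviewer asks for.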

jerryzh168 · 2025-12-19T01:39:52Z

            block_size,
            self.dtype,
            act_quant_kwargs=self.act_quant_kwargs,
+            act_scale=self.act_scale,


I guess slice didn't work for static int8 quant before. Can you add a test for that?

jerryzh168 · 2025-12-19T01:40:45Z

        old_int8_tensor.scale[index],
        old_int8_tensor.block_size[1:],
        old_int8_tensor.dtype,
+        old_int8_tensor.act_scale,


Same for this one; it seems the select op was broken before with static quant.

Slice and select work when the granularity is PerDim(=-1), but not otherwise. We now throw an exception when PerDim != -1 is passed, and I added a test for both.
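The two diffs above pass `act_scale` through when slice and select rebuild the tensor. A toy sketch of that idea follows, using a hypothetical `ToyInt8Tensor` with plain Python lists and one weight scale per row. It is not torchao's `Int8Tensor` and only illustrates why the static activation scale has to travel with the sliced data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyInt8Tensor:
    """Hypothetical row-wise quantized 2D tensor (not torchao's Int8Tensor)."""
    qdata: List[List[int]]       # int8 values, shape (rows, cols)
    scale: List[float]           # one weight scale per row
    act_scale: Optional[float]   # static activation scale, None if dynamic

    def slice_rows(self, start: int, end: int) -> "ToyInt8Tensor":
        # Slice the data and the per-row scales together, and carry the
        # static activation scale over unchanged. Dropping it here would
        # silently turn a statically quantized tensor into a dynamic one.
        return ToyInt8Tensor(
            self.qdata[start:end],
            self.scale[start:end],
            self.act_scale,
        )

    def dequantize(self) -> List[List[float]]:
        return [[q * s for q in row] for row, s in zip(self.qdata, self.scale)]


t = ToyInt8Tensor(
    qdata=[[1, 2], [3, 4], [5, 6]],
    scale=[0.5, 0.25, 2.0],
    act_scale=0.1,
)
sub = t.slice_rows(1, 3)
```

Here `sub` keeps both the matching row scales and the original `act_scale`, so it dequantizes to the same values as the corresponding rows of `t`.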

jerryzh168 · 2026-01-20T21:16:20Z

+    if isinstance(base_config, Int8StaticActivationInt8WeightConfig):
+        base_config.static_scale = activation_scale


We can do the same as SupportsActivationPreScaling here, e.g. with something like IsStaticQuantizationConfig.
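One way to read this suggestion is a marker protocol that replaces the `isinstance` check on the concrete config. The sketch below takes the name `IsStaticQuantizationConfig` and the attribute `static_scale` from this thread; the protocol definition, the toy config classes, and `attach_activation_scale` are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class IsStaticQuantizationConfig(Protocol):
    """Name from the review comment; this definition is a hypothetical sketch."""
    static_scale: Optional[float]


@dataclass
class ToyStaticConfig:
    """Hypothetical config that stores a calibrated activation scale."""
    static_scale: Optional[float] = None


@dataclass
class ToyDynamicConfig:
    """Hypothetical config without a static scale."""
    granularity: str = "per_row"


def attach_activation_scale(base_config, activation_scale: float) -> bool:
    """Set the calibrated scale on any config that declares static_scale."""
    if isinstance(base_config, IsStaticQuantizationConfig):
        base_config.static_scale = activation_scale
        return True
    return False
```

A new static config would then opt in by exposing `static_scale`, without the SmoothQuant code needing to know its class.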

jerryzh168

LGTM, thanks!

* Int8Tensor migration (summary and test plan as quoted earlier in this PR)
* ruff fixes
* add init
* fix ruff again
* update
* wip
* undo update tests
* fix ruff
* fix varname
* fix typing
* add tests
* fix dtype
* fix ci
* address granularity cr
* update _choose_quant_func_and_quantize_tensor
* make block size required attribute
* made dtype required as well
* address nits
* skip per tensor weight only test for now
* add static quant
* add static quant
* update
* static quant working eager + compile
* remove file
* added asserts
* undo smoothquant change
* fix return
* got smoothquant + int8 static working
* generalized smoothquat code
* free tests
* fix static scale check
* update
* address cr feedback
* Hook up static quant workflow to prototype smoothquant API (description and example identical to the PR description at the top)
* fix ruff
* fix test to use threshold for sqnr

jcaip added 30 commits December 1, 2025 12:55

ruff fixes

0b73aed

add init

1e49945

fix ruff again

669b6ee

update

9071526

wip

1539e0f

Merge branch 'main' into jcaip/int8-tensor

d9a2b1b

undo update tests

673f228

fix ruff

739fd64

fix varname

750db1a

fix typing

9410488

add tests

45a3a76

fix dtype

4e2f09c

fix ci

dd80cca

address granularity cr

7f73062

update _choose_quant_func_and_quantize_tensor

ac6a2b6

make block size required attribute

f28df4a

made dtype required as well

328585e

address nits

ce4d568

skip per tensor weight only test for now

a665d45

add static quant

0338016

add static quant

ee39691

update

9eb0aa9

static quant working eager + compile

d4a1514

remove file

3cdea56

added asserts

fa9022d

undo smoothquant change

8ce5cde

fix return

6f64121

Merge branch 'main' into jcaip/static-quant-rebased

8ae921d

got smoothquant + int8 static working

5b9e243

jerryzh168 reviewed Dec 18, 2025

Comment thread test/quantization/quantize_/workflows/int8/test_int8_tensor.py Outdated

jerryzh168 reviewed Dec 18, 2025

jcaip force-pushed the jcaip/enable-smoothquant branch from f389a94 to 2586ab6 Compare December 18, 2025 00:02

jerryzh168 reviewed Dec 19, 2025

cyxlily mentioned this pull request Dec 22, 2025

Separately control the activation quantization granularity #3524

Merged

cyxlily reviewed Dec 23, 2025

Comment thread torchao/quantization/quantize_/workflows/int8/int8_tensor.py Outdated

jcaip force-pushed the jcaip/enable-smoothquant branch from ea9b8e2 to c3abc46 Compare January 20, 2026 20:45

jcaip force-pushed the jcaip/enable-smoothquant branch from c3abc46 to 7b45e3e Compare January 20, 2026 20:53

Merge branch 'main' into jcaip/enable-smoothquant

973f396

jcaip requested a review from jerryzh168 January 20, 2026 21:06

jerryzh168 reviewed Jan 20, 2026

Comment thread test/quantization/quantize_/workflows/int8/test_int8_tensor.py

jerryzh168 reviewed Jan 20, 2026

Comment thread torchao/quantization/quant_api.py Outdated

jerryzh168 reviewed Jan 20, 2026

Comment thread torchao/quantization/quant_api.py Outdated

fix ruff

2f05d2c

jcaip force-pushed the jcaip/enable-smoothquant branch from 8764cbe to 2f05d2c Compare January 20, 2026 22:23

fix test to use threshold for sqnr

1bc5027

namgyu-youn mentioned this pull request Jan 21, 2026

Remove external wrapper in SmoothQuant #3591

Closed

jerryzh168 approved these changes Jan 21, 2026

jcaip enabled auto-merge (squash) January 21, 2026 17:30

jcaip disabled auto-merge January 21, 2026 17:30

jcaip enabled auto-merge (squash) January 21, 2026 17:32

jcaip disabled auto-merge January 21, 2026 17:53

jcaip closed this Jan 21, 2026

jcaip reopened this Jan 21, 2026

jcaip merged commit c6bc74c into main Jan 21, 2026
25 of 39 checks passed

jcaip merged 41 commits into main from jcaip/enable-smoothquant
