Add Int4PlainInt32Tensor by liangan1 · Pull Request #2845 · pytorch/ao

liangan1 · 2025-08-22T01:56:30Z

This PR enables Int4PlainInt32Tensor. The packing format name is "plain_int32".
Test case:
python test/quantization/quantize_/workflows/int4/test_int4_plain_int32_tensor.py

pytorch-bot · 2025-08-22T01:56:34Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2845

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 78f6bb2 with merge base 568c193:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2025-08-26T00:11:03Z

+    "int4_xpu_int_zp is referring to the format used by int4 weight-only quantization on XPU with int zero point, which is a groupwise quantization format."
+    INT4_XPU_INT_ZP = "int4_xpu_int_zp"


please don't include int4 and xpu in the name, can you name this in terms of how the quantized data is packed?

The int4 XPU weight is a plain-format tensor according to this doc: it just packs 2 int4 weight elements into a byte and then stores 4 such bytes (8 int4 values) as one int32. So I changed it to plain.

I see, we have plain that stores 2*int4 as int8, can you reuse it or would you need a new one? https://github.com/pytorch/ao/blob/main/torchao/quantization/quantize_/workflows/int4/int4_tensor.py
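To make the two layouts in this thread concrete, here is a minimal pure-Python sketch of packing 2 int4 values per byte and then 4 such bytes per int32. The nibble order (low nibble first) and little-endian byte order are assumptions for illustration only; the function names are hypothetical and this is not the torchao or oneDNN implementation.

```python
def pack_int4_to_bytes(values):
    """Pack unsigned 4-bit values (0..15) two per byte.

    Assumption: the first value of each pair goes into the low nibble.
    """
    if len(values) % 2:
        raise ValueError("need an even number of int4 values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("values must fit in 4 bits")
    return [values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2)]


def pack_bytes_to_int32(byte_values):
    """Combine every 4 bytes into one signed 32-bit word.

    Assumption: little-endian byte order inside each word.
    """
    if len(byte_values) % 4:
        raise ValueError("need a multiple of 4 bytes")
    return [
        int.from_bytes(bytes(byte_values[i:i + 4]), "little", signed=True)
        for i in range(0, len(byte_values), 4)
    ]


def unpack_int32_to_int4(words):
    """Inverse of the two steps above: recover 8 int4 values per word."""
    out = []
    for word in words:
        raw = word & 0xFFFFFFFF  # view the signed word as 32 raw bits
        for shift in range(0, 32, 4):
            out.append((raw >> shift) & 0xF)
    return out


int4_values = [1, 2, 3, 4, 5, 6, 7, 8]
as_bytes = pack_int4_to_bytes(int4_values)   # 4 bytes, 2 int4 each
as_words = pack_bytes_to_int32(as_bytes)     # 1 word, 8 int4 total
```

Under these assumptions a weight row of K int4 values occupies K/2 bytes in the int8-style layout and K/8 words in the int32-style layout; the bits are the same, only the storage dtype differs.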

@liangan1 can you use PLAIN_INT32 for packing_format, and rename things accordingly (tensor subclass, files etc.)

Thanks @jerryzh168. I have added PLAIN_INT32 to be used by the XPU int4 path. Per my understanding, the packing format should be a dispatch policy that selects the right tensor subclass, and a tensor subclass should cover a specific quantization recipe. So I suppose I should keep the current tensor name for int4 XPU.
In this PR, we just want to enable int4 on XPU with the int zero-point domain. The current oneDNN backend cannot yet support a float zero point as the CUDA/CPU backends do; that feature is WIP. I plan to reuse this packing format in the future and dispatch the tensor using the zero point domain information.

you can reuse the packing format and the tensor for float32 zero_point as well in the future I think, but today we structure tensor subclass by: dtype + packing_format, so Int4PlainInt32 might be better
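The "dtype + packing_format" structure discussed here can be pictured as a lookup keyed on that pair. The sketch below is only an illustration of the idea: the enum, the registry, and `select_tensor_class` are hypothetical names, and the class names are plain strings standing in for real tensor subclasses, not torchao's actual dispatch code.

```python
from enum import Enum


class PackingFormat(str, Enum):
    # Hypothetical stand-in for a packing-format enum.
    PLAIN = "plain"
    PLAIN_INT32 = "plain_int32"


# Hypothetical registry: (dtype name, packing format) -> tensor subclass name.
_TENSOR_REGISTRY = {
    ("int4", PackingFormat.PLAIN): "Int4Tensor",
    ("int4", PackingFormat.PLAIN_INT32): "Int4PlainInt32Tensor",
}


def select_tensor_class(dtype, packing_format):
    """Return the subclass name registered for this dtype and packing format."""
    key = (dtype, PackingFormat(packing_format))  # accepts the enum or its string value
    try:
        return _TENSOR_REGISTRY[key]
    except KeyError:
        raise ValueError(f"no tensor subclass registered for {key}") from None


chosen = select_tensor_class("int4", "plain_int32")
```

With this shape, adding a new layout means registering one more pair, and the subclass name can be read off mechanically from the dtype and the packing format.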

Done. change it to Int4PlainInt32. pls help to review again.

jerryzh168 · 2025-08-29T02:03:21Z

+    return Int4WeightOnlyConfig(
+        group_size=group_size,
+        packing_format="plain_int32",
+        zero_point_domain=ZeroPointDomain.INT,


nit: we don't need this anymore I think, also we want to remove ZeroPointDomain in the future

Removed. But I have a question: how does the user select the int zp domain if this param is gone?

we'll know how to quantize based on the type of tensor, so the user just needs to choose the packing_format
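For readers unfamiliar with the "int zero point" recipe that the tensor type implies, here is a generic sketch of groupwise asymmetric 4-bit quantization with an integer zero point. It is a textbook-style illustration in pure Python under stated assumptions (unsigned range 0..15, min/max calibration per group); the function names are hypothetical and the exact rounding and clamping in torchao may differ.

```python
def quantize_group(values):
    """Quantize one group to unsigned int4 with an integer zero point."""
    lo = min(min(values), 0.0)  # widen the range so 0.0 is representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 15 or 1.0  # 15 = 2**4 - 1; avoid a zero scale
    zero_point = max(0, min(15, round(-lo / scale)))
    q = [max(0, min(15, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_group(q, scale, zero_point):
    """Map int4 codes back to floats: x is approximately (q - zero_point) * scale."""
    return [(x - zero_point) * scale for x in q]


def quantize_groupwise(row, group_size):
    """Split a row into groups; each group gets its own scale and zero point."""
    if len(row) % group_size:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group(row[i:i + group_size])
        for i in range(0, len(row), group_size)
    ]


row = [-1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
groups = quantize_groupwise(row, group_size=4)
```

Each group carries one float scale and one integer zero point, so a smaller `group_size` tracks the local value range more closely at the cost of more metadata.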

jerryzh168 · 2025-08-29T23:25:18Z

please rebase, and also fix the CI error as well, need to skip the test when there is no xpu I think

maybe update the Summary to make sure the naming is correct as well
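One way to skip a test class when no XPU is present is a `unittest.skipIf` guard evaluated at import time. The sketch below is self-contained and also tolerates a missing `torch` install, which is an extra assumption not taken from this PR; the class name `ExampleXPUOnlyTest` and the helper `run_example` are hypothetical.

```python
import unittest

try:
    import torch

    # Older builds may lack the torch.xpu module, so guard the attribute too.
    HAS_XPU = hasattr(torch, "xpu") and torch.xpu.is_available()
except ImportError:
    torch = None
    HAS_XPU = False


@unittest.skipIf(not HAS_XPU, "XPU not available")
class ExampleXPUOnlyTest(unittest.TestCase):
    def test_tensor_lands_on_xpu(self):
        # Only runs when an XPU device is actually present.
        x = torch.ones(2, device="xpu")
        self.assertEqual(x.device.type, "xpu")


def run_example():
    """Run the example class and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleXPUOnlyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_example()
```

On a machine without an XPU the test is reported as skipped rather than failed, which is what keeps CI green on CPU-only and CUDA-only runners.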

liangan1 · 2025-09-01T02:20:21Z

cc @xiaowangintel

liangan1 · 2025-09-01T05:09:57Z

Done. @jerryzh168 pls help review again.

jerryzh168 · 2025-09-02T17:45:07Z

+
+@unittest.skipIf(not torch_version_at_least("2.8.0"), "Need pytorch 2.8+")
+@unittest.skipIf(not torch.xpu.is_available(), "XPU not available")
+class Int4PlainInt32Tensor(TestCase):


we probably need more tests like serialization etc. but can add these later

Ok. we are working on the XPU CI enabling in other PRs. Pls refer to #2917

meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 22, 2025

liangan1 added 3 commits August 22, 2025 09:55

Add Int4XPUTensorIntZP

ec3e065

Add int4_xpu_tensor

1dc5b2c

Update int4_xpu_tensor.py

e63b100

liangan1 requested review from andrewor14 and jerryzh168 August 25, 2025 07:55

liangan1 added the topic: new feature Use this tag if this PR adds a new feature label Aug 25, 2025

liangan1 changed the title ~~[WIP]Add Int4XPUTensorIntZP~~ Add Int4XPUTensorIntZP Aug 25, 2025

liangan1 added 4 commits August 25, 2025 15:55

Fix typo

5ef1ca2

Fix code format issue

a28dd89

fix bug

8a0f124

Fix code format

a0ff36f

jerryzh168 reviewed Aug 26, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_xpu_tensor.py Outdated

liangan1 added 5 commits August 26, 2025 09:46

Merge branch 'main' into liangan1/int4_xpu_int_zp

5e9c476

Update int4_xpu_tensor.py

2c4c2ce

change the pack format to plain

e48ea0b

fix typo

c4e5b9d

Update quant_api.py

7063e56

jerryzh168 reviewed Aug 28, 2025

Comment thread torchao/quantization/quantize_/workflows/__init__.py Outdated

jerryzh168 reviewed Aug 28, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_xpu_tensor.py

liangan1 added 4 commits August 28, 2025 07:31

merge main branch

5b87d8b

Merge branch 'main' into liangan1/int4_xpu_int_zp

6076877

Update __init__.py

8d2acd2

Update __init__.py

43acd66

jerryzh168 reviewed Aug 29, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_plain_int32_tensor.py Outdated

jerryzh168 reviewed Aug 29, 2025

Comment thread torchao/quantization/__init__.py Outdated

jerryzh168 reviewed Aug 29, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_plain_int32_tensor.py Outdated

jerryzh168 reviewed Aug 29, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_plain_int32_tensor.py Outdated

liangan1 requested a review from jerryzh168 August 29, 2025 02:05

liangan1 changed the title ~~Add Int4XPUTensorIntZP~~ Add Int4PlainInt32 Tensor Aug 29, 2025

jerryzh168 reviewed Aug 29, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_plain_int32_tensor.py Outdated

liangan1 requested a review from jerryzh168 August 29, 2025 06:09

liangan1 added 4 commits August 29, 2025 09:45

change Int4XPUTensorIntZP to Int4PlainInt32

a047c00

Update __init__.py

3f70b2b

Refine code

402dd72

Refine code

282f1a8

jerryzh168 reviewed Aug 29, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_plain_int32_tensor.py

jerryzh168 mentioned this pull request Aug 29, 2025

Migrating from AffineQuantizedTensor + Layouts to new structure of tensor subclasses #2752

Closed

17 tasks

liangan1 added 2 commits September 1, 2025 09:14

Update __init__.py

cd781fc

Update __init__.py

afadf69

liangan1 requested a review from jerryzh168 September 1, 2025 05:10

liangan1 changed the title ~~Add Int4PlainInt32 Tensor~~ Add Int4PlainInt32Tensor Sep 1, 2025

liangan1 added 6 commits September 1, 2025 09:13

Add more comments about the original weight dtype

b68beef

Merge branch 'main' into liangan1/int4_xpu_int_zp

66e05ff

fix code format issue

105b4b9

fix code format issue

b24ff1a

skip ut if no xpu

77868bc

Update test_int4_plain_int32_tensor.py

970aa17

jerryzh168 reviewed Sep 2, 2025

Comment thread torchao/quantization/quantize_/workflows/int4/int4_plain_int32_tensor.py

jerryzh168 reviewed Sep 2, 2025

jerryzh168 approved these changes Sep 2, 2025

jerryzh168 merged commit 9d01b43 into pytorch:main Sep 4, 2025
18 checks passed

Add assert for the original weight data type

78f6bb2

jerryzh168 mentioned this pull request Sep 11, 2025

Add torchao_convert to PARQ's QuantOptimizer #2947

Merged
