Add all fbgemm kernel Tensors into Int4WeightOnlyConfig and Float8DynamicActivationInt4WeightConfig by jerryzh168 · Pull Request #2474 · pytorch/ao

jerryzh168 · 2025-07-02T01:58:39Z

Stacked PRs:

Support optional_tensor_names in TorchAOBaseTensor #2710
Align Int4Tensor implementation details with the design of Float8Tensor #2687
->Add all fbgemm kernel Tensors into Int4WeightOnlyConfig and Float8DynamicActivationInt4WeightConfig #2474
Check numerical equivalence / closeness between different kernel preferences #2651

Add all fbgemm kernel Tensors into Int4WeightOnlyConfig and Float8DynamicActivationInt4WeightConfig

Summary:
We will:

* deprecate FbgemmConfig (later), since it only covers a single kernel.
* categorize things by derived dtype + packing format, e.g. int4 preshuffled, float8 plain.
* add PackingFormat, with preshuffled and plain options, in Version 2 of Int4WeightOnlyConfig; the older AQT tensor will remain in Version 1.
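
A minimal sketch of the categorization described in the summary, written with stand-in classes rather than the actual torchao code. `PackingFormat`, `Int4WeightOnlyConfigSketch`, and `select_tensor_kind` are hypothetical names used only for illustration; the ideas they model (a packing format with plain and preshuffled values, Version 2 using the new categorization, Version 1 staying on the AQT tensor) come from the summary.

```python
from dataclasses import dataclass
from enum import Enum


class PackingFormat(str, Enum):
    """Hypothetical stand-in for the packing formats named in the summary."""

    PLAIN = "plain"
    PRESHUFFLED = "preshuffled"


@dataclass
class Int4WeightOnlyConfigSketch:
    """Illustrative config only; this is not the torchao class."""

    packing_format: PackingFormat
    group_size: int = 128
    version: int = 2


def select_tensor_kind(config: Int4WeightOnlyConfigSketch) -> str:
    """Return a label for the tensor path a config would take in this sketch."""
    if config.version == 1:
        # Version 1 keeps the older AQT-based tensor.
        return "aqt"
    if config.version == 2:
        # Version 2 is described as derived dtype + packing format.
        return f"int4 {config.packing_format.value}"
    raise ValueError(f"unsupported version: {config.version}")
```

Under these assumptions, a Version 2 config with `PackingFormat.PRESHUFFLED` maps to the "int4 preshuffled" category, while any Version 1 config stays on the AQT path regardless of packing format.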

Test Plan:
python test/quantization/quantize_/workflows/int4/test_int4_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py
python test/quantization/quantize_/workflows/float8/test_float8_tensor.py


pytorch-bot · 2025-07-02T01:58:43Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2474

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 5bb2fd4 with merge base 1114ca0:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…micActivationInt4WeightConfig

Summary: att, we will deprecate FbgemmConfig since it's a single kernel. We'd like to categorize things by derived dtype + packed format.

Test Plan: python test/quantization/quantize_/test_int4_groupwise_preshuffle.py

stack-info: PR: #2474, branch: jerryzh168/stack/10


andrewor14 · 2025-07-09T22:13:55Z

+class Float8ActivationInt4WeightConfig(AOBaseConfig):
+    group_size: int = 128
+    use_preshuffle: bool = False
+    kernel: str = "fbgemm"


Use GemmKernelChoice here?

also add a docstring?

Updated the structure after some discussions; please take a look again.

andrewor14 · 2025-07-09T22:17:52Z

@@ -44,6 +44,7 @@
 from .quant_api import (
    CutlassInt4PackedLayout,
    FbgemmConfig,


What's the plan for FbgemmConfig? Looks like it was added only ~1.5 months ago but it's technically public API. Do we know if anyone's using it already? I don't think it's released yet so wonder if it's OK to just remove it?

We'll remove it. It is used in some internal scripts, but we'll update those as well.

andrewor14

Looks great, left some comments mostly about documentation

andrewor14 · 2025-07-15T20:13:05Z

    preserve_zero: Optional[bool] = None
+    # since not all tensors are migrated to the new structure yet,
+    # we use `_legacy` to represent the previous layout
+    packing_format: PackingFormat = "_legacy"


Can "legacy" mean different things for different configs? I wonder if we should make this optional instead, where None represents "legacy"?

Yeah, legacy just means no packing format (it's implemented with AQT). I plan to remove support for legacy at some point and don't want to complicate the typing here.
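
The two options discussed in this thread can be sketched side by side. The names below (`PackingFormat` and its members, `is_legacy_sentinel`, `is_legacy_optional`) are hypothetical and only illustrate the typing trade-off; they are not the torchao definitions.

```python
from enum import Enum
from typing import Optional


class PackingFormat(str, Enum):
    """Hypothetical enum; LEGACY is a sentinel for the AQT-backed path."""

    PLAIN = "plain"
    PRESHUFFLED = "preshuffled"
    LEGACY = "_legacy"


def is_legacy_sentinel(packing_format: PackingFormat) -> bool:
    # Approach in the quoted diff: a dedicated "_legacy" value marks the
    # legacy path, so the field type stays a plain PackingFormat.
    return packing_format is PackingFormat.LEGACY


def is_legacy_optional(packing_format: Optional[PackingFormat]) -> bool:
    # Approach suggested in review: None marks the legacy path, which
    # widens the field type to Optional[PackingFormat].
    return packing_format is None
```

In this sketch, dropping legacy support later would mean deleting one enum member in the first approach, whereas the second would require narrowing every `Optional[PackingFormat]` annotation back to `PackingFormat`.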

jerryzh168 force-pushed the jerryzh168/stack/10 branch from a3d0835 to 4b0c7c7 Compare July 2, 2025 01:58

This was referenced Jul 2, 2025

Add support for Int4GroupwisePreshuffleTensor for fbgemm #2421

Merged

Remove transpose_input from fbgemm configs #2422

Merged

Add support for float8 activation for Int4PreshuffledTensor #2437

Merged

Add Float8Tensor #2463

Merged

facebook-github-bot added the CLA Signed label Jul 2, 2025

jerryzh168 added the topic: new feature label Jul 2, 2025

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 2, 2025 20:35

jerryzh168 force-pushed the jerryzh168/stack/10 branch from 4b0c7c7 to f5977ce Compare July 2, 2025 20:36

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 2, 2025 20:36

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 2, 2025 21:42

jerryzh168 force-pushed the jerryzh168/stack/10 branch 2 times, most recently from 04ce2c5 to afd8703 Compare July 2, 2025 21:42

jerryzh168 mentioned this pull request Jul 2, 2025

Rename torchao.float8.Float8Tensor to torchao.float8.Float8TrainingTensor #2479

Merged

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 2, 2025 21:42

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 2, 2025 23:44

jerryzh168 force-pushed the jerryzh168/stack/10 branch from afd8703 to ff4682e Compare July 2, 2025 23:44

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 2, 2025 23:44

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 3, 2025 00:09

jerryzh168 force-pushed the jerryzh168/stack/10 branch from ff4682e to 58f8a2a Compare July 3, 2025 00:09

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 3, 2025 00:09

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 3, 2025 02:18

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 3, 2025 21:57

jerryzh168 force-pushed the jerryzh168/stack/10 branch from acc33bd to 7412903 Compare July 3, 2025 21:57

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 3, 2025 21:57

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 7, 2025 18:47

jerryzh168 force-pushed the jerryzh168/stack/10 branch from 7412903 to 867c75a Compare July 7, 2025 18:47

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 7, 2025 18:47

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 7, 2025 19:52

jerryzh168 force-pushed the jerryzh168/stack/10 branch from 867c75a to 382bb8a Compare July 7, 2025 19:52

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 7, 2025 19:52

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 7, 2025 19:57

jerryzh168 force-pushed the jerryzh168/stack/10 branch from 382bb8a to 8a6dcc4 Compare July 7, 2025 19:57

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 7, 2025 19:57

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 7, 2025 22:39

jerryzh168 force-pushed the jerryzh168/stack/10 branch from 8a6dcc4 to 5bf77b1 Compare July 7, 2025 22:39

jerryzh168 changed the base branch from main to jerryzh168/stack/9 July 7, 2025 22:39

jerryzh168 changed the base branch from jerryzh168/stack/9 to main July 7, 2025 23:14

jerryzh168 force-pushed the jerryzh168/stack/10 branch from 5bf77b1 to 6cb63be Compare July 7, 2025 23:14

andrewor14 reviewed Jul 9, 2025


andrewor14 approved these changes Jul 15, 2025


namgyu-youn mentioned this pull request Aug 10, 2025

replace FbgemmConfig with Int4WeightOnlyConfig #2727

Closed
