Add module swap quantization API from Quanty by andrewor14 · Pull Request #1886 · pytorch/ao

andrewor14 · 2025-03-13T20:27:27Z

Summary: This commit adds a module-swap-based PTQ API from Quanty, including:

- Quantized linear and embedding modules
- `IntQuantizer` to specify how to quantize weights and activations
- `CodeBookQuantizer` as an alternative to `IntQuantizer`
- Implementation of K-means to be used for codebook quantization
- Range setting and data getter utility
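
To illustrate the codebook idea mentioned in the list above, here is a minimal, framework-free sketch of K-means-based codebook quantization on scalar weights. It is not the `CodeBookQuantizer` or K-means implementation from this PR: the function names (`kmeans_1d`, `codebook_quantize`, `codebook_dequantize`), the scalar (rather than vector) codebook, and the sample weights are all assumptions made for illustration.

```python
import random


def kmeans_1d(values, num_centroids, num_iters=20, seed=0):
    """Plain Lloyd's k-means on scalars; returns the sorted centroids (the codebook)."""
    rng = random.Random(seed)
    # Initialize centroids from distinct positions in the data.
    centroids = rng.sample(values, num_centroids)
    for _ in range(num_iters):
        buckets = [[] for _ in centroids]
        for v in values:
            # Assign each value to its nearest centroid.
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            buckets[idx].append(v)
        # Move each centroid to the mean of its bucket; keep it if the bucket is empty.
        centroids = [sum(b) / len(b) if b else c for b, c in zip(buckets, centroids)]
    return sorted(centroids)


def codebook_quantize(values, codebook):
    """Replace each value by the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda i: abs(v - codebook[i])) for v in values]


def codebook_dequantize(codes, codebook):
    """Look each index back up in the codebook."""
    return [codebook[c] for c in codes]


# Hypothetical weights; 4 centroids means each weight is stored as a 2-bit index.
weights = [-0.9, -0.85, -0.1, 0.0, 0.05, 0.8, 0.95, 1.0]
codebook = kmeans_1d(weights, num_centroids=4)
codes = codebook_quantize(weights, codebook)
restored = codebook_dequantize(codes, codebook)
```

In contrast to an integer quantizer, which maps values onto a uniform grid, the codebook entries here are learned from the data, so they can cluster wherever the weights are dense.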

These new APIs will complement our existing quantize_ API, which is primarily used for tensor-subclass-based quantization today (though it can also support module swaps). All APIs introduced in this commit are under prototype and highly subject to change. In particular, we plan to delete quantize_module_swap and QuantizationRecipe, and instead integrate this flow with the quantize_ API by creating a new AOBaseConfig.
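
For readers unfamiliar with the module-swap approach, the sketch below shows the general technique in plain PyTorch: walk the module tree and replace each `nn.Linear` with a wrapper that stores integer weights. It does not use the prototype API from this PR; `FakeQuantLinear` and `swap_linear_modules` are hypothetical names, and the per-output-channel symmetric int8 scheme is an assumption chosen for brevity.

```python
import torch
import torch.nn as nn


class FakeQuantLinear(nn.Module):
    """Hypothetical stand-in for a quantized linear module (not the torchao class)."""

    def __init__(self, linear: nn.Linear, num_bits: int = 8):
        super().__init__()
        assert 2 <= num_bits <= 8, "weights are stored as int8 in this sketch"
        qmax = 2 ** (num_bits - 1) - 1
        weight = linear.weight.detach()
        # Symmetric scale per output channel (one scale per weight row).
        scale = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
        q = torch.clamp(torch.round(weight / scale), -qmax, qmax)
        self.register_buffer("int_weight", q.to(torch.int8))
        self.register_buffer("scale", scale)
        self.bias = linear.bias

    def forward(self, x):
        # Dequantize on the fly, then run an ordinary linear op.
        weight = self.int_weight.to(x.dtype) * self.scale
        return nn.functional.linear(x, weight, self.bias)


def swap_linear_modules(model: nn.Module, num_bits: int = 8) -> nn.Module:
    """Recursively replace every nn.Linear child with FakeQuantLinear, in place."""
    for name, child in list(model.named_children()):
        if isinstance(child, nn.Linear):
            setattr(model, name, FakeQuantLinear(child, num_bits))
        else:
            swap_linear_modules(child, num_bits)
    return model


torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
x = torch.randn(2, 16)
with torch.no_grad():
    reference = model(x)
    swap_linear_modules(model)
    quantized = model(x)
max_abs_diff = (reference - quantized).abs().max().item()
```

The swap changes the module types in the model, whereas a tensor-subclass flow keeps the modules and changes the type of their weight tensors instead.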

All code is migrated from Quanty and written by @TiRune.

Test Plan:
python test/quantization/module_swap/test_*

pytorch-bot · 2025-03-13T20:27:31Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1886

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 5001ba6 with merge base 8c81863:

NEW FAILURE - The following job has failed:

Run TorchAO Experimental Tests / test-mps-ops (macos-m1-stable) (gh)
ModuleNotFoundError: No module named 'importlib_metadata'

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo · 2025-03-14T13:11:18Z

@@ -0,0 +1,56 @@
+import copy


IMO these tests should be in test/prototype/quantization/module_swap

vkuzo

A short README.md for prototype/module_swap_quantization would be nice as well.


andrewor14 · 2025-03-14T22:06:34Z

Thanks, added the README. Merging this!


facebook-github-bot added the CLA Signed label Mar 13, 2025

andrewor14 added the topic: new feature label Mar 13, 2025

andrewor14 force-pushed the module-swap-ptq branch 2 times, most recently from bf0a842 to 715d6cd Compare March 13, 2025 20:33

vkuzo reviewed Mar 14, 2025


andrewor14 force-pushed the module-swap-ptq branch 4 times, most recently from 31bd9f1 to 38b1361 Compare March 14, 2025 15:35

vkuzo approved these changes Mar 14, 2025


andrewor14 force-pushed the module-swap-ptq branch 6 times, most recently from 2ed4d2d to 16022cf Compare March 14, 2025 18:12

andrewor14 force-pushed the module-swap-ptq branch from 16022cf to 5001ba6 Compare March 14, 2025 19:03

andrewor14 merged commit 8c8388d into main Mar 14, 2025
