Add module fqn regex support for ModuleFqnToConfig by jerryzh168 · Pull Request #3084 · pytorch/ao

jerryzh168 · 2025-09-26T21:37:33Z

Summary:
To simplify the config file for torchao quantized models, we want to allow people to configure ModuleFqnToConfig through regex, e.g. re:linear.+, re:language.layers.mlp\..+\.gate_proj

Note: this does not change the previous behavior of specifying full fqns. The currently supported configurations are:

1. full fqn
2. fqn regex (starts with re:)
3. "_default"

and 1 takes precedence over 2, with "_default" used only when neither matches. For example, for a model with linear1 and linear2 submodules, if we have:

config = ModuleFqnToConfig({"linear1": config1, "re:linear.+": config2, "_default": config3})
quantize_(m, config)

then m.linear1 will have config1 and m.linear2 will have config2, and all other modules will have config3
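The precedence described above can be sketched in plain Python. This is a minimal illustration rather than torchao's implementation: `resolve_config` is a hypothetical helper, the configs are stand-in strings, and applying `re.fullmatch` to the text after the `re:` prefix is an assumption based on the review discussion.

```python
import re
from collections import OrderedDict
from typing import Optional


def resolve_config(module_fqn: str, fqn_to_config: "OrderedDict[str, str]") -> Optional[str]:
    """Hypothetical helper: pick the config for one module by precedence."""
    # 1. An exact fqn key wins over everything else.
    if module_fqn in fqn_to_config:
        return fqn_to_config[module_fqn]
    # 2. Otherwise try regex keys (prefixed with "re:") in insertion order.
    for key, cfg in fqn_to_config.items():
        if key.startswith("re:") and re.fullmatch(key[3:], module_fqn):
            return cfg
    # 3. Fall back to "_default" if it is present.
    return fqn_to_config.get("_default")


# Stand-in strings take the place of real quantization configs.
mapping = OrderedDict(
    [("linear1", "config1"), ("re:linear.+", "config2"), ("_default", "config3")]
)
resolved = {name: resolve_config(name, mapping) for name in ("linear1", "linear2", "embed")}
print(resolved)
```

Here `linear1` also matches the regex key, but the exact key is checked first, so it still resolves to `config1`.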

Note: changing the type of the dict from Dict to OrderedDict is not BC-breaking. This was tested with https://huggingface.co/torchao-testing/opt-125m-ModuleFqnToConfig-v1-regex-0.14.0.dev#test-loading (a checkpoint produced before the change), and it still works.

Test Plan:

unit tests

new regex support
pytest test/quantization/test_quant_api.py -k test_module_fqn_to_config_regex
regression tests for old module fqn behavior: pytest test/quantization/test_quant_api.py -k test_module_fqn_to_config

e2e test

transformers PR: [torchao] Add regex support for ModuleFqnToConfig huggingface/transformers#41242
transformers checkpoint: https://huggingface.co/torchao-testing/opt-125m-ModuleFqnToConfig-v1-regex-0.14.0.dev

tested loading and running in transformers: https://huggingface.co/torchao-testing/opt-125m-ModuleFqnToConfig-v1-regex-0.14.0.dev#test-loading

vllm PR: [torchao] Add support for ModuleFqnToConfig using regex vllm-project/vllm#26001

tested loading the transformers configs and running in vllm

pytorch-bot · 2025-09-26T21:37:37Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3084

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit de999a5 with merge base 7690612 ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo · 2025-09-27T00:26:56Z

does it work for quantizing a HuggingFace model and then running it in vLLM?

jerryzh168 · 2025-09-27T00:35:54Z

> does it work for quantizing a HuggingFace model and then running it in vLLM?

Not yet. We need follow-up changes in these repos to handle this, i.e. https://github.com/huggingface/transformers/blob/071eb5334f5a9ac2c7a13515219be8a272388ec6/src/transformers/quantizers/quantizer_torchao.py#L302 and https://github.com/vllm-project/vllm/blob/8bf8f4582208ac7af230512ff5f3ac1dc36d5222/vllm/model_executor/layers/quantization/torchao.py#L126

vkuzo · 2025-09-29T11:17:15Z

I'd love to see the following in this PR:

* a user-facing description of what the new proposed syntax is. If this changes the FQN to always be a regex, it would be good to also have either proof that this does not change existing behavior, or a list of what specific existing behavior is changed and validation that changing it and breaking BC is acceptable.
* draft PRs in any repos affected by this change (huggingface transformers, vLLM) demonstrating that the e2e behavior will work as intended

IMO it's hard to know if this PR is landable without having the two things above

Summary: Similar to pytorch/ao#3084, we added regex support in transformers so people can use regex to quantize the models. See pytorch/ao#3084 for docs and the precedence of different configurations.
Uploaded model: https://huggingface.co/torchao-testing/opt-125m-ModuleFqnToConfig-v1-regex-0.14.0.dev
Test Plan: pytest tests/quantization/torchao_integration/test_torchao.py -k test_module_fqn_to_config_regex

jerryzh168 · 2025-10-01T04:07:58Z

@vkuzo I just updated the PR to include more docs and details on the precedence of configs. Please check out the summary for the e2e tests in transformers and vllm, and take another look.

vkuzo · 2025-10-01T13:31:56Z

+
+         Config key ordered by precedence:
+           * fully qualified module name, e.g. `language.layers.0.q_proj`
+           * regex for module names, e.g. `language.layers.*.q_proj`


clarify that regexes will be matched in the order that they appear in the dictionary

will do, also I think we need to change to OrderedDict to keep the order consistent
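To illustrate why dictionary order matters once several regex keys can match the same module, here is a small sketch. `first_matching_pattern` is a hypothetical helper, the `re:` prefix and `re.fullmatch` follow the convention settled on later in this PR, and the patterns are made up for the example.

```python
import re
from collections import OrderedDict


def first_matching_pattern(module_fqn, fqn_to_config):
    """Hypothetical helper: return the first "re:" key that matches, in dict order."""
    for key in fqn_to_config:
        if key.startswith("re:") and re.fullmatch(key[3:], module_fqn):
            return key
    return None


fqn = "language.layers.0.q_proj"

# Both patterns match the fqn; only their order in the dict differs.
broad_first = OrderedDict([("re:language\\..+", "cfg_a"), ("re:.+\\.q_proj", "cfg_b")])
narrow_first = OrderedDict([("re:.+\\.q_proj", "cfg_b"), ("re:language\\..+", "cfg_a")])

hit_broad = first_matching_pattern(fqn, broad_first)
hit_narrow = first_matching_pattern(fqn, narrow_first)
print(hit_broad, hit_narrow)
```

With first-match-wins semantics, the same module resolves to a different config depending on which key was inserted first, which is why a stable ordering is needed.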

jerryzh168 · 2025-10-01T18:44:56Z

Updates:

* addressed comments from @vkuzo
* updated the type of module_fqn_to_config from Dict to OrderedDict, and checked that it still works with https://huggingface.co/torchao-testing/opt-125m-ModuleFqnToConfig-v1-regex-0.14.0.dev#test-loading, giving the same outputs

jcaip · 2025-10-03T02:10:31Z

-        # fallback to use default if no module specific config is provided
-        c = config.module_fqn_to_config.get("_default", None)
+        for maybe_module_fqn_pattern in config.module_fqn_to_config:
+            if re.search(maybe_module_fqn_pattern, module_fqn):


Thoughts on changing this logic from any match in the string, to must match the entire string?

I.e. for some regex r, we actually do re.search(^r$, module_fqn). Reason I'm proposing this is because for gpt-oss we have gate_up_proj and gate_up_proj_bias, and only one should be quantized.

should this be done in the regex itself? like ...gate_up_proj$

yeah I'm just a little worried people will be dumb like me and accidentally quantize both, but im fine with how it is now too.

I could add a note in the doc I guess, although I feel we should not change the meaning of the regex in the code, but leave this to user instead

> Reason I'm proposing this is because for gpt-oss we have gate_up_proj and gate_up_proj_bias, and only one should be quantized.

that's a good point. If the user specifies gate_up_proj, seems like with the current logic it would apply an exact match to gate_up_proj, and a regex match to gate_up_proj_bias? That's definitely a gotcha, IMO we should ensure every key is either a regex or an exact match to remove the ambiguity.

sorry I think I meant to use fullmatch function, not search (https://docs.python.org/3/library/re.html#search-vs-match)

but discussed offline that we want to go with explicitly calling out regex configs. I plan to use a prefix of re:, same as llm-compressor
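The gotcha discussed in this thread can be reproduced with the standard `re` module alone. The two names below are illustrative stand-ins for the gpt-oss parameters mentioned above (the `model.layers.0.mlp.` prefix is an assumption), and the pattern is written without the `re:` prefix.

```python
import re

# Illustrative names; the prefix is an assumption, not taken from gpt-oss.
names = [
    "model.layers.0.mlp.gate_up_proj",
    "model.layers.0.mlp.gate_up_proj_bias",
]
pattern = r".*gate_up_proj"

# re.search accepts a match anywhere in the string, so the _bias name also hits.
search_hits = [n for n in names if re.search(pattern, n)]

# re.fullmatch requires the whole string to match, so only the first name hits.
fullmatch_hits = [n for n in names if re.fullmatch(pattern, n)]

# Anchoring the pattern with "$" gives re.search the same result here.
anchored_hits = [n for n in names if re.search(pattern + "$", n)]

print(search_hits, fullmatch_hits, anchored_hits)
```

Using `re.fullmatch` moves the anchoring into the matching code, so users do not have to remember a trailing `$`.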

jcaip

lgtm, ill do the same logic in #3084

vkuzo

I think we should figure out how to address @jcaip's comment

Summary: To simplify the config file for torchao quantized models we want to allow people to configure the ModuleFqnToConfig through regex, e.g. `linear*`, `language.layers.*.gate_proj` Test Plan: python test/quantization/test_quant_api.py -k test_module_fqn_to_config_module_name_regex Reviewers: Subscribers: Tasks: Tags:

please take a look again

vkuzo · 2025-10-06T16:03:06Z

@@ -2409,8 +2405,16 @@ def _module_fqn_to_config_handler(
        # Maybe: we can add module type specific config in the future, in needed
        c = config.module_fqn_to_config[module_fqn]


nit: assert that the pattern does not start with re:, for clarity

I can update this in my PR.
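One way to read the nit above is as a guard on the exact-match branch. The sketch below is hypothetical: `lookup_exact` and the assertion message are made up for illustration and are not the code that landed.

```python
def lookup_exact(module_fqn, fqn_to_config):
    """Hypothetical helper mirroring the exact-match branch of the handler."""
    if module_fqn in fqn_to_config:
        # A real module fqn should never look like a regex key, so fail loudly
        # if an exact lookup lands on a key carrying the "re:" prefix.
        assert not module_fqn.startswith("re:"), (
            f"exact match on {module_fqn!r}, which looks like a regex key"
        )
        return fqn_to_config[module_fqn]
    return None


cfg = {"linear1": "config1", "re:linear.+": "config2"}
print(lookup_exact("linear1", cfg))  # exact key, returned directly
print(lookup_exact("linear2", cfg))  # no exact key, left to the regex branch
```

The assert documents the invariant that keys starting with `re:` are only ever handled by the regex branch.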

vkuzo · 2025-10-06T16:03:52Z

thank you!

* Add regex support for ModuleFqnToConfig Summary: Similar to pytorch/ao#3084 we added regex support in transformers so people can use regex to quantize the models. See pytorch/ao#3084 for docs and precedence of different configurations Uploaded model: https://huggingface.co/torchao-testing/opt-125m-ModuleFqnToConfig-v1-regex-0.14.0.dev Test Plan: pytest tests/quantization/torchao_integration/test_torchao.py -k test_module_fqn_to_config_regex Reviewers: Subscribers: Tasks: Tags: * Apply style fixes * add assert for --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

…ng regex Summary: att, we are adding regex support to simplify the config, and enabling the support in both transformers and vllm to make sure regex config works everywhere torchao PR that adds the functionality to quantize_ API: pytorch/ao#3084 transformer PR: Test Plan: We save the model with the regex config in transformers, in vllm we just make sure we can load the model: pytest tests/quantization/test_torchao.py test_opt_125m_module_fqn_to_config_regex_model_loading_with_params Reviewers: Subscribers: Tasks: Tags: Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>

meta-cla Bot added the CLA Signed label Sep 26, 2025

jerryzh168 added the topic: improvement label and removed the CLA Signed label Sep 26, 2025

jerryzh168 requested review from andrewor14, jcaip, metascroy and vkuzo September 26, 2025 21:37

meta-cla Bot added the CLA Signed label Sep 26, 2025

jerryzh168 force-pushed the module-fqn-to-config-regex branch from f138154 to eee92e6 Compare September 29, 2025 20:30

jerryzh168 mentioned this pull request Oct 1, 2025

[torchao] Add regex support for ModuleFqnToConfig huggingface/transformers#41242

Merged

jerryzh168 mentioned this pull request Oct 1, 2025

[torchao] Add support for ModuleFqnToConfig using regex vllm-project/vllm#26001

Merged

vkuzo reviewed Oct 1, 2025

Comment thread test/quantization/test_quant_api.py Outdated

vkuzo reviewed Oct 1, 2025

Comment thread test/quantization/test_quant_api.py Outdated

vkuzo reviewed Oct 1, 2025

Comment thread torchao/quantization/quant_api.py Outdated

vkuzo reviewed Oct 1, 2025

jerryzh168 force-pushed the module-fqn-to-config-regex branch 2 times, most recently from 4ad3bd1 to 8c70326 Compare October 1, 2025 18:41

jerryzh168 requested a review from vkuzo October 1, 2025 19:40

jerryzh168 force-pushed the module-fqn-to-config-regex branch 2 times, most recently from 32b8793 to 9b5f773 Compare October 2, 2025 00:53

jcaip reviewed Oct 3, 2025

jcaip approved these changes Oct 3, 2025

vkuzo previously requested changes Oct 3, 2025

jcaip mentioned this pull request Oct 3, 2025

Add quantize_ nn.Parameter support #3083

Merged

jerryzh168 force-pushed the module-fqn-to-config-regex branch from 9b5f773 to 97e48e0 Compare October 3, 2025 18:00

jerryzh168 force-pushed the module-fqn-to-config-regex branch 2 times, most recently from e724d12 to c3c249b Compare October 3, 2025 22:51

jerryzh168 force-pushed the module-fqn-to-config-regex branch from c3c249b to de999a5 Compare October 3, 2025 22:52

jerryzh168 requested a review from vkuzo October 3, 2025 23:56

vkuzo reviewed Oct 6, 2025

vkuzo approved these changes Oct 6, 2025

jcaip merged commit 5346f0e into pytorch:main Oct 6, 2025
18 checks passed

jcaip mentioned this pull request Oct 22, 2025

Deprecation for _default in FqnToConfig #3229

Open
