Add parameter_name support to _int4_weight_only_transform by jcaip · Pull Request #3901 · pytorch/ao

jcaip · 2026-02-17T03:36:29Z

Stack from ghstack (oldest at bottom):

Enable parameter-level quantization by accepting a parameter_name kwarg,
using getattr/setattr instead of hard-coded module.weight, and switching
to _module_extra_repr with partial for flexible repr.
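
A minimal sketch of the pattern described above, not the actual torchao code. It assumes a plain-Python stand-in for `nn.Module` so that it runs without torch. `ToyModule`, `toy_quantize`, `toy_extra_repr`, and `toy_transform` are all hypothetical names, and the real `_module_extra_repr` signature may differ.

```python
import types
from functools import partial


class ToyModule:
    """Hypothetical stand-in for an nn.Module with two parameters."""

    def __init__(self):
        self.weight = [0.4, 1.6]
        self.gate_proj = [2.2, 3.7]

    def extra_repr(self):
        return "in=2"


def toy_quantize(values):
    # Placeholder for the real int4 quantization step (assumption).
    return [round(v) for v in values]


def toy_extra_repr(self, original_extra_repr, parameter_name):
    # Append the chosen parameter to whatever the module printed before.
    param = getattr(self, parameter_name)
    return f"{original_extra_repr()}, {parameter_name}={param}"


def toy_transform(module, *, parameter_name="weight"):
    # Look the parameter up by name instead of hard-coding module.weight.
    quantized = toy_quantize(getattr(module, parameter_name))
    setattr(module, parameter_name, quantized)
    # Bind a repr helper with the parameter name baked in via partial.
    module.extra_repr = types.MethodType(
        partial(
            toy_extra_repr,
            original_extra_repr=module.extra_repr,
            parameter_name=parameter_name,
        ),
        module,
    )
    return module
```

Calling `toy_transform(ToyModule(), parameter_name="gate_proj")` touches only `gate_proj` and leaves `weight` unchanged, which is the point of threading the name through rather than assuming `weight`.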

Co-authored-by: Cursor <cursoragent@cursor.com>

This config was deprecated in favor of Float8DynamicActivationFloat8WeightConfig with packing_format=Float8PackingFormat.SPARSE_CUTLASS and granularity=PerRow(). Remove the class definition, handler, and all references from imports, tests, and benchmarks. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]

This config was deprecated in favor of Int8DynamicActivationIntxWeightConfig. Remove the class definition, handler, and all references from imports, tests, QAT code, benchmarks, and documentation. Update QAT docs to reference Int4WeightOnlyConfig as the example base config. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]

This config was deprecated and scheduled for deletion. Remove the class definition, handler, and all references from imports, tests, benchmarks, and documentation. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]

Remove the config class, its supporting classes (Float8ObservedLinear, Float8ObservedSoftmax, Float8QuantizedSoftmax), the handler function, and all references from imports and tests. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]

This config was deprecated and scheduled for deletion. Remove the class definition, handler, and all references from imports, tests, benchmarks, and the autoround eval script. This also removes the entire BC import block from quant_api.py since all prototype configs have been removed. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]

This config was deprecated and scheduled for deletion. Remove the class definition, handler, and all references from imports, tests, benchmarks, and the autoround eval script. This also removes the entire BC import block from quant_api.py since all prototype configs have been removed. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]

Summary: This PR removes CUSTOM_PARAM_QUANTIZATION_SUPPORTED_CONFIGS in favor of using `inspect.signature` to check that the given handler accepts a parameter_name kwarg, which is then used to pass in the parameter FQN.

Test Plan:
```
pytest test/quantization/test_quant_api -k fqn
```

Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
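
The signature check mentioned in this summary could look roughly like the sketch below. It is an illustration only: `accepts_parameter_name`, `legacy_handler`, and `param_aware_handler` are hypothetical names, and the check torchao actually performs may differ (for example, in how it treats handlers that take `**kwargs`).

```python
import inspect


def accepts_parameter_name(handler):
    # True when the handler declares an explicit `parameter_name` argument.
    # A handler that only takes **kwargs is not detected by this sketch.
    return "parameter_name" in inspect.signature(handler).parameters


def legacy_handler(module, config):
    # Hypothetical handler that always quantizes module.weight.
    return module


def param_aware_handler(module, config, *, parameter_name="weight"):
    # Hypothetical handler that can target any named parameter.
    return module
```

With a check like this, support is derived from the handler itself rather than from a separately maintained allowlist of configs.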

Enable parameter-level quantization by accepting a parameter_name kwarg, using getattr/setattr instead of hard-coded module.weight, and switching to _module_extra_repr with partial for flexible repr. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]

pytorch-bot · 2026-02-17T03:36:32Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3901

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 7657f08 with merge base d4c1ba3:

NEW FAILURES - The following jobs have failed:

PR Label Check / Check PR Labels (gh)
Process completed with exit code 1.
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh)
test/quantization/pt2e/test_x86inductor_quantizer.py::TestQuantizePT2EX86Inductor::test_set_module_name_with_mixed_configs

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…_only_transform" Enable parameter-level quantization by accepting a parameter_name kwarg, using getattr/setattr instead of hard-coded module.weight, and switching to _module_extra_repr with partial for flexible repr. Co-authored-by: Cursor <cursoragent@cursor.com> [ghstack-poisoned]