Delete deprecated PlainLayout, PlainAQTTensorImpl and related v1 code paths #4151

Merged

jerryzh168 merged 63 commits into main on Apr 2, 2026

Conversation
[ghstack-poisoned]
…nd related code [ghstack-poisoned]
… paths

- Remove PlainLayout class from dtypes/utils.py
- Delete torchao/dtypes/uintx/plain_layout.py
- Remove int8 weight and int8 dynamic activation dispatch from the AQT dispatch table
- Remove AQT embedding dispatch (used PlainAQTTensorImpl)
- Change Int8WeightOnlyConfig default to version=2 (removes the v1 AQT path)
- Change Int8DynamicActivationInt8WeightConfig default to version=2 (removes the v1 AQT path)
- Remove PlainLayout from public exports
- Update tests to use v2 tensor types

[ghstack-poisoned]
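The version-gated default described above can be sketched as follows. This is a hedged, stdlib-only illustration: `Int8WeightOnlyConfigSketch` and `select_quant_path` are hypothetical stand-ins for torchao's `Int8WeightOnlyConfig` and its dispatch, not the real implementation.

```python
from dataclasses import dataclass


# Hypothetical stand-in for torchao's Int8WeightOnlyConfig, illustrating
# the default flip described above; all names here are illustrative only.
@dataclass
class Int8WeightOnlyConfigSketch:
    version: int = 2  # this PR changed the default from 1 to 2


def select_quant_path(config: Int8WeightOnlyConfigSketch) -> str:
    # v1 built AffineQuantizedTensor with PlainLayout (deleted by this PR);
    # v2 builds the newer Int8Tensor type instead.
    if config.version == 1:
        raise NotImplementedError("the v1 AQT path was removed")
    return "v2-Int8Tensor-path"


print(select_quant_path(Int8WeightOnlyConfigSketch()))  # v2-Int8Tensor-path
```

With the new default, callers who never set `version` silently move to the v2 path; requesting the deleted v1 path now fails loudly rather than producing a PlainLayout tensor.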
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4151
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 662c670 with merge base 0c29e81.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This was referenced Mar 23, 2026
jerryzh168 added a commit that referenced this pull request on Apr 2, 2026
… paths

- Remove PlainLayout class from dtypes/utils.py
- Delete torchao/dtypes/uintx/plain_layout.py
- Remove int8 weight and int8 dynamic activation dispatch from the AQT dispatch table
- Remove AQT embedding dispatch (used PlainAQTTensorImpl)
- Change Int8WeightOnlyConfig default to version=2 (removes the v1 AQT path)
- Change Int8DynamicActivationInt8WeightConfig default to version=2 (removes the v1 AQT path)
- Remove PlainLayout from public exports
- Update tests to use v2 tensor types

ghstack-source-id: 630ea48
Pull Request resolved: #4151
jerryzh168 added a commit that referenced this pull request on Apr 2, 2026
… paths

- Remove PlainLayout class from dtypes/utils.py
- Delete torchao/dtypes/uintx/plain_layout.py
- Remove int8 weight and int8 dynamic activation dispatch from the AQT dispatch table
- Remove AQT embedding dispatch (used PlainAQTTensorImpl)
- Change Int8WeightOnlyConfig default to version=2 (removes the v1 AQT path)
- Change Int8DynamicActivationInt8WeightConfig default to version=2 (removes the v1 AQT path)
- Remove PlainLayout from public exports
- Update tests to use v2 tensor types

ghstack-source-id: fcdc7a0
Pull Request resolved: #4151
andrewor14 added a commit that referenced this pull request on Apr 7, 2026
AffineQuantizedTensor was the v1 quantized tensor system, now fully superseded by v2 tensor types (Int8Tensor, Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that inherit from TorchAOBaseTensor.

Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed the AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT exports
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py

Core updates:
- quant_api.py: removed AQT from the _is_linear check, removed 5 dead activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references

Prototype migrations in this commit:
- torchao/prototype/autoround/: migrated off AQT; uses IntxUnpackedToInt8Tensor and TorchAOBaseTensor
- torchao/prototype/quantization/mixed_precision/: added an assertion error since the feature was already broken by the PlainLayout deletion (#4151)

Still broken (predates this commit, tracked with TODOs):
- torchao/prototype/dtypes/uintx/uintx_utils.py (AQTTensorImpl deleted)
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)

Docs/comments only (not broken, just stale references):
- torchao/prototype/quantization/module_swap/ (README)
- torchao/prototype/parq/ (README)
- torchao/prototype/quantized_training/ (comments)
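The migration pattern described above (replacing checks for AffineQuantizedTensor with checks against the shared base class) can be sketched like this. The classes here are hypothetical stand-ins for `torchao.utils.TorchAOBaseTensor` and its v2 subclasses, used only to illustrate the isinstance shape:

```python
# Hypothetical stand-ins: in torchao these would be TorchAOBaseTensor and
# v2 tensor types such as Int8Tensor / Int4Tensor.
class TorchAOBaseTensorSketch:
    pass


class Int8TensorSketch(TorchAOBaseTensorSketch):
    pass


def is_torchao_quantized(weight) -> bool:
    # One base-class check covers every v2 tensor type, so new tensor
    # subclasses need no per-type updates in callers.
    return isinstance(weight, TorchAOBaseTensorSketch)


print(is_torchao_quantized(Int8TensorSketch()))  # True
print(is_torchao_quantized(3.14))                # False
```

The design benefit is that call sites like autoround no longer name individual quantized tensor classes, so deleting AQT (or adding a new v2 type) does not ripple through them.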
andrewor14 added a commit that referenced this pull request on Apr 8, 2026
AffineQuantizedTensor was the v1 quantized tensor system, now fully superseded by v2 tensor types (Int8Tensor, Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that inherit from TorchAOBaseTensor.

Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed the Layout and AQTTensorImpl classes
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor, and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py

Core updates:
- quant_api.py: removed AQT from the _is_linear check, removed 5 dead activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references

Prototype status:
- prototype/autoround/: everything works except `apply_auto_round()`, which was already broken before
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout, UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added an assertion error since the feature was already broken by the PlainLayout deletion (#4151)
- prototype/parq: removed an unused layout field

Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)

Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14 added a commit that referenced this pull request on Apr 8, 2026
**Summary:** AffineQuantizedTensor was the v1 quantized tensor
system, now fully superseded by v2 tensor types (Int8Tensor,
Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that
inherit from TorchAOBaseTensor.
**BC-Breaking notes:**
Before (AQT):
```python
from torchao.dtypes import to_affine_quantized_intx
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# Low-level AQT API
weight = to_affine_quantized_intx(
    weight, mapping_type, block_size, target_dtype,
    quant_min, quant_max, eps, _layout=Layout(),
)
# High-level API (unchanged)
quantize_(model, Int4WeightOnlyConfig())
```
After (v2 tensors):
```python
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# High-level API (unchanged, recommended)
quantize_(model, Int4WeightOnlyConfig())
# Low-level v2 API (if needed)
from torchao.quantization import Int4Tensor, IntxUnpackedToInt8Tensor
weight = Int4Tensor.from_hp(weight, block_size)
# or, alternatively:
weight = IntxUnpackedToInt8Tensor.from_hp(weight, block_size, torch.int4)
```
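For code that still imports the deleted v1 entry point, a feature-detect guard avoids a hard ImportError at module load. This is a minimal sketch; `to_affine_quantized_intx` is the import shown in the Before example, and whether it resolves depends on the installed torchao version:

```python
# Hedged sketch: detect at runtime whether the deprecated v1 AQT entry
# point still exists, so callers can branch instead of crashing on import.
try:
    from torchao.dtypes import to_affine_quantized_intx  # removed with AQT
    HAS_V1_AQT = True
except ImportError:
    # torchao not installed, or a release where the v1 API was deleted
    HAS_V1_AQT = False

print(HAS_V1_AQT)
```

Code gated on `HAS_V1_AQT` can then fall back to the v2 `from_hp` constructors shown above.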
**Detailed changes:**
Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed Layout class and AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor,
and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py
Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references
Prototype updates:
- prototype/autoround/: removed broken AQT imports, updated isinstance
checks to TorchAOBaseTensor. Everything works except apply_auto_round()
which was already broken before this PR (issue #1690).
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout,
UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added assertion error since
feature was already broken by PlainLayout deletion (#4151)
Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)
Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14
added a commit
that referenced
this pull request
Apr 8, 2026
**Summary:** AffineQuantizedTensor was the v1 quantized tensor
system, now fully superseded by v2 tensor types (Int8Tensor,
Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that
inherit from TorchAOBaseTensor.
**BC-Breaking notes:**
Before (AQT):
```python
from torchao.dtypes import to_affine_quantized_intx
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# Low-level AQT API
weight = to_affine_quantized_intx(
weight, mapping_type, block_size, target_dtype,
quant_min, quant_max, eps, _layout=Layout(),
)
# High-level API (unchanged)
quantize_(model, Int4WeightOnlyConfig())
```
After (v2 tensors):
```python
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# High-level API (unchanged, recommended)
quantize_(model, Int4WeightOnlyConfig())
# Low-level v2 API (if needed)
from torchao.quantization import Int4Tensor, IntxUnpackedToInt8Tensor
weight = Int4Tensor.from_hp(weight, block_size)
weight = IntxUnpackedToInt8Tensor.from_hp(weight, block_size, torch.int4)
```
**Detailed changes:**
Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed Layout class and AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor,
and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py
Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references
Prototype updates:
- prototype/autoround/: removed broken AQT imports, updated isinstance
checks to TorchAOBaseTensor. Everything works except apply_auto_round()
which was already broken before this PR (issue #1690).
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout,
UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added assertion error since
feature was already broken by PlainLayout deletion (#4151)
Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)
Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14
added a commit
that referenced
this pull request
Apr 8, 2026
**Summary:** AffineQuantizedTensor was the v1 quantized tensor
system, now fully superseded by v2 tensor types (Int8Tensor,
Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that
inherit from TorchAOBaseTensor.
**BC-Breaking notes:**
Before (AQT):
```python
from torchao.dtypes import to_affine_quantized_intx
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# Low-level AQT API
weight = to_affine_quantized_intx(
weight, mapping_type, block_size, target_dtype,
quant_min, quant_max, eps, _layout=Layout(),
)
# High-level API (unchanged)
quantize_(model, Int4WeightOnlyConfig())
```
After (v2 tensors):
```python
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# High-level API (unchanged, recommended)
quantize_(model, Int4WeightOnlyConfig())
# Low-level v2 API (if needed)
from torchao.quantization import Int4Tensor, IntxUnpackedToInt8Tensor
weight = Int4Tensor.from_hp(weight, block_size)
weight = IntxUnpackedToInt8Tensor.from_hp(weight, block_size, torch.int4)
```
**Detailed changes:**
Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed Layout class and AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor,
and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py
Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references
Prototype updates:
- prototype/autoround/: removed broken AQT imports, updated isinstance
checks to TorchAOBaseTensor. Everything works except apply_auto_round()
which was already broken before this PR (issue #1690).
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout,
UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added assertion error since
feature was already broken by PlainLayout deletion (#4151)
Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)
Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14
added a commit
that referenced
this pull request
Apr 8, 2026
**Summary:** AffineQuantizedTensor was the v1 quantized tensor
system, now fully superseded by v2 tensor types (Int8Tensor,
Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that
inherit from TorchAOBaseTensor.
**BC-Breaking notes:**
Before (AQT):
```python
from torchao.dtypes import to_affine_quantized_intx
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# Low-level AQT API
weight = to_affine_quantized_intx(
weight, mapping_type, block_size, target_dtype,
quant_min, quant_max, eps, _layout=Layout(),
)
# High-level API (unchanged)
quantize_(model, Int4WeightOnlyConfig())
```
After (v2 tensors):
```python
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# High-level API (unchanged, recommended)
quantize_(model, Int4WeightOnlyConfig())
# Low-level v2 API (if needed)
from torchao.quantization import Int4Tensor, IntxUnpackedToInt8Tensor
weight = Int4Tensor.from_hp(weight, block_size)
weight = IntxUnpackedToInt8Tensor.from_hp(weight, block_size, torch.int4)
```
**Detailed changes:**
Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed Layout class and AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor,
and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py
Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references
Prototype updates:
- prototype/autoround/: removed broken AQT imports, updated isinstance
checks to TorchAOBaseTensor. Everything works except apply_auto_round()
which was already broken before this PR (issue #1690).
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout,
UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added assertion error since
feature was already broken by PlainLayout deletion (#4151)
Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)
Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14
added a commit
that referenced
this pull request
Apr 8, 2026
**Summary:** AffineQuantizedTensor was the v1 quantized tensor
system, now fully superseded by v2 tensor types (Int8Tensor,
Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that
inherit from TorchAOBaseTensor.
**BC-Breaking notes:**
Before (AQT):
```python
from torchao.dtypes import to_affine_quantized_intx
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# Low-level AQT API
weight = to_affine_quantized_intx(
weight, mapping_type, block_size, target_dtype,
quant_min, quant_max, eps, _layout=Layout(),
)
# High-level API (unchanged)
quantize_(model, Int4WeightOnlyConfig())
```
After (v2 tensors):
```python
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# High-level API (unchanged, recommended)
quantize_(model, Int4WeightOnlyConfig())
# Low-level v2 API (if needed)
from torchao.quantization import Int4Tensor, IntxUnpackedToInt8Tensor
weight = Int4Tensor.from_hp(weight, block_size)
weight = IntxUnpackedToInt8Tensor.from_hp(weight, block_size, torch.int4)
```
**Detailed changes:**
Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed Layout class and AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor,
and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py
Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references
Prototype updates:
- prototype/autoround/: removed broken AQT imports, updated isinstance
checks to TorchAOBaseTensor. Everything works except apply_auto_round()
which was already broken before this PR (issue #1690).
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout,
UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added assertion error since
feature was already broken by PlainLayout deletion (#4151)
Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)
Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14
added a commit
that referenced
this pull request
Apr 8, 2026
**Summary:** AffineQuantizedTensor was the v1 quantized tensor
system, now fully superseded by v2 tensor types (Int8Tensor,
Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that
inherit from TorchAOBaseTensor.
**BC-Breaking notes:**
Before (AQT):
```python
from torchao.dtypes import to_affine_quantized_intx
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# Low-level AQT API
weight = to_affine_quantized_intx(
weight, mapping_type, block_size, target_dtype,
quant_min, quant_max, eps, _layout=Layout(),
)
# High-level API (unchanged)
quantize_(model, Int4WeightOnlyConfig())
```
After (v2 tensors):
```python
from torchao.quantization import quantize_, Int4WeightOnlyConfig
# High-level API (unchanged, recommended)
quantize_(model, Int4WeightOnlyConfig())
# Low-level v2 API (if needed)
from torchao.quantization import Int4Tensor, IntxUnpackedToInt8Tensor
weight = Int4Tensor.from_hp(weight, block_size)
weight = IntxUnpackedToInt8Tensor.from_hp(weight, block_size, torch.int4)
```
**Detailed changes:**
Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed Layout class and AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor,
and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py
Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references
Prototype updates:
- prototype/autoround/: removed broken AQT imports, updated isinstance
checks to TorchAOBaseTensor. Everything works except apply_auto_round()
which was already broken before this PR (issue #1690).
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout,
UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added assertion error since
feature was already broken by PlainLayout deletion (#4151)
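The isinstance migration above collapses many per-subclass checks into a single base-class check: since every v2 tensor type inherits from TorchAOBaseTensor, one test covers them all. A framework-free analogy of the pattern (class names here are stand-ins, not torchao's):

```python
class TorchAOBaseTensorLike:
    """Stand-in for torchao.utils.TorchAOBaseTensor."""


class Int8TensorLike(TorchAOBaseTensorLike):
    """Stand-in for a v2 quantized tensor subclass."""


class Int4TensorLike(TorchAOBaseTensorLike):
    """Another stand-in subclass."""


def is_quantized(t):
    # One base-class check covers all current and future v2 subclasses,
    # replacing per-type checks like isinstance(t, (AQT, Int8Tensor, ...)).
    return isinstance(t, TorchAOBaseTensorLike)
```

New tensor types added later pass the check automatically, which is why the base-class test is more maintainable than enumerating concrete classes.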
Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)
Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14 added a commit that referenced this pull request on Apr 9, 2026
brucechanglongxu pushed a commit to brucechanglongxu/ao that referenced this pull request on Apr 9, 2026
brucechanglongxu pushed a commit to brucechanglongxu/ao that referenced this pull request on Apr 9, 2026
This was referenced Apr 14, 2026
haotongzou added a commit to haotongzou/ao that referenced this pull request on Apr 24, 2026
Restore scale_dtype=torch.float32 and quant_min=-127 in the v2 int8 quantization path to match the old behavior that was lost in PR pytorch#4151.
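The restored `quant_min=-127` matters because it keeps the int8 grid symmetric: with the full range down to -128, the same scale cannot map +max and -max to the grid endpoints. A minimal pure-Python sketch of the convention (illustrative only, not torchao's actual code):

```python
def symmetric_int8_quantize(values, quant_min=-127, quant_max=127):
    """Symmetric int8 quantization over the restricted range [-127, 127].

    Clamping at -127 instead of -128 makes the grid symmetric around zero:
    one scale maps +max_abs to +127 and -max_abs to -127, and 0.0 maps
    exactly to 0.  (torchao keeps this scale in float32; plain Python
    floats are float64, so this is only a sketch of the arithmetic.)
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / quant_max if max_abs else 1.0
    q = [min(quant_max, max(quant_min, round(v / scale))) for v in values]
    return q, scale
```

Under the full-range alternative (quant_min=-128), the negative endpoint would need a slightly different scale than the positive one, introducing a small asymmetric bias that symmetric schemes avoid.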
Stack from ghstack (oldest at bottom):