
Delete deprecated PlainLayout, PlainAQTTensorImpl and related v1 code paths#4151

Merged
jerryzh168 merged 63 commits into main from gh/jerryzh168/67/head
Apr 2, 2026

Conversation

Contributor

@jerryzh168 jerryzh168 commented Mar 23, 2026

Stack from ghstack (oldest at bottom):

  • Remove PlainLayout class from dtypes/utils.py
  • Delete torchao/dtypes/uintx/plain_layout.py
  • Remove int8 weight and int8 dynamic activation dispatch from AQT dispatch table
  • Remove AQT embedding dispatch (used PlainAQTTensorImpl)
  • Change Int8WeightOnlyConfig default to version=2 (removes v1 AQT path)
  • Change Int8DynamicActivationInt8WeightConfig default to version=2 (removes v1 AQT path)
  • Remove PlainLayout from public exports
  • Update tests to use v2 tensor types


pytorch-bot Bot commented Mar 23, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4151

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 662c670 with merge base 0c29e81:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jerryzh168 jerryzh168 added the `module: not user facing` label Mar 23, 2026
jerryzh168 added a commit that referenced this pull request Apr 2, 2026
ghstack-source-id: 630ea48
Pull Request resolved: #4151
@jerryzh168 jerryzh168 changed the base branch from gh/jerryzh168/67/base to main April 2, 2026 18:11
@jerryzh168 jerryzh168 requested review from svekars and vkuzo as code owners April 2, 2026 18:11
jerryzh168 added a commit that referenced this pull request Apr 2, 2026
ghstack-source-id: fcdc7a0
Pull Request resolved: #4151
@jerryzh168 jerryzh168 merged commit b1ddd15 into main Apr 2, 2026
43 checks passed
andrewor14 added a commit that referenced this pull request Apr 7, 2026
AffineQuantizedTensor was the v1 quantized tensor system, now fully
superseded by v2 tensor types (Int8Tensor, Int4Tensor, Float8Tensor,
IntxUnpackedToInt8Tensor, etc.) that inherit from TorchAOBaseTensor.

Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT exports
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py

Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
  activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references

Prototype migrations in this commit:
- torchao/prototype/autoround/: migrated off AQT, uses
  IntxUnpackedToInt8Tensor and TorchAOBaseTensor
- torchao/prototype/quantization/mixed_precision/: added assertion error
  since feature was already broken by PlainLayout deletion (#4151)

Still broken (predates this commit, tracked with TODOs):
- torchao/prototype/dtypes/uintx/uintx_utils.py (AQTTensorImpl deleted)
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)

Docs/comments only (not broken, just stale references):
- torchao/prototype/quantization/module_swap/ (README)
- torchao/prototype/parq/ (README)
- torchao/prototype/quantized_training/ (comments)
andrewor14 added a commit that referenced this pull request Apr 7, 2026
andrewor14 added a commit that referenced this pull request Apr 8, 2026
andrewor14 added a commit that referenced this pull request Apr 8, 2026
andrewor14 added a commit that referenced this pull request Apr 8, 2026
**Summary:** AffineQuantizedTensor was the v1 quantized tensor
system, now fully superseded by v2 tensor types (Int8Tensor,
Int4Tensor, Float8Tensor, IntxUnpackedToInt8Tensor, etc.) that
inherit from TorchAOBaseTensor.

**BC-Breaking notes:**

Before (AQT):
```python
from torchao.dtypes import to_affine_quantized_intx
from torchao.quantization import quantize_, Int4WeightOnlyConfig

# Low-level AQT API
weight = to_affine_quantized_intx(
    weight, mapping_type, block_size, target_dtype,
    quant_min, quant_max, eps, _layout=Layout(),
)

# High-level API (unchanged)
quantize_(model, Int4WeightOnlyConfig())
```

After (v2 tensors):
```python
from torchao.quantization import quantize_, Int4WeightOnlyConfig

# High-level API (unchanged, recommended)
quantize_(model, Int4WeightOnlyConfig())

# Low-level v2 API (if needed)
from torchao.quantization import Int4Tensor, IntxUnpackedToInt8Tensor
weight = Int4Tensor.from_hp(weight, block_size)
weight = IntxUnpackedToInt8Tensor.from_hp(weight, block_size, torch.int4)
```

**Detailed changes:**

Core deletions:
- torchao/dtypes/affine_quantized_tensor.py (class definition)
- torchao/dtypes/affine_quantized_tensor_ops.py (aten dispatch)
- torchao/dtypes/floatx/, torchao/dtypes/uintx/ (empty subpackages)
- torchao/dtypes/README.md (stale AQT-centric docs)
- torchao/dtypes/utils.py: removed Layout class and AQTTensorImpl class
- torchao/dtypes/__init__.py: removed all AQT and Layout exports
- torchao/utils.py: removed _register_layout, _get_tensor_impl_constructor,
  and their classmethod registrations on TorchAOBaseTensor
- test/dtypes/test_affine_quantized.py
- test/dtypes/test_affine_quantized_tensor_parallel.py

Core updates:
- quant_api.py: removed AQT from _is_linear check, removed 5 dead
  activation quant helpers
- testing/utils.py: switched defaults from AQT to Int8Tensor
- Updated test assertions, docstrings, and docs to remove AQT references

Prototype updates:
- prototype/autoround/: removed broken AQT imports, updated isinstance
  checks to TorchAOBaseTensor. Everything works except apply_auto_round()
  which was already broken before this PR (issue #1690).
- prototype/dtypes/uintx/uintx_utils.py: removed UintxLayout,
  UintxAQTTensorImpl, and AQT imports (fixes codebook import breakage)
- prototype/quantization/mixed_precision/: added assertion error since
  feature was already broken by PlainLayout deletion (#4151)

Still broken (tracked with TODOs):
- tutorials/calibration_flow/ (uses to_affine_quantized_intx_static)
- tutorials/developer_api_guide/ (uses Layout)

Docs/comments only (not broken, just stale references):
- prototype/quantization/module_swap/ (README)
- prototype/parq/ (README)
- prototype/quantized_training/ (comments)
andrewor14 added a commit that referenced this pull request Apr 8, 2026
andrewor14 added a commit that referenced this pull request Apr 8, 2026
andrewor14 added a commit that referenced this pull request Apr 8, 2026
andrewor14 added a commit that referenced this pull request Apr 8, 2026
andrewor14 added a commit that referenced this pull request Apr 8, 2026
andrewor14 added a commit that referenced this pull request Apr 9, 2026
brucechanglongxu pushed a commit to brucechanglongxu/ao that referenced this pull request Apr 9, 2026
brucechanglongxu pushed a commit to brucechanglongxu/ao that referenced this pull request Apr 9, 2026
haotongzou added a commit to haotongzou/ao that referenced this pull request Apr 24, 2026
Restore scale_dtype=torch.float32 and quant_min=-127 in the v2 int8
quantization path to match the old behavior that was lost in PR pytorch#4151.
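For context, a minimal pure-Python sketch of symmetric per-tensor int8 quantization showing the role of quant_min=-127 (rather than -128); this is an illustrative stand-in under stated assumptions, not torchao's actual implementation:

```python
# Sketch of symmetric per-tensor int8 quantization with quant_min=-127,
# the behavior the commit above restores for the v2 int8 path.

def quantize_symmetric_int8(values, quant_min=-127, quant_max=127):
    """Map max |x| to quant_max; quant_min=-127 keeps the integer grid
    symmetric around zero (with -128 there is one extra negative level,
    so the representable range is slightly asymmetric)."""
    amax = max(abs(v) for v in values)
    scale = amax / quant_max if amax else 1.0  # kept as a plain float (fp32-like)
    q = [max(quant_min, min(quant_max, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

q, scale = quantize_symmetric_int8([-1.0, -0.25, 0.0, 0.25, 1.0])
# q == [-127, -32, 0, 32, 127]; dequantize(q, scale) recovers the
# inputs to within one quantization step.
```

Keeping the scale in float32 (here, a plain Python float) avoids the extra rounding error a lower-precision scale dtype would introduce into dequantization.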