Delete deprecated TensorCoreTiledLayout and related code by jerryzh168 · Pull Request #4153 · pytorch/ao

jerryzh168 · 2026-03-23T20:09:23Z

Stack from ghstack (oldest at bottom):

Delete torchao/dtypes/uintx/tensor_core_tiled_layout.py
Remove TensorCoreTiledLayout dispatch from AQT dispatch table
Remove TensorCoreTiledLayout from all public exports
Remove LAYOUT_TO_ZERO_POINT_DOMAIN and LAYOUT_TO_PRESERVE_ZEROS dicts
Remove TensorCoreTiledLayout padding logic from AffineQuantizedTensor.dequantize
Simplify HQQ path in AffineQuantizedTensor to always use raw_output=True
Remove dead v1 code path in fake_quantize_config.py
Update autoround to use Int4TilePackedTo4dTensor directly
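
For downstream code that has to run against builds from both before and after this removal, one option is to feature-detect an export instead of importing it unconditionally. The sketch below is illustrative only: `has_export` is a hypothetical helper, not part of torchao, and it only reports whether a module imports cleanly and exposes a given name.

```python
import importlib


def has_export(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`.

    Hypothetical helper for illustration; not part of torchao.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its dependencies) is not installed.
        return False
    return hasattr(module, attr)


# Example call (result depends on the installed torchao version, if any):
#   has_export("torchao.dtypes", "TensorCoreTiledLayout")
```

A caller could branch on the result to pick a code path, at the cost of keeping both paths alive until the older versions are dropped.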

… paths
Remove PlainLayout class from dtypes/utils.py
Delete torchao/dtypes/uintx/plain_layout.py
Remove int8 weight and int8 dynamic activation dispatch from AQT dispatch table
Remove AQT embedding dispatch (used PlainAQTTensorImpl)
Change Int8WeightOnlyConfig default to version=2 (removes v1 AQT path)
Change Int8DynamicActivationInt8WeightConfig default to version=2 (removes v1 AQT path)
Remove PlainLayout from public exports
Update tests to use v2 tensor types

Copy bitpacking.py from torchao/dtypes/uintx/ to torchao/prototype/dtypes/uintx/ (original was deleted in the previous commit)
Add uintx_utils.py with UintxTensor and UintxLayout classes needed by autoround and codebook features until they migrate to the new quantization design

pytorch-bot · 2026-03-23T20:09:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4153

✅ No Failures

As of commit 7415455 with merge base 3e6c496:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2026-03-28T00:30:43Z

@claude review

claude · 2026-03-28T00:31:02Z

Claude finished @jerryzh168's task in 2m 20s.

Review: Delete deprecated TensorCoreTiledLayout and related code

Gather context and read changed files
Review deleted tensor_core_tiled_layout.py
Review changes to affine_quantized_tensor.py and ops
Review changes to autoround/core.py
Review remaining changes (exports, init files, etc.)
Post review summary

Overall this is a clean deprecation removal. The changes are straightforward and consistent across the codebase. A few observations:

Looks Good

Complete removal of TensorCoreTiledLayout references — grep confirms zero remaining references to TensorCoreTiledLayout or tensor_core_tiled in the codebase (aside from one stale comment, noted below).
affine_quantized_tensor.py simplifications — The HQQ path simplification at line 278 (raw_output=not isinstance(_layout, PlainLayout)) is correct. Since TensorCoreTiledLayout is removed, the original condition not isinstance(_layout, (TensorCoreTiledLayout, PlainLayout)) simplifies correctly. The padding removal in dequantize() is also correct since only TensorCoreTiledLayout needed that post-dequant narrow.
affine_quantized_tensor_ops.py — Clean removal of the dispatch entry for _linear_bf16_act_uint4_weight_check/impl.
fake_quantize_config.py — Dead version == 1 code path removed. The remaining code only supports version == 2, which is the right state.
quant_api.py — LAYOUT_TO_ZERO_POINT_DOMAIN and LAYOUT_TO_PRESERVE_ZEROS dicts cleanly removed. No remaining consumers.
autoround core.py migration — The migration from TensorCoreTiledLayout-based to_affine_quantized_intx_static to directly constructing Int4TilePackedTo4dTensor looks correct. The padding, packing (_convert_weight_to_int4pack), and pack_tinygemm_scales_and_zeros logic faithfully reproduces what TensorCoreTiledLayout.from_plain + pre_process used to do.
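
The condition simplification described for the HQQ path can be checked in isolation. The sketch below uses stand-in classes, not the real torchao layouts, and `OtherLayout` is a hypothetical placeholder for any layout that is neither of the two named ones. It shows that once no `TensorCoreTiledLayout` instance can exist, dropping that class from the `isinstance` tuple leaves the result unchanged for every remaining layout.

```python
class Layout:
    """Stand-in base class; not the real torchao Layout."""


class PlainLayout(Layout):
    pass


class TensorCoreTiledLayout(Layout):
    """Stand-in for the class this PR deletes."""


class OtherLayout(Layout):
    """Hypothetical placeholder for any other layout type."""


def raw_output_before(layout: Layout) -> bool:
    # Original condition quoted in the review.
    return not isinstance(layout, (TensorCoreTiledLayout, PlainLayout))


def raw_output_after(layout: Layout) -> bool:
    # Simplified condition after the removal.
    return not isinstance(layout, PlainLayout)


# Layout types that can still be instantiated after the removal.
remaining = [PlainLayout(), OtherLayout()]
assert all(raw_output_before(l) == raw_output_after(l) for l in remaining)
```

The two functions disagree only for a `TensorCoreTiledLayout` instance, which is exactly the case the removal makes unreachable.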

Minor Issues

Stale comment in quant_primitives.py:1333 — There's still a reference to TensorCoreTiledFormat in a comment:
```
# this is not preserving zero_point, this is converting to TensorCoreTiledFormat
```
This should be updated to reference "tinygemm format" to match the updated comment at line 1982.
torchao/dtypes/uintx/__init__.py should be deleted — Per andrewor14's review comment (already acknowledged by the author): the file now only contains __all__ = [] and nothing imports from torchao.dtypes.uintx as a package (existing imports go directly to torchao.dtypes.uintx.plain_layout). This file can be safely deleted.
Import of to_affine_quantized_intx_static (not an issue): In torchao/prototype/autoround/core.py, the import from torchao.dtypes import to_affine_quantized_intx_static at line 14 is no longer used by the to_int4_tinygemm_weight function (which now constructs Int4TilePackedTo4dTensor directly). It is still used by the to_uintx_weight path at line 205, so no change is needed.
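
A claim such as "nothing imports from this package directly" can be checked mechanically before deleting an `__init__.py`. The following is a minimal sketch under stated assumptions: `find_package_imports` is a hypothetical helper (not part of torchao or its tooling), it only sees absolute imports in files that parse, and it does not detect relative imports or dynamic imports via `importlib`.

```python
import ast
from pathlib import Path


def find_package_imports(root: str, package: str) -> list[tuple[str, int, str]]:
    """List absolute imports that target `package` itself, not its submodules.

    Returns (file path, line number, reconstructed statement) tuples.
    Hypothetical helper for illustration only.
    """
    parent, _, child = package.rpartition(".")
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                # `import pkg.sub` (but not `import pkg.sub.mod`)
                for alias in node.names:
                    if alias.name == package:
                        hits.append((str(path), node.lineno, f"import {alias.name}"))
            elif isinstance(node, ast.ImportFrom) and node.level == 0:
                names = [alias.name for alias in node.names]
                stmt = f"from {node.module} import {', '.join(names)}"
                if node.module == package:
                    # `from pkg.sub import name`
                    hits.append((str(path), node.lineno, stmt))
                elif parent and node.module == parent and child in names:
                    # `from pkg import sub`
                    hits.append((str(path), node.lineno, stmt))
    return hits
```

Calling it as `find_package_imports("torchao", "torchao.dtypes.uintx")` from a repository checkout would list any direct package-level imports; an empty list supports the deletion but does not replace running the test suite.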

Summary

The PR is a clean removal of deprecated code. The two actionable items are the stale comment at quant_primitives.py:1333 and deleting the now-empty uintx/__init__.py file (already agreed upon in prior review).