Add codebook (lookup table based) quantization flow in torchao

Similar to affine quantization, we can implement codebook (lookup table based) quantization, another popular type of quantization, especially at lower bit widths such as 4 bits or below (used in https://github.com/Vahe1994/AQLM, https://arxiv.org/abs/2402.04396, etc.). We can start with post-training quantization and use k-means clustering to find the codebook / lookup table. See https://github.com/pytorch/ao/issues/391 for the overall structure of the torchao stack. Reference code for k-means can be found [here](https://github.com/apple/coremltools/blob/40f6705a6a4c5ef1a616f01b444f1d6e5a0c1f59/coremltools/optimize/torch/_utils/k_means.py#L1008).
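To make the k-means step concrete, here is a minimal sketch in plain PyTorch, not the coremltools reference and not torchao code. The helper name `fit_codebook_kmeans` and its parameters are hypothetical, and it assumes a scalar codebook (one float per entry) with 16 entries, matching a 4-bit index.

```python
import torch


def fit_codebook_kmeans(
    weight: torch.Tensor,
    num_entries: int = 16,  # 2**4 entries for a 4-bit index (assumption)
    iters: int = 20,
    seed: int = 0,
) -> torch.Tensor:
    """Fit a scalar codebook to `weight` with Lloyd's k-means (hypothetical helper)."""
    values = weight.detach().flatten().float()
    if values.numel() < num_entries:
        raise ValueError("need at least `num_entries` weight values")

    # Initialise the centroids from randomly chosen weight values.
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(values.numel(), generator=gen)[:num_entries]
    centroids = values[perm.to(values.device)].clone()

    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every value.
        dist = (values[:, None] - centroids[None, :]).abs()
        assign = dist.argmin(dim=1)

        # Update step: each centroid becomes the mean of its assigned values.
        sums = torch.zeros_like(centroids).index_add_(0, assign, values)
        counts = torch.zeros_like(centroids).index_add_(
            0, assign, torch.ones_like(values)
        )
        nonempty = counts > 0  # empty clusters keep their previous centroid
        centroids[nonempty] = sums[nonempty] / counts[nonempty]

    return centroids
```

The distance matrix here is `num_values x num_entries`, so a real implementation would likely process large weights in chunks or per block; this sketch only shows the shape of the algorithm.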

After this, we can add support for the more advanced algorithms mentioned above.


API
```
quantize_(model, codebook_weight_only(dtype=torch.uint4))
```

Implementation details:
* [PR1] Ops
  * quantize_codebook(tensor, codebook)
  * dequantize_codebook(tensor, codebook)
* [PR2] Tensor Subclass
  * CodebookQuantizedTensor (similar to AffineQuantizedTensor)
    * clustering algorithm can be implemented in from_float function
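As a starting point for discussing the two ops in PR1, the following is a rough sketch in plain PyTorch. Only the names `quantize_codebook` and `dequantize_codebook` and the `(tensor, codebook)` argument order come from this issue; the codebook layout `(num_entries, block_size)`, the `uint8` index storage, and grouping along the last dimension are all assumptions made for illustration.

```python
import torch


def quantize_codebook(tensor: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Replace each block of `tensor` with the index of its nearest codebook entry.

    Assumed layout: `codebook` is (num_entries, block_size) and the last
    dimension of `tensor` is divisible by `block_size`.
    """
    num_entries, block_size = codebook.shape
    assert num_entries <= 256, "indices are stored as uint8 in this sketch"
    assert tensor.shape[-1] % block_size == 0

    # Group consecutive elements of the last dimension into blocks.
    blocks = tensor.detach().float().reshape(-1, block_size)
    # Euclidean distance from every block to every entry: (num_blocks, num_entries).
    dist = torch.cdist(blocks, codebook.float())
    codes = dist.argmin(dim=1).to(torch.uint8)
    return codes.reshape(*tensor.shape[:-1], tensor.shape[-1] // block_size)


def dequantize_codebook(tensor: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Look up each index in `tensor` and lay the blocks back out along the last dim."""
    block_size = codebook.shape[1]
    # Indexing yields shape (*tensor.shape, block_size).
    blocks = codebook[tensor.long()]
    return blocks.reshape(*tensor.shape[:-1], tensor.shape[-1] * block_size)
```

Under these assumptions, a `from_float` on `CodebookQuantizedTensor` could fit the codebook with the clustering algorithm, call `quantize_codebook` to obtain the indices, and store both. Packing the indices down to 4 bits is left out of the sketch.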

The details of the args etc. still need to be fleshed out, but that can be done in the PRs. I'd suggest adding things gradually and gathering feedback along the way.

Code Location: add a `codebook` folder under https://github.com/pytorch/ao/tree/main/torchao/prototype/quantization




```[tasklist]
### Tasks
- [x] Initial support https://github.com/pytorch/ao/pull/1299/
- [ ] Add AQLM support
- [ ] It is currently significantly slower than other methods and needs to be sped up: https://github.com/pytorch/ao/blob/main/torchao/quantization/README.md#codebook-quantization
```


Add codebook (look up table based) quantization flow in torchao #1195
