Add blockwise quantized dot support by lsy323 · Pull Request #7605 · pytorch/xla

lsy323 · 2024-07-02T06:10:03Z

Add blockwise quantized dot support for 8-bit and 4-bit weights.

Test:

Added unit tests
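For readers new to the term, the idea behind a blockwise quantized dot can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code added in this PR: it assumes symmetric quantization with one scale per block of weights, and the names `quantize_blockwise` and `blockwise_quantized_dot` are hypothetical.

```python
def quantize_blockwise(weights, block_size, n_bit):
    """Quantize a 1-D weight list block by block (symmetric, one scale per block).

    Assumption for illustration: symmetric range [-q_max, q_max],
    so q_max is 127 for n_bit=8 and 7 for n_bit=4.
    """
    assert len(weights) % block_size == 0, "length must be a multiple of block_size"
    q_max = 2 ** (n_bit - 1) - 1
    q_blocks, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # Avoid dividing by zero for an all-zero block.
        scale = max_abs / q_max if max_abs > 0 else 1.0
        q_block = [max(-q_max, min(q_max, round(w / scale))) for w in block]
        q_blocks.append(q_block)
        scales.append(scale)
    return q_blocks, scales


def blockwise_quantized_dot(x, q_blocks, scales):
    """Dot product of activations x with blockwise-quantized weights.

    Each block contributes scale * sum(x_i * q_i), so the integer weights
    are rescaled once per block rather than once per element.
    """
    block_size = len(q_blocks[0])
    total = 0.0
    for i, (q_block, scale) in enumerate(zip(q_blocks, scales)):
        x_block = x[i * block_size:(i + 1) * block_size]
        total += scale * sum(a * q for a, q in zip(x_block, q_block))
    return total


# Usage: compare the quantized result against the float dot product.
weights = [0.5, -1.0, 0.25, 2.0, 4.0, -3.0, 1.0, 0.0]
x = [1.0] * len(weights)
q_blocks, scales = quantize_blockwise(weights, block_size=4, n_bit=8)
approx = blockwise_quantized_dot(x, q_blocks, scales)
exact = sum(a * w for a, w in zip(x, weights))
```

Keeping a separate scale per block means a single large weight only coarsens the resolution of its own block, which matters most at 4 bits, where there are few integer levels to go around.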

JackCaoG · 2024-07-08T20:10:22Z

We should also have a doc to explain how to use these quantized ops under https://github.com/pytorch/xla/tree/master/docs

lsy323 · 2024-07-08T20:10:45Z

> We should also have a doc to explain how to use these quantized ops under https://github.com/pytorch/xla/tree/master/docs

It's in https://github.com/pytorch/xla/blob/master/docs/quantized_ops.md

miladm · 2024-07-09T22:06:28Z

+
+        # Dot with int4 weight is only supported on TPU
+        if not (n_bit == 4 and xr.device_type() != 'TPU'):
+          m = m.to(device)


What's the behavior on CUDA and CPU devices?

Because int4 only runs on XLA:TPU.

Offline discussion summary: int4 works only on XLA:TPU today; XLA:CPU does not support int4. The level of XLA:GPU support is unclear because it is not currently tested.
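The guard in the snippet above can be read as a small predicate. The sketch below is only an illustration: the helper name `moves_to_device` is hypothetical, and the device type is passed in as a plain string (standing in for the value of `xr.device_type()`) so the example runs without `torch_xla`. The `'CUDA'` and `'CPU'` strings are assumed values used for the comparison.

```python
def moves_to_device(n_bit: int, device_type: str) -> bool:
    """Same condition as the test guard: skip only int4 on a non-TPU device."""
    return not (n_bit == 4 and device_type != 'TPU')


# Enumerate the combinations discussed in the thread.
for n_bit in (8, 4):
    for device_type in ('TPU', 'CUDA', 'CPU'):
        action = "move to device" if moves_to_device(n_bit, device_type) else "skip"
        print(f"n_bit={n_bit}, device_type={device_type}: {action}")
```

By De Morgan's law the condition is equivalent to `n_bit != 4 or device_type == 'TPU'`, so 8-bit weights always take the device path and 4-bit weights take it only on TPU.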

Siyuan Liu and others added 3 commits July 2, 2024 03:57

add blockwise quant

b260263

update test

9febaa8

update readme

683c8a8

lsy323 changed the title ~~Add blockwise quant~~ Add blockwise quant support Jul 2, 2024

lsy323 changed the title ~~Add blockwise quant support~~ Add blockwise quantized op support Jul 2, 2024

lsy323 changed the title ~~Add blockwise quantized op support~~ Add blockwise quantized dot support Jul 2, 2024

lsy323 added the quantization label Jul 2, 2024

fix test

6bae4a0

lsy323 marked this pull request as ready for review July 2, 2024 16:18

lsy323 requested review from JackCaoG, ManfeiBai and miladm July 2, 2024 16:18

ManfeiBai approved these changes Jul 2, 2024

lsy323 mentioned this pull request Jul 3, 2024

Asymmetric quantized matmul support #7626

Merged

JackCaoG reviewed Jul 8, 2024

Comment thread torch_xla/experimental/xla_quantized_matmul.py

JackCaoG reviewed Jul 8, 2024

Comment thread torch_xla/experimental/xla_quantized_matmul.py Outdated

JackCaoG approved these changes Jul 8, 2024

update doc str

5b7e48e

lsy323 merged commit 88bcb45 into master Jul 8, 2024

lsy323 deleted the lsiyuan/blockwise-quant branch July 8, 2024 22:45

miladm reviewed Jul 9, 2024

ManfeiBai added the 2024Q3-manfei-review label Sep 26, 2024

lsy323 removed the 2024Q3-manfei-review label Oct 24, 2024

lsy323 merged 5 commits into master from lsiyuan/blockwise-quant

4 participants
