Add support for bmm and `to` for fbgemm Tensor by jerryzh168 · Pull Request #2337 · pytorch/ao

jerryzh168 · 2025-06-08T01:02:41Z

Summary:
As the title says, this PR adds support for running quantized bmm. The quantized bmm kernels for int4 and fp8 (with dynamic activation quantization) require the weights to be transposed before they can run, so this PR adds `transpose_input` to the convert function to transpose the weights first.

Test Plan:
python test/dtypes/test_fbgemm_fp8.py -k test_bmm
python test/dtypes/test_fbgemm_int4.py -k test_bmm

Reviewers:

Subscribers:

Tasks:

Tags:
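The summary above says the quantized bmm kernels need transposed weights. The following is a minimal pure-Python sketch of that idea, not the fbgemm kernel or the torchao convert function: it assumes, for illustration only, a kernel that reads weights in a `(B, N, K)` layout, and shows that transposing the last two dimensions of a `(B, K, N)` weight once up front gives the same result as a reference batched matmul. The helper names `bmm`, `transpose_last_two`, and `bmm_transposed_weight` are hypothetical.

```python
def bmm(x, w):
    """Reference batched matmul: x is (B, M, K), w is (B, K, N)."""
    return [
        [[sum(xb[m][k] * wb[k][n] for k in range(len(wb)))
          for n in range(len(wb[0]))]
         for m in range(len(xb))]
        for xb, wb in zip(x, w)
    ]


def transpose_last_two(w):
    """Swap the last two dims of a nested-list tensor: (B, K, N) -> (B, N, K)."""
    return [[list(col) for col in zip(*wb)] for wb in w]


def bmm_transposed_weight(x, wt):
    """Batched matmul for weights stored as (B, N, K) (assumed layout)."""
    return [
        [[sum(a * b for a, b in zip(x_row, w_row)) for w_row in wtb]
         for x_row in xb]
        for xb, wtb in zip(x, wt)
    ]


x = [[[1.0, 2.0]], [[3.0, 4.0]]]                # (B=2, M=1, K=2)
w = [[[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]],
     [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]]        # (B=2, K=2, N=3)

wt = transpose_last_two(w)                      # done once, at convert time
reference = bmm(x, w)
from_transposed = bmm_transposed_weight(x, wt)
```

Doing the transpose once at conversion time, rather than on every call, keeps the per-call path free of layout changes.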

pytorch-bot · 2025-06-08T01:02:44Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2337

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 06211ee with merge base 4235837:

NEW FAILURE - The following job has failed:

Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh)
test/integration/test_integration.py::TestSubclass::test_int4_weight_only_quant_subclass_grouped_5_cuda

This comment was automatically generated by Dr. CI and updates every 15 minutes.

drisspg · 2025-06-08T03:27:15Z

+
+    # not used
+    num_tokens = torch.empty([input_tensor.size(0)], device=input_tensor.device)
+    xq, x_scale = torch.ops.fbgemm.quantize_fp8_per_row(


This unused `num_tokens` feels weird. Maybe open an issue on fbgemm, or update the op so it does not need it?

jerryzh168 · Jun 9, 2025

Yeah, I checked with @jiawenliu64 and this arg is indeed only used in internal use cases. He recommended using the triton op instead, although I found the triton op to be a bit slower; maybe it needs some tuning. I'll double check.
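For readers unfamiliar with per-row dynamic quantization, the thread above concerns an op that returns quantized values together with one scale per row. Here is a rough pure-Python sketch of the scaling step only, under stated assumptions: it is not the fbgemm implementation, it omits the actual cast to an 8-bit float, and the function name `scale_per_row_sketch` and the `EPS` guard are hypothetical. The constant 448 is the largest finite value of the float8 e4m3fn format.

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn
EPS = 1e-12           # hypothetical guard against all-zero rows


def scale_per_row_sketch(x):
    """Return (scaled rows, per-row scales) for a 2D list of floats.

    Each row is divided by its own scale so that its largest magnitude
    maps to FP8_E4M3_MAX. The rounding to fp8 itself is not emulated.
    """
    xq, scales = [], []
    for row in x:
        row_max = max(abs(v) for v in row)
        scale = max(row_max, EPS) / FP8_E4M3_MAX
        xq.append([max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale))
                   for v in row])
        scales.append(scale)
    return xq, scales


x = [[0.5, -2.0, 1.0], [10.0, 0.0, -5.0]]
xq, x_scale = scale_per_row_sketch(x)

# Dequantize by multiplying each row by its scale again.
x_roundtrip = [[v * s for v in row] for row, s in zip(xq, x_scale)]
```

Nothing in this sketch needs a token count, which matches the observation in the thread that `num_tokens` is only used in internal use cases.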

facebook-github-bot added the CLA Signed label Jun 8, 2025

jerryzh168 requested a review from drisspg June 8, 2025 01:02

jerryzh168 added the topic: improvement label Jun 8, 2025

drisspg reviewed Jun 8, 2025

Comment thread torchao/dtypes/fbgemm_fp8_tensor.py

drisspg reviewed Jun 8, 2025

Comment thread torchao/dtypes/fbgemm_int4_tensor.py

drisspg reviewed Jun 8, 2025

Comment thread torchao/dtypes/fbgemm_int4_tensor.py Outdated

jerryzh168 force-pushed the add-bmm branch 3 times, most recently from 59bc6cf to a02edc9 Compare June 9, 2025 14:24

jerryzh168 changed the title ~~Add support for bmm for fbgemm config~~ Add support for bmm and `to` for fbgemm Tensor Jun 9, 2025

jerryzh168 force-pushed the add-bmm branch from a02edc9 to 06211ee Compare June 9, 2025 14:39

drisspg approved these changes Jun 9, 2025

jerryzh168 merged commit 16e2d0a into pytorch:main Jun 10, 2025
18 of 19 checks passed

jerryzh168 merged 1 commit into pytorch:main from jerryzh168:add-bmm

3 participants
