Add scaled_mm python API, test by slayton58 · Pull Request #164142 · pytorch/pytorch

slayton58 · 2025-09-29T15:03:20Z

Stack from ghstack (oldest at bottom):

Summary:

* Add torch.nn.functional.scaled_mm as an abstraction around the C++ methods
* Wraps torch._scaled_mm_v2 API by default, but user can force use of the older torch._scaled_mm interface.
* Scaled MM tests now run on the new API

Test Plan:

pytest test/test_scaled_matmul_cuda.py

Reviewers:

Subscribers:

Tasks:

Tags:
Signed-off-by: Simon Layton <simonlaytonmeta.com>

pytorch-bot · 2025-09-29T15:03:24Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/164142

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 946102e with merge base 3288fbf:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary:
* Add `torch.quantization.scaled_mm` as an abstraction around the C++ methods
* Wraps `torch._scaled_mm_v2` API by default, but user can force use of the older `torch._scaled_mm` interface.
* Scaled MM tests now run on the new API

Test Plan: `pytest test/test_scaled_matmul_cuda.py`

Reviewers:
Subscribers:
Tasks:
Tags:
Signed-off-by: Simon Layton <simonlaytonmeta.com>
ghstack-source-id: 6f762a3
Pull-Request: #164142

slayton58 · 2025-09-29T16:43:23Z

@pytorchbot rebase

vkuzo · 2025-09-29T16:44:49Z

docs/source/quantization.rst

 Quantization
 ============

+.. automodule:: torch.quantization.scaled_mm


nit: I'd probably not add stuff to torch.quantization, as it's deprecated and the name is ambiguous. Can we have a more specific name for scaled_mm, or just keep it together with the non-scaled GEMMs in terms of naming / docs?

I'm fine with this going wherever. It's in torch.quantization because there wasn't a clearly better place for it to go: it's not a functional version of a torch.nn op, so torch.nn.functional didn't seem like a good place.

After some offline discussion, moved to torch.nn.functional.scaled_mm.

pytorchmergebot · 2025-09-29T16:51:29Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2025-09-29T16:51:49Z

Successfully rebased gh/slayton58/17/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/164142)

docs/source/nn.functional.rst

drisspg · 2025-10-07T20:04:42Z

test/test_fx.py

        "adaptive_avg_pool3d": LEN_ERROR,
        "adaptive_max_pool2d_with_indices": LEN_ERROR,
        "adaptive_max_pool3d_with_indices": LEN_ERROR,
+        "scaled_mm": LEN_ERROR,


OOC what does this signify?

len(...) is non-traceable by default (errors out during the test), and scaled_mm uses len for some list processing - this prevents the test from running.

Does this mean that this new op won't be compilable?

I guess so. I see two immediate ways to fix this:

1. Remove the error-checking for the deprecated fallback path, pass just the first scale etc. from the passed list, and rely on C++ erroring out on an invalid scaling recipe.
2. Remove the deprecated fallback path entirely.
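
To illustrate why a `len(...)` call gets in the way of tracing, here is a toy sketch in plain Python. It does not use torch.fx; `ToyProxy`, `first_scale` and `trace_fails` are hypothetical names made up for this example. The point it shows is that arithmetic on a proxy can return another proxy, while Python requires `len()` to produce a concrete integer, so a tracing proxy has nothing symbolic to hand back and can only raise.

```python
class ToyProxy:
    """Hypothetical stand-in for a tracing proxy; not the torch.fx class."""

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # Arithmetic can be recorded symbolically by returning another proxy.
        return ToyProxy(f"add({self.name}, {getattr(other, 'name', other)})")

    def __len__(self):
        # len() must return a concrete int, so there is nothing symbolic to return.
        raise RuntimeError("len() cannot be recorded while tracing")


def first_scale(scales):
    # Mirrors the kind of list handling discussed above: validate with len(...).
    if len(scales) != 1:
        raise ValueError("expected exactly one scale")
    return scales[0]


def trace_fails(fn, arg):
    """Return True if calling fn on a proxy argument raises the tracing error."""
    try:
        fn(arg)
    except RuntimeError:
        return True
    return False
```

With a real list, `first_scale([2.0])` returns the single scale. With a `ToyProxy` argument the `len(...)` check raises before any work is done, which is the same shape of failure the `LEN_ERROR` entry accounts for.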

torch/nn/functional.py

drisspg · 2025-10-07T20:07:59Z

torch/nn/functional.py

+    if len(kwargs) > 0:
+        raise RuntimeError("kwargs contains unexpected entries, ", kwargs.keys())
+
+    if use_deprecated_api:


OOC why this?

The deprecated_api path? It allows for a back-compat path to isolate any differences between the implementations. I found it incredibly useful for debugging, though I guess it could be removed if desired.
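
A minimal, dependency-free sketch of the dispatch pattern discussed in this thread follows. It is not the code from torch/nn/functional.py: `scaled_mm_sketch`, `_scaled_mm_new`, `_scaled_mm_old` and `_matmul` are hypothetical names, matrices are nested lists, and every scale is assumed to be a single float, whereas the real operators work on tensors and scaling recipes. It only shows the shape of the idea: reject unexpected keyword arguments, then route to either the newer or the older implementation so the two can be compared.

```python
def _matmul(a, b):
    """Plain nested-list matrix multiply, used only to keep the sketch dependency-free."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def _scaled_mm_new(a, b, scales_a, scales_b):
    # Hypothetical stand-in for the newer path: accepts lists of scales.
    scale = 1.0
    for s in list(scales_a) + list(scales_b):
        scale *= s
    return [[v * scale for v in row] for row in _matmul(a, b)]


def _scaled_mm_old(a, b, scale_a, scale_b):
    # Hypothetical stand-in for the older path: accepts one scale per operand.
    return [[v * scale_a * scale_b for v in row] for row in _matmul(a, b)]


def scaled_mm_sketch(a, b, scales_a, scales_b, *, use_deprecated_api=False, **kwargs):
    if kwargs:
        # Format the offending keys into a single message string.
        raise RuntimeError(f"kwargs contains unexpected entries: {sorted(kwargs)}")
    if use_deprecated_api:
        # The older path takes a single scale, so the lists are validated and unpacked.
        if len(scales_a) != 1 or len(scales_b) != 1:
            raise ValueError("deprecated path expects exactly one scale per operand")
        return _scaled_mm_old(a, b, scales_a[0], scales_b[0])
    return _scaled_mm_new(a, b, scales_a, scales_b)
```

Calling the sketch twice with the same inputs, once with `use_deprecated_api=True` and once without, and comparing the results is the debugging workflow described above.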

Summary:
* Add `torch.nn.functional.scaled_mm` as an abstraction around the C++ methods
* Wraps `torch._scaled_mm_v2` API by default, but user can force use of the older `torch._scaled_mm` interface.
* Scaled MM tests now run on the new API

Test Plan: `pytest test/test_scaled_matmul_cuda.py`

Reviewers:
Subscribers:
Tasks:
Tags:
Signed-off-by: Simon Layton <simonlaytonmeta.com>

drisspg

Looks good. In a follow-up we should have some more robust composability testing. The immediate things that come to mind are writing a meta_registrations.py entry, adding some compile tests, making sure we (for now) fall back in Inductor, and then ultimately rewiring the lowerings.

cc @eellison is the best way to get most of this testing still through common_methods_invocations.py?

slayton58 · 2025-10-08T18:44:54Z

@pytorchbot merge

pytorchmergebot · 2025-10-08T18:47:03Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

pytorchmergebot · 2025-10-08T20:17:26Z

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / win-vs2022-cuda12.6-py3 / build

Details for Dev Infra team

Raised by workflow job

slayton58 · 2025-10-08T20:35:21Z

@pytorchbot merge

pytorchmergebot · 2025-10-08T20:37:16Z

Merge started

pytorchmergebot · 2025-10-08T22:35:17Z

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 2, 5, linux.g6.4xlarge.experimental.nvidia.gpu)

slayton58 · 2025-10-08T23:45:02Z

@pytorchbot merge

pytorchmergebot · 2025-10-08T23:47:01Z

Merge started

pytorchmergebot · 2025-10-09T00:46:42Z

Merge failed

Reason: 2 jobs have failed, first few of them are: trunk / linux-jammy-rocm-py3.10 / test (default, 2, 2, linux.rocm.gpu.gfx942.1), trunk / linux-jammy-rocm-py3.10 / test (default, 1, 2, linux.rocm.gpu.gfx942.1)

slayton58 · 2025-10-09T12:35:34Z

@pytorchbot merge

pytorchmergebot · 2025-10-09T12:37:30Z

Merge started

Update 993b960

slayton58 requested a review from jerryzh168 as a code owner September 29, 2025 15:03

pytorch-bot bot added the release notes: quantization release notes category label Sep 29, 2025

This was referenced Sep 29, 2025

Add _scaled_mm_v2 API #164141

Closed

Split Scaled matmul tests into a separate file #163855

Closed

slayton58 requested review from drisspg, ngimel and vkuzo September 29, 2025 15:27

vkuzo reviewed Sep 29, 2025

Update 0cf2d9f

Update 06d9342

slayton58 requested review from EikanWang, XuehaiPan, angelayi, avikchaudhuri, bobrenjc93, eqy, gujinghui, laithsakka, mikaylagawarecki, syed-ahmed, tugsbayasgalan, ydwu4 and zhxchen17 as code owners September 29, 2025 19:34

slayton58 added 4 commits October 6, 2025 17:13

Update 673904e

Update 8bdbd85

Update 655e294

Update 6faaede

drisspg reviewed Oct 7, 2025

docs/source/nn.functional.rst

drisspg reviewed Oct 7, 2025

torch/nn/functional.py

drisspg reviewed Oct 7, 2025

slayton58 added 7 commits October 8, 2025 06:57

Update 74cda6f

Update 5300164

Update a3c0eaa

Update a02e611

Update 97c7f7e

drisspg approved these changes Oct 8, 2025

Update 946102e
