[ATen][CUDA] Add sm_121a flag for RowwiseScaledMM #167734

Closed
Aidyn-A wants to merge 2 commits into pytorch:main from Aidyn-A:add_sm121a_for_RowwiseScaledMM
@Aidyn-A
Collaborator

@Aidyn-A Aidyn-A commented Nov 13, 2025

This PR adds an sm_121a flag for row-wise scaled matmuls on DGX Spark.

cc @ptrblck @msaroufim @eqy @jerryzh168 @tinglvv

@Aidyn-A Aidyn-A requested review from cyyever and eqy November 13, 2025 16:40
@Aidyn-A Aidyn-A self-assigned this Nov 13, 2025
@Aidyn-A Aidyn-A added the module: cuda Related to torch.cuda, and CUDA support in general label Nov 13, 2025
@Aidyn-A Aidyn-A added topic: not user facing topic category module: floatx (formerly float8) For torch.float8_e5m2 and torch.float8_e4m3 and other sub 8-bit float types labels Nov 13, 2025
@pytorch-bot pytorch-bot bot removed module: cuda Related to torch.cuda, and CUDA support in general module: floatx (formerly float8) For torch.float8_e5m2 and torch.float8_e4m3 and other sub 8-bit float types labels Nov 13, 2025
@pytorch-bot

pytorch-bot bot commented Nov 13, 2025

The label module: cuda is only applicable to issues and has been removed. Please only use this label on issues.

@pytorch-bot

pytorch-bot bot commented Nov 13, 2025

The label module: floatx (formerly float8) is only applicable to issues and has been removed. Please only use this label on issues.

@pytorch-bot

pytorch-bot bot commented Nov 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167734

Note: Links to docs will display an error until the docs builds have been completed.

❌ 7 New Failures, 10 Unrelated Failures

As of commit c162c57 with merge base 2c846bb:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Aidyn-A Aidyn-A added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 13, 2025
@Skylion007
Collaborator

Skylion007 commented Nov 13, 2025

They need their own SM arch? ;-;

@Aidyn-A
Collaborator Author

Aidyn-A commented Nov 14, 2025

> They need their own SM arch? ;-;

Yes, the arch-specific instructions are not compatible between compute capabilities 12.0 and 12.1.
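
To illustrate the reply above, here is a minimal sketch of the compatibility rule as NVIDIA documents it: a cubin built for an arch-specific target (the `a` suffix, as in `sm_120a` or `sm_121a`) runs only on a device with exactly that compute capability, whereas a plain `sm_XY` cubin also runs on later minor revisions of the same major version. The helper names `parse_target` and `cubin_runs_on` are hypothetical and are not part of PyTorch or the CUDA toolkit. PTX JIT compilation and family-specific (`f`) targets are deliberately left out.

```python
import re


def parse_target(target: str) -> tuple[int, int, bool]:
    """Split a target such as 'sm_121a' into (major, minor, arch_specific).

    Hypothetical helper: the last digit is the minor version and the
    digits before it are the major version.
    """
    m = re.fullmatch(r"sm_(\d+)(\d)(a?)", target)
    if m is None:
        raise ValueError(f"unrecognized target: {target!r}")
    return int(m.group(1)), int(m.group(2)), m.group(3) == "a"


def cubin_runs_on(target: str, device_cc: tuple[int, int]) -> bool:
    """Return True if a cubin built for `target` can run on `device_cc`."""
    major, minor, arch_specific = parse_target(target)
    dev_major, dev_minor = device_cc
    if arch_specific:
        # Arch-specific code requires an exact compute-capability match.
        return (major, minor) == (dev_major, dev_minor)
    # Plain targets: same major version, device minor >= target minor.
    return major == dev_major and minor <= dev_minor


# An sm_120a kernel does not cover a compute capability 12.1 device,
# which is why a separate sm_121a target is needed.
print(cubin_runs_on("sm_120a", (12, 1)))
print(cubin_runs_on("sm_121a", (12, 1)))
```

Under this rule, a kernel that relies on arch-specific instructions has to be compiled once per exact compute capability it should support.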

@Aidyn-A
Collaborator Author

Aidyn-A commented Nov 14, 2025

The executorch test failures are unrelated.

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 1 checks: trunk / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, lf.linux.2xlarge, unstable)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@Aidyn-A
Collaborator Author

Aidyn-A commented Nov 17, 2025

@pytorchbot revert -m "fails on CUDA 12.8"

@pytorch-bot

pytorch-bot bot commented Nov 17, 2025

❌ 🤖 pytorchbot command failed:

@pytorchbot revert: error: the following arguments are required: -c/--classification

usage: @pytorchbot revert -m MESSAGE -c
                          {nosignal,ignoredsignal,landrace,weird,ghfirst,autorevert}

Try @pytorchbot --help for more info.

@Aidyn-A
Collaborator Author

Aidyn-A commented Nov 17, 2025

@pytorchbot revert -m "fails on CUDA 12.8" -c nosignal

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Nov 17, 2025
This reverts commit 226850c.

Reverted #167734 on behalf of https://github.com/Aidyn-A due to fails on CUDA 12.8 (comment)
@pytorchmergebot
Collaborator

@Aidyn-A your PR has been successfully reverted.

@pytorchmergebot pytorchmergebot added Reverted ci-no-td Do not run TD on this PR labels Nov 17, 2025
@tinglvv
Collaborator

tinglvv commented Nov 17, 2025

Thanks for reverting. This should restore the nightly wheel build, which was failing with the error below. We should add the ciflow/binaries label next time.
nvcc fatal : Unsupported gpu architecture 'compute_121a'
cc @atalman, could you help restart https://github.com/pytorch/pytorch/actions/runs/19421948119/job/55560842460 when you get a chance? I was not able to restart the 12.8 build.
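
The `nvcc fatal` error above arises when an architecture flag is passed to a toolkit that does not know the target. A minimal sketch of one way to guard against it is to add the flag only when the toolkit version is at least some minimum. The function `gate_arch_flag` is hypothetical and is not how the PyTorch build system implements this. The minimum version `(12, 9)` in the usage lines is an assumption for illustration only, since this thread confirms no more than that CUDA 12.8 rejects `compute_121a`.

```python
def gate_arch_flag(
    base_flags: list[str],
    arch: str,
    toolkit_version: tuple[int, int],
    min_toolkit_version: tuple[int, int],
) -> list[str]:
    """Append a -gencode flag for `arch` only if the toolkit is new enough.

    Hypothetical helper: `arch` is the numeric part plus any suffix,
    for example '121a'. Versions are (major, minor) tuples.
    """
    if toolkit_version >= min_toolkit_version:
        flag = f"-gencode=arch=compute_{arch},code=sm_{arch}"
        return [*base_flags, flag]
    # Toolkit too old: leave the flags untouched rather than fail the build.
    return list(base_flags)


# Assumed minimum toolkit version for sm_121a (not confirmed by this thread).
ASSUMED_MIN_VERSION = (12, 9)

print(gate_arch_flag(["-O3"], "121a", (12, 8), ASSUMED_MIN_VERSION))
print(gate_arch_flag(["-O3"], "121a", (13, 0), ASSUMED_MIN_VERSION))
```

With a guard of this kind, a CUDA 12.8 build would skip the new target instead of aborting.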

@tinglvv tinglvv added the ciflow/binaries Trigger all binary build and upload jobs on the PR label Nov 17, 2025
@atalman
Contributor

atalman commented Nov 17, 2025

Hi @tinglvv and @Aidyn-A, thank you for the revert. We will wait for tomorrow's build to confirm.

It looks like it caused all CUDA 12.8 builds to fail: Linux x86, aarch64, and Windows x86.

@Aidyn-A
Collaborator Author

Aidyn-A commented Nov 18, 2025

Yeah, those failures are certainly not related to the sm_121a flag.

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
This PR adds an sm_121a flag for row-wise scaled matmuls on DGX Spark.

Pull Request resolved: pytorch#167734
Approved by: https://github.com/eqy, https://github.com/cyyever
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
Labels

ci-no-td Do not run TD on this PR
ciflow/binaries Trigger all binary build and upload jobs on the PR
ciflow/trunk Trigger trunk jobs on your pull request
Merged
open source
Reverted
topic: not user facing topic category

Projects

Status: Done

8 participants