[reland2][ROCm] preshuffled weight mm by jeffdaily · Pull Request #2207 · pytorch/ao

jeffdaily · 2025-05-13T22:04:10Z

No description provided.

pytorch-bot · 2025-05-13T22:04:14Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2207

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit aca48ed with merge base 1017c7e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-05-13T22:14:57Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

pytorch-bot · 2025-05-14T00:59:03Z

To add the ciflow label ciflow/rocm please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

facebook-github-bot · 2025-05-14T03:52:54Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2025-05-14T14:54:39Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

mxz297 · 2025-05-14T15:17:44Z

@jeffdaily I am having issues importing this PR. Can you first try to resolve the build errors?

facebook-github-bot · 2025-05-14T15:31:56Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2025-05-14T18:32:10Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2025-05-14T23:34:45Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

mxz297 · 2025-05-15T17:17:21Z

@jeffdaily there is a linter failure

facebook-github-bot · 2025-05-15T17:20:14Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

mxz297 · 2025-05-15T23:10:05Z

@jeffdaily there is also a failure in the ROCm test:

module = Linear(in_features=32, out_features=128, bias=False)
config = MXFPInferenceConfig(block_size=32, activation_dtype=torch.float4_e2m1fn_x2, weight_dtype=torch.float4_e2m1fn_x2, gemm_kernel_choice=<MXGemmKernelChoice.CUTLASS: 'cutlass'>, set_inductor_config=False)

    @register_quantize_module_handler(MXFPInferenceConfig)
    def _mx_inference_linear_transform(
        module: torch.nn.Module, config: MXFPInferenceConfig
    ):
        # TODO Sm120 has slightly more restrictive reqs
        # TODO handle AMD
>       assert is_sm_at_least_100(), "MXFP is only supported on sm100 machiens for now"
E       AssertionError: MXFP is only supported on sm100 machiens for now

But it looks like this test should not even be run on AMD?

cc @drisspg @atalman @jerryzh168

facebook-github-bot · 2025-05-15T23:17:22Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

drisspg · 2025-05-16T01:23:45Z

@mxz297 yeah this should be skipped, can you rebase past: #2209
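For context, a minimal sketch of the kind of guard that keeps such a test from running on AMD. This is not the change made in #2209; `supports_mxfp`, `IS_ROCM`, and `CAPABILITY` are hypothetical stand-ins for values a real suite would query from torch, and the only fact taken from the thread is that the transform asserts sm100 (compute capability 10.0).

```python
import unittest


def supports_mxfp(is_rocm, capability):
    """Hypothetical predicate: True only for CUDA devices at sm100 or newer."""
    return (not is_rocm) and capability is not None and capability >= (10, 0)


# Hypothetical stand-ins; a real suite would read these from torch at import time.
IS_ROCM = True
CAPABILITY = None


class TestMXFPInference(unittest.TestCase):
    @unittest.skipUnless(supports_mxfp(IS_ROCM, CAPABILITY), "MXFP requires sm100")
    def test_linear_transform(self):
        # Placeholder body: on an unsupported device this is never reached.
        self.fail("would exercise the MXFP linear transform here")
```

With the skip in place, the test is reported as skipped on ROCm instead of hitting the assertion inside the transform.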

facebook-github-bot · 2025-05-16T17:08:22Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

mxz297 · 2025-05-16T17:59:01Z

@pytorchbot run all

pytorch-bot · 2025-05-16T17:59:04Z

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: argument command: invalid choice: 'run' (choose from 'merge', 'revert', 'rebase', 'label', 'drci', 'cherry-pick', 'close')

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,cherry-pick,close} ...

Try @pytorchbot --help for more info.

mxz297 · 2025-05-16T17:59:57Z

@pytorchbot drci

mxz297 · 2025-05-16T18:10:18Z

@drisspg @atalman @jerryzh168

There seem to be some CUDA test failures where the arch string parsing has an issue. It feels unlikely that this PR caused them, but I want to double check with you folks:

Processing /pytorch/ao
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [13 lines of output]
      W0516 16:40:07.414810 215 site-packages/torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda-12.6'
      W0516 16:40:07.421015 215 site-packages/torch/utils/cpp_extension.py:2414] TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
      W0516 16:40:07.421015 215 site-packages/torch/utils/cpp_extension.py:2414] If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'] to specific architectures.
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 35, in <module>
        File "/pytorch/ao/setup.py", line 544, in <module>
          ext_modules=get_extensions(),
        File "/pytorch/ao/setup.py", line 432, in get_extensions
          cuda_arch_flags = _get_cuda_arch_flags()
        File "/opt/conda/envs/venv/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 2434, in _get_cuda_arch_flags
          arch_list[-1] += '+PTX'
      IndexError: list index out of range

mxz297 · 2025-05-16T18:12:58Z

Also a noob question: how should I restart CI, or is CI always restarted automatically after a new commit is pushed?

drisspg · 2025-05-16T19:11:54Z

@mxz297 so if you are a Meta employee it will automatically restart on commit push, but unfortunately everyone else will need to kick it off manually.

mxz297 · 2025-05-19T15:57:45Z

@drisspg @atalman @jerryzh168

Any insight on the following error?

Processing /pytorch/ao
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [13 lines of output]
      W0516 16:40:07.414810 215 site-packages/torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda-12.6'
      W0516 16:40:07.421015 215 site-packages/torch/utils/cpp_extension.py:2414] TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
      W0516 16:40:07.421015 215 site-packages/torch/utils/cpp_extension.py:2414] If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'] to specific architectures.
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 35, in <module>
        File "/pytorch/ao/setup.py", line 544, in <module>
          ext_modules=get_extensions(),
        File "/pytorch/ao/setup.py", line 432, in get_extensions
          cuda_arch_flags = _get_cuda_arch_flags()
        File "/opt/conda/envs/venv/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 2434, in _get_cuda_arch_flags
          arch_list[-1] += '+PTX'
      IndexError: list index out of range

drisspg · 2025-05-19T16:48:51Z

Taking a look

drisspg · 2025-05-19T17:32:09Z

Okay, so this is coming from this line:

>>> from torch.utils.cpp_extension import _get_cuda_arch_flags
>>> _get_cuda_arch_flags()
/Users/drisspg/.conda/envs/nightly/lib/python3.13/site-packages/torch/utils/cpp_extension.py:2410: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
  warnings.warn(
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    _get_cuda_arch_flags()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/drisspg/.conda/envs/nightly/lib/python3.13/site-packages/torch/utils/cpp_extension.py", line 2430, in _get_cuda_arch_flags
    arch_list[-1] += '+PTX'
    ~~~~~~~~~^^^^
IndexError: list index out of range

This happens when you call get_arch_list with no args and the default system arch is not picked up by this logic:

https://github.com/pytorch/pytorch/blob/6487ea30b3fb3fe550d0e8e7feaf25bc3cffb626/torch/utils/cpp_extension.py#L2360
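The failure mode can be sketched without torch. The code below is a simplified stand-in, not the real `cpp_extension.py` logic: `arch_flags_from_list` mimics only the final `arch_list[-1] += '+PTX'` step from the traceback, and `resolve_arch_list` is a hypothetical guard in the spirit of manually specifying archs when none are detected. The `"8.0"` fallback is an arbitrary placeholder, and the parsing assumes the environment variable holds a space- or semicolon-separated list.

```python
import os


def arch_flags_from_list(arch_list):
    """Simplified stand-in for the step that fails in the traceback above."""
    arch_list = list(arch_list)
    # With no visible GPU and TORCH_CUDA_ARCH_LIST unset, the list is empty,
    # so indexing its last element raises IndexError.
    arch_list[-1] += "+PTX"
    return arch_list


def resolve_arch_list(detected, env=None, fallback=("8.0",)):
    """Hypothetical guard: prefer the env var, then detected archs, then a fallback."""
    env = os.environ if env is None else env
    raw = env.get("TORCH_CUDA_ARCH_LIST", "").strip()
    if raw:
        # Accept both "8.0 9.0" and "8.0;9.0" (assumed format).
        return [arch for arch in raw.replace(" ", ";").split(";") if arch]
    if detected:
        return list(detected)
    # Placeholder default so the caller never sees an empty list.
    return list(fallback)
```

On a build machine with no GPU, `arch_flags_from_list([])` raises the same `IndexError`, while `arch_flags_from_list(resolve_arch_list([], env={}))` returns a non-empty list. Setting `TORCH_CUDA_ARCH_LIST` before the build, as the warning itself suggests, avoids the empty-list path entirely.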

drisspg · 2025-05-22T20:15:58Z

@jeffdaily Can you rebase? I am still a little confused by this CI.

facebook-github-bot · 2025-05-27T21:48:22Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2025-05-27T23:47:04Z

@mxz297 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

* [reland2][ROCm] preshuffled weight mm
* remove debug print statements
* remove duplicate registrations caused by patch fuzzing
* lint
* ruff

[reland2][ROCm] preshuffled weight mm

dc0d4e0

pytorch-bot Bot added ci-no-td device: rocm labels May 13, 2025

facebook-github-bot added the CLA Signed label May 13, 2025

remove debug print statements

fa813ce

petrex added the ciflow/rocm label May 14, 2025

pytorch-bot Bot removed the ciflow/rocm label May 14, 2025

remove duplicate registrations caused by patch fuzzing

b7aa777

petrex added the ciflow/rocm label May 14, 2025

lint

f4ec46d

pytorch-bot Bot removed the ciflow/rocm label May 15, 2025

drisspg added the topic: improvement label May 16, 2025

Merge branch 'main' into rocm_swizzle_reland2

b4115d3

drisspg mentioned this pull request May 19, 2025

Manually specify flags if no arch set #2219

Closed

Merge branch 'main' into rocm_swizzle_reland2

204cea8

ruff

aca48ed

mxz297 merged commit 63f2e51 into pytorch:main May 28, 2025
36 of 37 checks passed

liangel-02 pushed a commit that referenced this pull request Aug 25, 2025

[reland2][ROCm] preshuffled weight mm (#2207)

778da8d

* [reland2][ROCm] preshuffled weight mm
* remove debug print statements
* remove duplicate registrations caused by patch fuzzing
* lint
* ruff


5 participants
