[ROCm][CI] Use gfx942 for rocm nightly binaries#175784

Closed
amdfaa wants to merge 2 commits into pytorch:main from amdfaa:patch-49

@amdfaa
Contributor

@amdfaa amdfaa commented Feb 25, 2026

Depends on #174290 and also tested in that PR here: https://github.com/pytorch/pytorch/actions/runs/22320747400/job/64886625635?pr=174290

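Given the relative inconsistency of the MI250 runners (see https://hud.pytorch.org/hud/pytorch/pytorch/d9958fd/1?per_page=50&name_filter=rocm&mergeEphemeralLF=true), we have decided to switch the nightly builds to the gfx942 runners.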
cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @jataylo @hongxiayang @naromero77amd @pragupta @jerrymannil @xinyazhang

@amdfaa amdfaa requested a review from a team as a code owner February 25, 2026 20:17
@pytorch-bot

pytorch-bot bot commented Feb 25, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175784

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit de60c95 with merge base 85cf583:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the module: rocm (AMD GPU support for Pytorch) and topic: not user facing (topic category) labels Feb 25, 2026
Contributor

@atalman atalman left a comment

lgtm

@pytorch-bot pytorch-bot bot added the ciflow/rocm-mi300 (Trigger "default" config CI on ROCm MI300) label Feb 27, 2026
@jithunnair-amd
Collaborator

@pytorchbot merge -f "Unrelated failure"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pytorchmergebot pushed a commit to anatoliylitv/pytorch that referenced this pull request Mar 4, 2026
Depends on pytorch#174290 and also tested in that PR here: https://github.com/pytorch/pytorch/actions/runs/22320747400/job/64886625635?pr=174290

Given the relative inconsistency of the mi250 runners (See [here](https://hud.pytorch.org/hud/pytorch/pytorch/d9958fd/1?per_page=50&name_filter=rocm&mergeEphemeralLF=true)), we have decided to switch the nightly builds to the gfx942 runners.

Pull Request resolved: pytorch#175784
Approved by: https://github.com/atalman

Labels

ciflow/rocm-mi300 (Trigger "default" config CI on ROCm MI300)
Merged
module: rocm (AMD GPU support for Pytorch)
open source
topic: not user facing (topic category)

5 participants