Remove benchmark dependencies from regular ROCm CI images#160047

Closed
huydhn wants to merge 2 commits into pytorch:main from huydhn:fix-torchbench-rocm

@huydhn huydhn commented Aug 7, 2025

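Instead, use a new pytorch-linux-jammy-rocm-n-py3-benchmarks image for the Docker benchmark job. This addresses 2 issues:

* The current ROCm failures in trunk caused by the librosa version that TorchBench pulls in: https://github.com/pytorch/pytorch/actions/runs/16789466749/job/47549950994
* The size of the regular ROCm CI images, which is reduced by removing TorchBench models that are needed only for benchmarking jobs.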

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

Signed-off-by: Huy Do <huydhn@gmail.com>
@huydhn huydhn requested a review from a team as a code owner August 7, 2025 00:40
@huydhn huydhn added the ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 label Aug 7, 2025
@huydhn huydhn requested a review from jeffdaily as a code owner August 7, 2025 00:40
pytorch-bot Bot commented Aug 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160047

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit a2639c9 with merge base 2507ae6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added module: rocm AMD GPU support for Pytorch topic: not user facing topic category labels Aug 7, 2025
@huydhn huydhn requested a review from izaitsevfb August 7, 2025 00:40
Comment thread .ci/docker/build.sh Outdated
@@ -192,6 +192,21 @@ case "$tag" in
UCC_COMMIT=${_UCC_COMMIT}
INDUCTOR_BENCHMARKS=yes

Works, although could be made more concise by having a condition like: if [[ $tag =~ "benchmarks" ]]; to avoid repetition
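
A minimal sketch of the condition suggested above, assuming it is wrapped in a helper function. The function name `inductor_benchmarks_for_tag` is hypothetical and does not come from `.ci/docker/build.sh`, and the non-benchmark tag in the usage lines is assumed for illustration. Only the `[[ $tag =~ "benchmarks" ]]` test and the `INDUCTOR_BENCHMARKS=yes` setting appear in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from .ci/docker/build.sh): reports whether the
# benchmark dependencies should be installed for a given image tag.
inductor_benchmarks_for_tag() {
  local tag="$1"
  # With the right-hand side quoted, bash's =~ does a literal substring match
  # rather than a regular-expression match.
  if [[ "$tag" =~ "benchmarks" ]]; then
    echo "yes"
  else
    echo "no"
  fi
}

# Usage: the first tag is the one named in this PR; the second is assumed.
inductor_benchmarks_for_tag "pytorch-linux-jammy-rocm-n-py3-benchmarks"  # prints "yes"
inductor_benchmarks_for_tag "pytorch-linux-jammy-rocm-n-py3"             # prints "no"
```

In a `case "$tag"` block, a check of this kind could set `INDUCTOR_BENCHMARKS` once after the shared settings, so the two image entries would not need to repeat every variable.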

Signed-off-by: Huy Do <huydhn@gmail.com>

huydhn commented Aug 7, 2025

@pytorchbot merge -f 'ROCm signals looks ok'

@pytorchmergebot

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

jeffdaily added a commit to ROCm/pytorch that referenced this pull request Aug 12, 2025
Follow-up to pytorch#160047 that separated the rocm image into default CI and
benchmarks.
pytorchmergebot pushed a commit that referenced this pull request Aug 12, 2025
Follow-up to #160047 that separated the rocm image into default CI and benchmarks.

Pull Request resolved: #160421
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
chuanhaozhuge pushed a commit that referenced this pull request Aug 14, 2025
chuanhaozhuge pushed a commit that referenced this pull request Aug 18, 2025
can-gaa-hou pushed a commit to can-gaa-hou/pytorch that referenced this pull request Aug 22, 2025
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…0047)

Instead, use a new `pytorch-linux-jammy-rocm-n-py3-benchmarks` image for the Docker benchmark job. This addresses 2 issues:

* The current ROCm failures in trunk caused by the librosa version that TorchBench pulls in: https://github.com/pytorch/pytorch/actions/runs/16789466749/job/47549950994
* The size of the regular ROCm CI images, which is reduced by removing TorchBench models that are needed only for benchmarking jobs.

Pull Request resolved: pytorch#160047
Approved by: https://github.com/malfet, https://github.com/izaitsevfb
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
Labels

* ciflow/rocm-mi300: Trigger "default" config CI on ROCm MI300
* Merged
* module: rocm: AMD GPU support for Pytorch
* test-config/default
* topic: not user facing: topic category
