Fix: Slurm scripts incorrectly identifying all accelerators as GPUs #4498

Merged: sharabiani merged 2 commits into GoogleCloudPlatform:develop from sharabiani:gres-fix-accel on Aug 14, 2025
Conversation

@sharabiani

Collaborator

The Slurm scripts, specifically utils.py, treat every accelerator in an instance_template as a GPU. This produces an incorrect gres.conf setup when the accelerator is a TPU.

This PR filters accelerator types by the "nvidia-" prefix, so only accelerators whose type starts with that prefix are counted as GPUs.
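For illustration, the prefix filter could look roughly like the sketch below. This is not the code from utils.py: the `Accelerator` class, the `gpu_accelerators` and `gpu_count` helpers, and the `tpu-example` type name are all hypothetical stand-ins. Only the "nvidia-" prefix check comes from the description above.

```python
from dataclasses import dataclass
from typing import List

# Prefix named in the PR description; everything else here is an assumption.
GPU_TYPE_PREFIX = "nvidia-"


@dataclass
class Accelerator:
    """Hypothetical stand-in for one accelerator entry of an instance template."""
    type: str
    count: int


def gpu_accelerators(accelerators: List[Accelerator]) -> List[Accelerator]:
    """Keep only accelerators whose type starts with the GPU prefix."""
    return [a for a in accelerators if a.type.startswith(GPU_TYPE_PREFIX)]


def gpu_count(accelerators: List[Accelerator]) -> int:
    """Total number of GPUs, ignoring non-GPU accelerators such as TPUs."""
    return sum(a.count for a in gpu_accelerators(accelerators))


if __name__ == "__main__":
    template_accelerators = [
        Accelerator(type="nvidia-tesla-a100", count=4),
        Accelerator(type="tpu-example", count=8),  # hypothetical non-GPU type
    ]
    print(gpu_count(template_accelerators))
```

With a filter like this, a template that carries only non-GPU accelerators yields a GPU count of zero, so no GPU entry would be derived for it.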

@sharabiani sharabiani requested review from a team and samskillman as code owners August 8, 2025 13:39
@sharabiani sharabiani added the release-bugfix Added to release notes under the "Bug fixes" heading. label Aug 8, 2025
@sharabiani sharabiani enabled auto-merge August 12, 2025 07:15
@sharabiani sharabiani merged commit c6a2956 into GoogleCloudPlatform:develop Aug 14, 2025
11 of 62 checks passed
Labels

release-bugfix Added to release notes under the "Bug fixes" heading.

2 participants