A4 Slurm: enable sudo in Slurm jobs for users with OS Admin Login role #3961

Merged: tpdownes merged 1 commit into GoogleCloudPlatform:develop from tpdownes:feat_add_oslogin_sudo_a4_slurm on Apr 17, 2025

@tpdownes

Contributor

This PR enables passwordless sudo within Slurm jobs on all compute nodes. The feature is restricted to users with the OS Admin Login IAM role (roles/compute.osAdminLogin). Several A3 blueprints already enable this feature, but it is missing from our A4 Slurm solution.

Ref: https://cloud.google.com/iam/docs/understanding-roles#compute.osAdminLogin
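For readers who want to try the feature, the following is a minimal sketch, not part of this PR. It assumes a hypothetical project ID and user email, and it only prints the commands rather than running them. The first command would grant the OS Admin Login role at the project level; the second would check from inside a Slurm job that sudo works without a password prompt (`sudo -n` fails instead of prompting).

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own project and user.
PROJECT_ID="my-project"
USER_EMAIL="alice@example.com"

# Command that would bind the OS Admin Login role to the user.
GRANT_CMD="gcloud projects add-iam-policy-binding ${PROJECT_ID} --member=user:${USER_EMAIL} --role=roles/compute.osAdminLogin"

# Command that would verify passwordless sudo from within a one-node Slurm job.
# 'sudo -n' is non-interactive: it exits with an error rather than asking for a password.
CHECK_CMD="srun --nodes=1 sudo -n true"

# Print the commands instead of executing them, so this sketch has no side effects.
echo "${GRANT_CMD}"
echo "${CHECK_CMD}"
```

Printing first makes it easy to review the exact role and member before changing IAM policy. Whether the role should be granted at the project level or more narrowly is a choice for your own environment.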

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@tpdownes added the release-improvements label (Added to release notes under the "Improvements" heading.) Apr 17, 2025
@tpdownes requested a review from samskillman April 17, 2025 15:19

@samskillman left a comment

Collaborator

LGTM. I like that it works on all partitions, not just the GPU partition, which I think was the solution on A3H/M.

@samskillman

Collaborator

I have verified that this solution allows me (upon the first job submitted to a node) to use sudo from within a job on the A4 compute nodes without any issue.

@tpdownes marked this pull request as ready for review April 17, 2025 16:22
@tpdownes requested a review from a team as a code owner April 17, 2025 16:22
@tpdownes merged commit 1a588bf into GoogleCloudPlatform:develop Apr 17, 2025
@tpdownes deleted the feat_add_oslogin_sudo_a4_slurm branch April 17, 2025 16:22