Make login nodes deployable independently of "controller" #3958

Merged
mr0re1 merged 1 commit into GoogleCloudPlatform:develop from mr0re1:login_sep on Apr 18, 2025

mr0re1 (Collaborator) commented on Apr 17, 2025

Motivation: allow provisioning a login group independently of the "controller monolith".

Changes:

  • Move login node infrastructure provisioning into the community/modules/internal/slurm-gcp/login module;
  • Remove login-specific fields from the stored config.yaml;
  • Don't provision login artifacts (startup scripts) in slurm_files modules;
  • Add a separate "${bucket}/login_group_configs/${name}.yaml" config file for login nodes;
  • Add the metadata key slurm_login_group so that VMs can self-identify;
  • Block login node setup until both the general config and the group-specific config are available.
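The last three changes can be sketched in Python as follows. This is a minimal illustration, not the code from this PR: the function names, the `fetch` callable (which returns `None` while an object is not yet available), the retry parameters, and the assumption that the general config lives at `${bucket}/config.yaml` are all hypothetical. Only the `slurm_login_group` metadata key and the `login_group_configs/${name}.yaml` path come from the description above.

```python
import time
from typing import Callable, Dict, Optional, Tuple

METADATA_KEY = "slurm_login_group"  # metadata key named in this PR


def login_group_config_path(bucket: str, name: str) -> str:
    """Build the group-specific config path described in the PR."""
    return f"{bucket}/login_group_configs/{name}.yaml"


def wait_for_login_configs(
    metadata: Dict[str, str],
    fetch: Callable[[str], Optional[str]],  # hypothetical: returns None if object is missing
    bucket: str,
    attempts: int = 30,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Block until both the general and the group-specific config exist."""
    group = metadata[METADATA_KEY]  # the VM identifies its own login group
    wanted = {
        "general": f"{bucket}/config.yaml",  # assumed location of the general config
        "group": login_group_config_path(bucket, group),
    }
    found: Dict[str, str] = {}
    for _ in range(attempts):
        for key, path in wanted.items():
            if key not in found:
                content = fetch(path)
                if content is not None:
                    found[key] = content
        if len(found) == len(wanted):
            return found["general"], found["group"]
        sleep(delay_s)
    missing = sorted(set(wanted) - set(found))
    raise TimeoutError(f"configs not available: {missing}")
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of any particular storage client and makes the blocking behavior easy to exercise without real delays.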

Breaking change: requires recreating the login node template and instances.

NOTE: This PR could have been made a non-breaking change by:

  • Not creating a separate internal/nodeset module and instead using for_each resource creation within the controller module; this doesn't break down the monolith and makes it harder to reuse in other products;
  • Not adding the slurm_login_group metadata and instead relying on VM name parsing; this is less clean and requires a moderate amount of refactoring of the Python code.
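To illustrate the trade-off in the second bullet, here is a hedged sketch contrasting the two ways a VM could determine its login group. The VM name format `<cluster>-<group>-<suffix>` and both function names are hypothetical, chosen only for illustration; the actual naming scheme is not stated in this PR.

```python
from typing import Dict


def group_from_metadata(metadata: Dict[str, str]) -> str:
    """Approach taken in this PR: read the group from an explicit metadata key."""
    return metadata["slurm_login_group"]


def group_from_name(vm_name: str, cluster_name: str) -> str:
    """Rejected alternative: parse the group out of the VM name.

    Assumes a hypothetical name format "<cluster>-<group>-<suffix>".
    """
    prefix = cluster_name + "-"
    if not vm_name.startswith(prefix):
        raise ValueError(f"{vm_name!r} does not belong to cluster {cluster_name!r}")
    rest = vm_name[len(prefix):]
    # Everything before the last "-" is treated as the group name.
    group, sep, _suffix = rest.rpartition("-")
    if not sep or not group:
        raise ValueError(f"cannot parse login group from {vm_name!r}")
    return group
```

The metadata lookup does not depend on any naming convention, whereas the parser has to be kept in sync with however the VM names are generated.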

mr0re1 requested review from a team and samskillman as code owners April 17, 2025 00:48
mr0re1 self-assigned this Apr 17, 2025
mr0re1 added the do-not-merge (Block merging of this PR) label Apr 17, 2025
mr0re1 force-pushed the login_sep branch 7 times, most recently from aba2ad1 to edfc509 April 18, 2025 17:59
mr0re1 assigned harshthakkar01 and unassigned mr0re1 Apr 18, 2025
mr0re1 added the release-breaking-changes (Prevents "smooth" re-deploy across versions) label and removed the do-not-merge (Block merging of this PR) label Apr 18, 2025
mr0re1 added the do-not-merge (Block merging of this PR) label Apr 18, 2025
mr0re1 assigned mr0re1 and unassigned harshthakkar01 Apr 18, 2025
mr0re1 removed the do-not-merge (Block merging of this PR) label Apr 18, 2025
mr0re1 merged commit 0e9a183 into GoogleCloudPlatform:develop Apr 18, 2025
mr0re1 deleted the login_sep branch April 18, 2025 22:42
Labels

release-breaking-changes Prevents "smooth" re-deploy across versions

4 participants