Add Managed Lustre support for non-default ports (GKE compatibility) #4210

Merged: tpdownes merged 1 commit into GoogleCloudPlatform:develop from tpdownes:managed_lustre on Jun 2, 2025

@tpdownes tpdownes commented May 29, 2025 (Contributor)

This commit enables Slurm clusters and VMs to use Managed Lustre instances that were originally configured for compatibility with GKE. Because Lustre and GKE both use port 988 by default, Managed Lustre offers a GKE-compatibility mode that uses port 6988 instead.

---
blueprint_name: test-lustre-port

vars:
  deployment_name: pr-4210
  region: us-central1
  zone: us-central1-c

deployment_groups:
- group: primary
  modules:
  - id: network
    source: modules/network/vpc
  - id: s0
    source: modules/scripts/startup-script
  - id: s1
    source: modules/scripts/startup-script
    settings:
      managed_lustre:
        # removing the explicit true should fail the validation block
        enabled: true
        port: 6998
  - id: vm0
    source: modules/compute/vm-instance
    use:
    - network
    - s0
    settings:
      machine_type: n2-standard-8
      name_prefix: vm0
  - id: vm1
    source: modules/compute/vm-instance
    use:
    - network
    - s1
    settings:
      machine_type: n2-standard-8
      name_prefix: vm1

On vm0, no changes are observed and Ansible is not installed. On vm1, the changes below are observed and Ansible is installed. Both results are working as intended (WAI).

[ext_tpdownes_google_com@vm1-0 modprobe.d]$ cat lnet.conf 
options lnet accept_port=6998
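
The manual `cat` check above could be scripted when testing several VMs. The following is a minimal sketch, not part of this PR: the helper names `lnet_accept_port` and `check_lnet_port` are hypothetical, and the default path `/etc/modprobe.d/lnet.conf` is an assumption based on the `modprobe.d` directory shown in the prompt.

```shell
#!/bin/sh
# Hypothetical helpers for verifying the LNet accept_port setting.
# Neither function is part of the Cluster Toolkit.

# Print the accept_port value found in an lnet modprobe options file.
# Prints nothing if the file has no accept_port option.
lnet_accept_port() {
  conf_file="$1"
  # Match lines such as: options lnet accept_port=6998
  sed -n 's/^options[[:space:]]\{1,\}lnet[[:space:]].*accept_port=\([0-9]\{1,\}\).*$/\1/p' \
    "$conf_file" | tail -n 1
}

# Compare the configured port against an expected value.
# The default file location is an assumption; pass a path to override it.
check_lnet_port() {
  expected="$1"
  conf_file="${2:-/etc/modprobe.d/lnet.conf}"
  actual="$(lnet_accept_port "$conf_file")"
  if [ "$actual" = "$expected" ]; then
    echo "OK: accept_port=$actual"
  else
    echo "MISMATCH: expected $expected, found ${actual:-none}" >&2
    return 1
  fi
}
```

On vm1, `check_lnet_port 6998` would be expected to succeed given the output above. On vm0, where no changes were made, the same call would be expected to report a mismatch.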

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@tpdownes tpdownes force-pushed the managed_lustre branch 3 times, most recently from 444f174 to 1f99f14 Compare June 2, 2025 15:37
@tpdownes tpdownes requested a review from abbas1902 June 2, 2025 15:39
@tpdownes tpdownes marked this pull request as ready for review June 2, 2025 15:40
@tpdownes tpdownes requested review from a team and samskillman as code owners June 2, 2025 15:40
@tpdownes tpdownes added the release-module-improvements label (added to release notes under the "Module Improvements" heading) Jun 2, 2025
@tpdownes tpdownes changed the title from "Managed Lustre Support" to "Add Managed Lustre support for non-default ports (GKE compatibility)" Jun 2, 2025
@tpdownes tpdownes enabled auto-merge June 2, 2025 15:40
@tpdownes tpdownes assigned tpdownes and unassigned abbas1902 Jun 2, 2025
@tpdownes tpdownes assigned abbas1902 and unassigned tpdownes Jun 2, 2025

@abbas1902 abbas1902 left a comment (Collaborator)

LGTM

@tpdownes tpdownes merged commit 84dc068 into GoogleCloudPlatform:develop Jun 2, 2025
13 of 66 checks passed
@abbas1902 abbas1902 assigned tpdownes and unassigned abbas1902 Jun 2, 2025