Support Compute Endpoint Override for Slurm image building and cluster deployment #5493
Merged
Code Review
This pull request adds support for custom Google Compute API endpoints and gcloud path overrides in the Packer custom image and startup-script modules, updates the googlecompute plugin version, and introduces variables for image licenses and regions. Feedback recommends extending the compute endpoint override to Packer's provisioner logic and using an OR operator in the startup script template to ensure the gcloud wrapper is created if either override is specified.
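The second point of feedback can be sketched in plain shell. This is only an illustration of the condition, not the module's actual template logic: the function name `needs_gcloud_wrapper` is hypothetical, and the two arguments merely stand in for the `gcloud_path_override` and `compute_endpoint_version` inputs described in the PR summary.

```bash
#!/bin/sh
# Hypothetical helper (not from the PR): succeeds when a gcloud wrapper
# should be created.
#   $1: value of gcloud_path_override (may be empty)
#   $2: value of compute_endpoint_version (may be empty)
needs_gcloud_wrapper() {
  # OR semantics: either override on its own is enough to need the wrapper.
  # An AND here would skip the wrapper when only one override is set.
  [ -n "$1" ] || [ -n "$2" ]
}

# Example: only the endpoint version is set, so the wrapper is still needed.
if needs_gcloud_wrapper "" "v1"; then
  echo "create wrapper"
fi
```

With an AND condition, a blueprint that sets only one of the two values would silently get no wrapper, which is the case the review flags.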
6662f78 to 917f607
Merged 576d4fa into GoogleCloudPlatform:develop
53 of 84 checks passed
Summary
This PR enables Cluster Toolkit to provide custom compute endpoint overrides for Slurm image building and cluster deployment.
Packer Upgrade: I upgraded the googlecompute Packer plugin to ~> 1.2.5, as older versions don't support the custom_endpoints variable.
Gcloud override & Compute endpoint Variables: I added gcloud_path_override and compute_endpoint_version as input variables to both the Packer and startup-script modules. This lets us pass these values directly from the blueprint.
Startup Script Wrapper: I added a gcloud wrapper in the startup script to set the custom compute endpoint. This ensures that when ansible-pull runs the slurm-gcp playbook to install Slurm, all underlying gcloud commands correctly use the provided override.
Customizable Licenses: I changed the image_licenses field into a variable so that it can be overridden, since the default production license URL may not be available.
Explicit Region: Introduced an explicit region variable for the googlecompute builder. Without this change, the logic truncated the zone name to infer the region name, which failed for zones that did not follow the region-[a-z] naming convention.
Testing
I verified these changes by building and deploying a4high, a4x, and g4 clusters. I also verified Rocky Linux 8 image building and cluster creation using the hpc-build-slurm-image.yaml blueprint.
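For readers unfamiliar with the wrapper approach from the Startup Script Wrapper item, here is a minimal sketch of what such a wrapper could look like. It is not the code from this PR. It assumes the override is applied through gcloud's `CLOUDSDK_API_ENDPOINT_OVERRIDES_COMPUTE` environment variable (the environment form of the `api_endpoint_overrides/compute` property), and it takes a full endpoint URL because the mapping from `compute_endpoint_version` to a URL is not shown here. The function name `write_gcloud_wrapper` and all paths are hypothetical.

```bash
#!/bin/sh
# Hypothetical sketch (not the PR's implementation): generate a gcloud
# wrapper that sets a Compute API endpoint override before delegating
# to the real gcloud binary.
#   $1: path where the wrapper is written
#   $2: path to the real gcloud binary (e.g. from gcloud_path_override)
#   $3: full Compute API endpoint URL (assumed form of the override)
write_gcloud_wrapper() {
  wrapper_path="$1"
  real_gcloud="$2"
  endpoint_url="$3"
  # Unquoted heredoc: the two variables expand now, while \$@ stays
  # literal so the wrapper forwards its own arguments at run time.
  cat > "$wrapper_path" <<EOF
#!/bin/sh
export CLOUDSDK_API_ENDPOINT_OVERRIDES_COMPUTE="${endpoint_url}"
exec "${real_gcloud}" "\$@"
EOF
  chmod +x "$wrapper_path"
}
```

If the wrapper is placed ahead of the real binary on `PATH`, any tool that shells out to `gcloud`, such as a playbook run by ansible-pull, picks up the override without needing changes of its own.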
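The motivation for the Explicit Region item can also be illustrated with a small sketch. The exact truncation used before this PR is not shown here, so the code assumes it dropped the last two characters of the zone name (the `-a` style suffix). The zone `example1-ab` is a made-up name that breaks the region-[a-z] convention, and both function names are hypothetical.

```bash
#!/bin/sh
# Assumed form of the old inference: drop the final two characters,
# which only works for zones shaped like region-[a-z].
infer_region_from_zone() {
  printf '%s\n' "${1%??}"
}

# Hypothetical resolver: prefer an explicit region, fall back to inference.
#   $1: zone name
#   $2: explicit region (optional)
resolve_region() {
  if [ -n "${2:-}" ]; then
    printf '%s\n' "$2"
  else
    infer_region_from_zone "$1"
  fi
}

infer_region_from_zone "us-central1-a"   # prints us-central1
infer_region_from_zone "example1-ab"     # prints example1- (not a valid region)
resolve_region "example1-ab" "example1"  # prints example1
```

Passing the region explicitly removes the dependency on how zone names happen to be formatted.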