Enabling Spot VM For A3 Mega/High #4634

Merged: LAVEEN merged 1 commit into GoogleCloudPlatform:develop from LAVEEN:a3provision on Sep 18, 2025

Conversation

LAVEEN (Collaborator) commented Sep 9, 2025

Adds a provisioning method that enables Spot VM and DWS Flex support for A3 Mega and A3 High.

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

LAVEEN requested review from a team and samskillman as code owners September 9, 2025 17:54
LAVEEN force-pushed the a3provision branch 2 times, most recently from 1ec7a9a to 1bbd5bb Compare September 10, 2025 12:57
Comment thread examples/machine-learning/a3-megagpu-8g/a3mega-slurm-deployment.yaml Outdated
Comment thread examples/machine-learning/a3-highgpu-8g/README.md Outdated
Comment thread examples/machine-learning/a3-highgpu-8g/ml-slurm-a3-2-cluster.yaml Outdated
LAVEEN force-pushed the a3provision branch 4 times, most recently from 18f0cd9 to 6d42a57 Compare September 17, 2025 00:21
LAVEEN self-assigned this Sep 17, 2025
LAVEEN added the release-improvements label (Added to release notes under the "Improvements" heading) Sep 17, 2025
Comment thread examples/machine-learning/a3-highgpu-8g/README.md Outdated
Comment thread examples/machine-learning/a3-highgpu-8g/README.md Outdated
LAVEEN force-pushed the a3provision branch 3 times, most recently from 1ac865e to a87ef9a Compare September 17, 2025 23:45
abbas1902 previously approved these changes Sep 17, 2025
rachit-google previously approved these changes Sep 18, 2025
LAVEEN dismissed stale reviews from rachit-google and abbas1902 via 59dfc8d September 18, 2025 05:17
LAVEEN force-pushed the a3provision branch 2 times, most recently from 59dfc8d to 3a37b31 Compare September 18, 2025 05:31
LAVEEN requested a review from bytetwin September 18, 2025 08:12
LAVEEN dismissed bytetwin’s stale review September 18, 2025 08:22

The changes have been done

LAVEEN merged commit 5c5fa04 into GoogleCloudPlatform:develop Sep 18, 2025
13 of 63 checks passed
Comment thread examples/machine-learning/a3-highgpu-8g/README.md
Comment thread examples/machine-learning/a3-highgpu-8g/README.md
Labels

release-improvements Added to release notes under the "Improvements" heading.

4 participants