feat(storage): Enable GCS zonal bucket capability with RAPID storage. #5353

Merged: Neelabh94 merged 2 commits into GoogleCloudPlatform:develop from Neelabh94:feature/zonal-bucket on Mar 18, 2026

@Neelabh94

Contributor

This PR adds the following:

  • GCS Rapid Storage and Zonal Placement: Introduced support for the 'RAPID' storage class and 'placement_zones' in the 'cloud-storage-bucket' module, enabling the creation of zonal GCS buckets for high-performance workloads.

  • SLURM Rapid Storage Blueprint: Added a new example blueprint ('rapid-storage-slurm.yaml') demonstrating a SLURM cluster deployment with a GCS Rapid bucket and Anywhere Cache integration for optimized data access.

  • Testing for Zonal Buckets: Implemented new Ansible playbooks and a Kubernetes FIO test job template to validate zonal GCS buckets and their performance on both SLURM and GKE environments.

  • GKE Storage Example Update: Modified the existing 'storage-gke.yaml' example to include a zonal GCS bucket, alongside updates to its daily test configuration to ensure comprehensive validation.

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds support to the Cluster Toolkit for high-performance Google Cloud Storage (GCS) zonal buckets that use the 'RAPID' storage class. It introduces a new blueprint and updates an existing example to show how to deploy and use this storage with SLURM clusters and GKE environments, including Anywhere Cache integration. New daily tests validate the functionality and performance of these storage capabilities.

Highlights

  • GCS Zonal Bucket Support: Enabled the creation of GCS zonal buckets with the 'RAPID' storage class and 'placement_zones' in the 'cloud-storage-bucket' module, allowing for high-performance, low-latency data access.
  • New SLURM Rapid Storage Blueprint: Introduced a new example blueprint, 'rapid-storage-slurm.yaml', demonstrating a SLURM cluster deployment that leverages a GCS Rapid bucket and Anywhere Cache for optimized data access.
  • Enhanced Testing for Zonal Buckets: Implemented new Ansible playbooks and a Kubernetes FIO test job template to validate zonal GCS buckets and their performance across both SLURM and GKE environments.
  • GKE Storage Example Update: Modified the existing 'storage-gke.yaml' example to include a zonal GCS bucket and updated its daily test configuration to ensure comprehensive validation of the new capabilities.
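
The performance check mentioned in the testing highlight can be pictured roughly as follows. This is a minimal Python sketch, not the playbook from this PR: it assumes FIO was run with `--output-format=json` on a recent version that reports `bw_bytes` per job, and the function names, the sample report, and the thresholds are all hypothetical.

```python
import json


def read_bandwidth_mib_s(fio_json: str) -> float:
    """Sum the read bandwidth of all FIO jobs and return it in MiB/s.

    Assumes FIO's JSON report layout, where each entry in "jobs"
    carries read bandwidth in bytes per second under read.bw_bytes.
    """
    report = json.loads(fio_json)
    total_bytes_per_s = sum(job["read"]["bw_bytes"] for job in report["jobs"])
    return total_bytes_per_s / (1024 * 1024)


def meets_threshold(fio_json: str, min_mib_s: float) -> bool:
    """Return True when the measured read bandwidth reaches the threshold."""
    return read_bandwidth_mib_s(fio_json) >= min_mib_s


# Hypothetical sample shaped like an FIO JSON report (values are made up).
SAMPLE = json.dumps(
    {"jobs": [{"jobname": "seqread", "read": {"bw_bytes": 314572800}}]}
)

if __name__ == "__main__":
    print(read_bandwidth_mib_s(SAMPLE))
    print(meets_threshold(SAMPLE, min_mib_s=100))
```

A test harness would fail the run when `meets_threshold` returns False; the real playbooks may measure different metrics or use different limits.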

Changelog

  • examples/README.md
    • Added 'rapid-storage-slurm.yaml' to the examples list.
    • Provided a detailed description for 'rapid-storage-slurm.yaml', highlighting GCS RAPID storage and Anywhere Cache support.
    • Updated the description for 'storage-gke.yaml' to reflect its new showcase of GCS RAPID storage and Anywhere Cache.
  • examples/rapid-storage-slurm.yaml
    • Added a new blueprint for deploying a SLURM cluster with a zonal GCS bucket using the 'RAPID' storage class and Anywhere Cache.
  • examples/storage-gke.yaml
    • Updated the default zone from 'us-central1-c' to 'us-central1-b'.
    • Added a new 'data-bucket-zonal' module configuration for a GCS bucket with 'RAPID' storage class and 'placement_zones'.
  • modules/file-system/cloud-storage-bucket/README.md
    • Documented the new 'placement_zones' input variable.
  • modules/file-system/cloud-storage-bucket/main.tf
    • Implemented a 'dynamic "custom_placement_config"' block to configure data locations based on the 'placement_zones' variable.
  • modules/file-system/cloud-storage-bucket/variables.tf
    • Added 'RAPID' to the allowed 'storage_class' values.
    • Introduced the 'placement_zones' variable with validation to ensure it's used with 'RAPID' or 'REGIONAL' storage classes and within the specified region.
  • tools/cloud-build/daily-tests/ansible_playbooks/test-validation/fio-test-job.yaml.j2
    • Added a new Jinja2 template for a Kubernetes Pod running FIO to test GCS bucket performance.
  • tools/cloud-build/daily-tests/ansible_playbooks/test-validation/test-slurm-rapid-storage.yml
    • Added a new Ansible playbook to validate SLURM rapid storage, including asserting storage class and placement zone, waiting for Anywhere Cache, installing FIO, running tests, and asserting performance thresholds.
  • tools/cloud-build/daily-tests/ansible_playbooks/test-validation/test-zonal-bucket.yml
    • Added a new Ansible playbook to validate GKE zonal buckets, including asserting storage class, rendering and deploying an FIO Kubernetes Pod, waiting for its completion, and asserting performance thresholds.
  • tools/cloud-build/daily-tests/builds/gke-storage.yaml
    • Added 'gcs-rapid' and 'anywhere-cache' tags.
    • Updated the 'zone' for the GKE storage example to 'us-central1-b'.
  • tools/cloud-build/daily-tests/builds/slurm-rapid-storage.yaml
    • Added a new daily test build configuration for the SLURM rapid storage blueprint, including relevant tags and a check for running builds.
  • tools/cloud-build/daily-tests/tests/gke-storage.yml
    • Updated the 'zone' to 'us-central1-b'.
    • Added 'test-validation/test-zonal-bucket.yml' to 'post_deploy_tests'.
    • Added 'zone' to 'cli_deployment_vars'.
  • tools/cloud-build/daily-tests/tests/slurm-rapid-storage.yaml
    • Added a new test configuration for the SLURM rapid storage blueprint, defining deployment variables and post-deploy tests.
  • tools/cloud-build/daily-tests/validate_tests_metadata.py
    • Added 'gcs-rapid' and 'anywhere-cache' to the list of valid tags.
    • Mapped 'slurm-rapid-storage.yaml' build path to its blueprint.
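
The 'placement_zones' validation described for variables.tf above can be restated as a small rule check. The sketch below is a Python illustration only; the actual rule lives in Terraform, and the function name, the error wording, and the assumption that a zone name is its region plus a one-part suffix (for example 'us-central1-b') are hypothetical.

```python
# Storage classes that the changelog says may be combined with placement_zones.
ZONAL_STORAGE_CLASSES = {"RAPID", "REGIONAL"}


def validate_placement_zones(storage_class, region, placement_zones):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    if not placement_zones:
        # No zonal placement requested, so there is nothing to check.
        return errors
    if storage_class not in ZONAL_STORAGE_CLASSES:
        errors.append(
            f"placement_zones requires storage_class RAPID or REGIONAL, "
            f"got {storage_class}"
        )
    for zone in placement_zones:
        # Assumption: dropping the last '-' separated part of a zone gives its region.
        if zone.rsplit("-", 1)[0] != region:
            errors.append(f"zone {zone} is not in region {region}")
    return errors


if __name__ == "__main__":
    print(validate_placement_zones("RAPID", "us-central1", ["us-central1-b"]))
    print(validate_placement_zones("STANDARD", "us-central1", ["us-east1-b"]))
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once, which is a design choice of this sketch and not something the PR states.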

Activity

  • No human activity (comments, reviews, etc.) was detected on this pull request.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces support for GCS zonal buckets with the 'RAPID' storage class. The changes include updating the cloud-storage-bucket Terraform module with a new placement_zones variable and appropriate validation. A new example blueprint rapid-storage-slurm.yaml is added to demonstrate this feature, and the existing storage-gke.yaml example is updated. Comprehensive testing is included through new Ansible playbooks and a Kubernetes FIO test job. The documentation has also been updated accordingly.

My review has identified a couple of areas for improvement: one is a consistency issue in a new example blueprint regarding placeholder variables, and the other is an inconsistency in a new test playbook that could lead to incorrect test validation. Overall, the changes are well-implemented and tested.

Comment thread examples/rapid-storage-slurm.yaml Outdated
@Neelabh94 Neelabh94 added the release-key-new-features label (added to release notes under the "Key New Features" heading) Mar 13, 2026
@Neelabh94 Neelabh94 force-pushed the feature/zonal-bucket branch from 6fa4892 to 2a57f83 Compare March 13, 2026 04:45
@Neelabh94

Contributor Author

@Neelabh94 Neelabh94 marked this pull request as ready for review March 13, 2026 06:42
@Neelabh94 Neelabh94 requested review from a team and samskillman as code owners March 13, 2026 06:42
Comment thread examples/rapid-storage-slurm.yaml
Comment thread tools/cloud-build/daily-tests/builds/slurm-rapid-storage.yaml Outdated
Comment thread examples/rapid-storage-slurm.yaml
@Neelabh94 Neelabh94 merged commit 8c02083 into GoogleCloudPlatform:develop Mar 18, 2026
15 of 77 checks passed
@Neelabh94 Neelabh94 deleted the feature/zonal-bucket branch March 20, 2026 06:41
Labels

release-key-new-features Added to release notes under the "Key New Features" heading.

2 participants