Skip to content

Fix missing region#4113

Merged
mr0re1 merged 1 commit into
GoogleCloudPlatform:developfrom
wiktorn:fix_missing_region
May 15, 2025
Merged

Fix missing region#4113
mr0re1 merged 1 commit into
GoogleCloudPlatform:developfrom
wiktorn:fix_missing_region

Conversation

@wiktorn

@wiktorn wiktorn commented May 13, 2025

Copy link
Copy Markdown
Contributor

Fixes error (when using multivpc networks and provider overrides): Error: cannot determine self_link for subnetwork "***":

Cannot determine region: set in this resource, or set provider-level 'region' or 'zone'.

  with module.slurm_controller.module.slurm_nodeset_template["debugnodeset"].module.instance_template.google_compute_instance_template.tpl,
  on modules/embedded/community/modules/internal/slurm-gcp/internal_instance_template/main.tf line 70, in resource "google_compute_instance_template" "tpl":
  70: resource "google_compute_instance_template" "tpl" {

This is a better way to fix this than #3046

Added a validation to variables to prevent this types of errors in other places / future.

This error is present, if provider_overrides are present in the blueprint, such as:

deployment_groups:
- group: primary
  terraform_providers:
    google:
      source: "hashicorp/google"
      version: ">= 6.0.0, != 6.13.0, < 7.0.0"
    google-beta:
      source: "hashicorp/google-beta"
      version: ">= 6.0.0, != 6.13.0, < 7.0.0"

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch #
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines #

Fixes error (when using multivpc networks and provider overrides):
Error: cannot determine self_link for subnetwork "***": Cannot determine region: set in this resource, or set provider-level 'region' or 'zone'.

  with module.slurm_controller.module.slurm_nodeset_template["debugnodeset"].module.instance_template.google_compute_instance_template.tpl,
  on modules/embedded/community/modules/internal/slurm-gcp/internal_instance_template/main.tf line 70, in resource "google_compute_instance_template" "tpl":
  70: resource "google_compute_instance_template" "tpl" {
@wiktorn wiktorn requested review from a team and samskillman as code owners May 13, 2025 08:12

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @wiktorn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses an issue where the region could not be determined when using multi-VPC networks and provider overrides. The fix involves ensuring that the region is always explicitly provided to the google_compute_instance_template resource. This is achieved by making the region variable mandatory in the instance_template and internal_instance_template modules and passing the region value from the partition.tf module.

Highlights

  • Bug Fix: Fixes the 'cannot determine self_link for subnetwork' error when using multi-VPC networks and provider overrides.
  • Variable Validation: Adds validation to the region variable in the instance_template and internal_instance_template modules to prevent similar errors in the future.
  • Module Modification: Modifies the partition.tf module to pass the region value to the slurm_nodeset_template module.

Changelog

  • community/modules/internal/slurm-gcp/instance_template/README.md
    • Updates the documentation to reflect that the region input variable is now required.
  • community/modules/internal/slurm-gcp/instance_template/variables.tf
    • Changes the region variable to be non-nullable, enforcing that a value must be provided.
  • community/modules/internal/slurm-gcp/internal_instance_template/README.md
    • Updates the documentation to reflect that the region input variable is now required.
  • community/modules/internal/slurm-gcp/internal_instance_template/variables.tf
    • Changes the region variable to be non-nullable, enforcing that a value must be provided.
  • community/modules/scheduler/schedmd-slurm-gcp-v6-controller/partition.tf
    • Passes the region value from the partition configuration to the slurm_nodeset_template module.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A region's need,
No longer can it be freed,
Must be defined now.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to fix an error related to determining the region for instance templates, particularly in multi-VPC scenarios. The approach of making the region variable non-nullable in the instance_template and internal_instance_template modules is a good step towards ensuring a region is always defined. The updates to the README files correctly reflect these changes.

However, there's a critical point in how the region is supplied from the schedmd-slurm-gcp-v6-controller module's nodeset configuration that needs addressing.

Summary of Findings

  • Potential Null Value for Non-Nullable Region Variable: In community/modules/scheduler/schedmd-slurm-gcp-v6-controller/partition.tf, the region for slurm_nodeset_template is set to each.value.region. However, each.value.region can be null if not specified in the nodeset configuration, as it's an optional field. This null value would then be passed to the slurm_nodeset_template module where region is defined with nullable = false, causing a Terraform error. This needs to be addressed, for example, by using coalesce(each.value.region, var.region).

Merge Readiness

The pull request makes good progress in addressing the missing region issue by enforcing the region requirement in the template modules. However, there is a critical issue identified in community/modules/scheduler/schedmd-slurm-gcp-v6-controller/partition.tf where a null value could be passed for the region, leading to a Terraform error. This critical issue should be addressed before merging. I am not authorized to approve pull requests, so please ensure further review and approval once the suggested changes are made.

@harshthakkar01 harshthakkar01 added the release-bugfix Added to release notes under the "Bug fixes" heading. label May 13, 2025
@harshthakkar01

Copy link
Copy Markdown
Contributor

/gcbrun

@mr0re1 mr0re1 enabled auto-merge May 15, 2025 18:04
@mr0re1 mr0re1 merged commit 85a4a9b into GoogleCloudPlatform:develop May 15, 2025
13 of 76 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-bugfix Added to release notes under the "Bug fixes" heading.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants