feat: optional infra setup for inference gateway #5453

Merged
vikramvs-gg merged 1 commit into GoogleCloudPlatform:develop from jessicaochen:plumb-inference-gateway on May 20, 2026

@jessicaochen

Contributor

Allow users to opt into creating the infrastructure required for the inference gateway. This includes setting the GKE cluster's gateway setting to standard (already exists) and creating a proxy-only subnet in the network the cluster uses (new in this PR).

Tested by:

  • (1) Not specifying anything in gke-a4-deployment.yaml and seeing the usual default behavior: no gateway setting and no subnet.
  • (2) Specifying enable_inference_gateway=true in gke-a4-deployment.yaml and seeing both the gateway setting enabled and the subnet created.

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@jessicaochen jessicaochen requested review from a team and samskillman as code owners April 3, 2026 23:27
@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enables users to opt into creating the necessary infrastructure for an inference gateway within the GKE A4 deployment. It achieves this by adding a toggleable variable and enhancing the underlying network module to support conditional subnetwork provisioning, ensuring that resources are only created when explicitly requested.

Highlights

  • Optional Inference Gateway Infrastructure: Introduced a new variable 'enable_inference_gateway' to allow users to optionally provision the infrastructure required for an inference gateway.
  • Conditional Subnet Creation: Updated the VPC module to support an 'enabled' flag for subnetworks, allowing the proxy-only subnet to be created conditionally based on the new variable.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions github-actions Bot added the "external" label (PR from external contributor) Apr 3, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces the ability to conditionally enable subnetworks within the VPC module by adding an enabled attribute to the subnetworks input. It also updates the GKE A4 example to utilize this feature for an inference gateway proxy-only subnet. Feedback was provided to ensure the internal enabled key is stripped from the subnetwork configuration before being passed to underlying modules to prevent potential 'unsupported attribute' errors.

Comment thread modules/network/vpc/main.tf Outdated
@jessicaochen jessicaochen force-pushed the plumb-inference-gateway branch from 4882773 to 899b6e6 Compare April 3, 2026 23:37
@jessicaochen

Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces the ability to conditionally enable subnetworks within the VPC module by adding an optional 'enabled' field to the 'subnetworks' input variable. The logic in 'modules/network/vpc/main.tf' was updated to filter out disabled subnetworks before processing CIDR blocks and resource creation. Additionally, the 'gke-a4' example was updated to include a proxy-only subnet for an inference gateway, which is toggled via a new 'enable_inference_gateway' variable. I have no feedback to provide.
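
For readers following the review, the filtering described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this PR: the attribute names `subnet_name`, `subnet_ip`, and `subnet_region`, the local names, and the use of `optional(bool, true)` (which needs Terraform 1.3 or later) are all assumed here. The sketch keeps only entries whose `enabled` flag is true, then drops the `enabled` key so that downstream modules never receive it.

```hcl
# Hypothetical sketch; names and types are assumptions, not the module's actual code.
variable "subnetworks" {
  description = "Subnetworks to create; entries with enabled = false are skipped."
  type = list(object({
    subnet_name   = string
    subnet_ip     = string
    subnet_region = string
    enabled       = optional(bool, true) # defaults to true when omitted
  }))
  default = []
}

locals {
  # Keep only the subnetworks that are enabled.
  enabled_subnetworks = [for s in var.subnetworks : s if s.enabled]

  # Strip the internal "enabled" key before handing entries to other modules.
  cleaned_subnetworks = [
    for s in local.enabled_subnetworks :
    { for k, v in s : k => v if k != "enabled" }
  ]
}
```

An entry for the proxy-only subnet could then tie its `enabled` value to `enable_inference_gateway`, so the subnet would be created only when the flag is set.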

@vikramvs-gg

Contributor

/gcbrun

@jessicaochen jessicaochen force-pushed the plumb-inference-gateway branch 2 times, most recently from 0bd48fd to 2f14bb1 Compare May 7, 2026 17:14
@jessicaochen

jessicaochen commented May 7, 2026

Contributor Author

Note: in the latest force push, we only set the inference gateway option. The subnet setup now exists only as comments.

Comment thread examples/gke-a4/gke-a4.yaml Outdated
@Neelabh94 Neelabh94 added the "release-key-new-features" label (Added to release notes under the "Key New Features" heading) May 7, 2026
@jessicaochen jessicaochen force-pushed the plumb-inference-gateway branch from 2f14bb1 to 2bad0ed Compare May 7, 2026 17:53
@vikramvs-gg vikramvs-gg self-requested a review May 7, 2026 18:44
@vikramvs-gg

Contributor

/gcbrun

vikramvs-gg previously approved these changes May 7, 2026
Comment thread examples/gke-a4/gke-a4.yaml
Allow user to opt into creating the infrastructure required for
inference gateway. This includes setting the GKE cluster gateway
setting to standard (already exists) & documenting how to create
a proxy-only subnet of the used network (new in this PR).

Tested by:
  * (1) not specifying anything in gke-a4-deployment.yaml and seeing
    the usual default behavior of gateway enabled.
  * (2) specifying enable_inference_gateway=true in
    gke-a4-deployment.yaml and seeing gateway setting enabled.
@vikramvs-gg

Contributor

/gcbrun

@vikramvs-gg vikramvs-gg merged commit a5eaebf into GoogleCloudPlatform:develop May 20, 2026
12 of 83 checks passed
kadupoornima pushed a commit to kadupoornima/cluster-toolkit that referenced this pull request May 25, 2026

Labels

external (PR from external contributor), release-key-new-features (Added to release notes under the "Key New Features" heading)

4 participants