Added capacity checks for reservations #4372

Merged
parulbajaj01 merged 1 commit into GoogleCloudPlatform:develop from PayalJakhar:validations on Jul 29, 2025

Conversation

@PayalJakhar PayalJakhar commented Jul 9, 2025

Contributor

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

What this change does

This change adds a pre-validation check that confirms a specific machine reservation has enough available capacity before we try to create the cluster's nodes.

Why this is important

Right now, if you ask for more machines than are available in a reservation, the cluster setup process only fails much later, when it tries to actually create those machines. This is a problem because:

  • It wastes time and computing power trying to get machines that aren't there.
  • You don't find out about the capacity problem right away.

How it was implemented

I added validation for reservation affinity to the google_container_node_pool resource block in the form of two new precondition blocks: one ensures that a single reservation is specified when required, and the other verifies that the requested node count does not exceed the available capacity of that reservation.
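
The two preconditions described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the merged code: the variable names, the `specific` data source label, the `available_capacity` local, the pool name, and the wording of the first error message are all hypothetical. It assumes the `google_compute_reservation` data source exposes `specific_reservation[0].count` and `specific_reservation[0].in_use_count`, and that skipping the lookup when no nodes are requested is an acceptable way to tolerate a reservation that does not exist yet.

```hcl
# Hypothetical inputs; these names are illustrative, not the module's actual variables.
variable "project_id" { type = string }
variable "zone" { type = string }
variable "cluster_id" { type = string }
variable "static_node_count" { type = number }
variable "reservation_names" { type = list(string) }

# Look up each named reservation at plan time. The lookup is skipped when no
# nodes are requested, so a not-yet-created reservation does not fail the plan.
data "google_compute_reservation" "specific" {
  for_each = toset(var.static_node_count > 0 ? var.reservation_names : [])
  name     = each.value
  zone     = var.zone
  project  = var.project_id
}

locals {
  # Unused capacity = reserved VM count minus VMs already in use.
  # The leading 0 keeps sum() valid when no reservation was looked up.
  available_capacity = sum(concat([0], [
    for r in data.google_compute_reservation.specific :
    r.specific_reservation[0].count - r.specific_reservation[0].in_use_count
  ]))
}

resource "google_container_node_pool" "pool" {
  name       = "example-pool"
  cluster    = var.cluster_id
  node_count = var.static_node_count

  node_config {
    reservation_affinity {
      consume_reservation_type = "SPECIFIC_RESERVATION"
      key                      = "compute.googleapis.com/reservation-name"
      values                   = var.reservation_names
    }
  }

  lifecycle {
    # Precondition 1: exactly one reservation must be named.
    precondition {
      condition     = length(var.reservation_names) == 1
      error_message = "Exactly one reservation must be specified for a specific reservation affinity."
    }

    # Precondition 2: the requested node count must fit in the unused capacity.
    precondition {
      condition     = var.static_node_count <= local.available_capacity
      error_message = "Requested static_node_count (${var.static_node_count}) exceeds the available reservation capacity (${local.available_capacity})."
    }
  }
}
```

Because the data source is read during planning, a failing precondition would surface in `terraform plan` rather than partway through node creation.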

@PayalJakhar PayalJakhar requested review from a team and samskillman as code owners July 9, 2025 19:16
@PayalJakhar PayalJakhar changed the title from "added validation conditions for capacity checks in reservations" to "Added capacity checks for reservations" Jul 9, 2025
6 comment threads on pkg/validators/reservation.go (outdated)
@PayalJakhar PayalJakhar force-pushed the validations branch 2 times, most recently from ccae270 to 4605f3d July 14, 2025 09:02

@SwarnaBharathiMantena SwarnaBharathiMantena left a comment

Contributor

Please review this PR: #4057
This is a feature that lets a user specify a reservation that does not exist when the node count is 0.

Please update this validator to not conflict with this feature.

Also, is there a reason for adding this validator here instead of in the module, under preconditions?

Comment thread pkg/validators/reservation.go Outdated
Comment thread pkg/validators/validators.go Outdated
@parulbajaj01

Contributor

@PayalJakhar Let's maintain the existing structure and validate this condition using a preconditions block. We can leverage Terraform's data sources to check the available reservation capacity during the terraform plan phase.
Please explore this approach.

@parulbajaj01 parulbajaj01 left a comment

Contributor

Can you please mention how you tested the changes?

@PayalJakhar PayalJakhar commented Jul 16, 2025

Contributor Author

I tried deploying a gke-a3-mega cluster with static_node_count set higher than the available capacity of the specified reservation, which resulted in the following error:
Requested static_node_count (8) exceeds the available reservation capacity (0).

@parulbajaj01 parulbajaj01 added the release-improvements Added to release notes under the "Improvements" heading. label Jul 24, 2025
2 comment threads on modules/compute/gke-node-pool/main.tf (outdated)
@parulbajaj01

Contributor

/gcbrun

@parulbajaj01

Contributor

/gcbrun

@parulbajaj01

Contributor

/gcbrun

Comment thread modules/compute/gke-node-pool/main.tf
Comment thread modules/compute/gke-node-pool/main.tf Outdated
@SwarnaBharathiMantena

Contributor

/gcbrun

@SwarnaBharathiMantena SwarnaBharathiMantena self-requested a review July 29, 2025 06:01

@SwarnaBharathiMantena SwarnaBharathiMantena left a comment

Contributor

LGTM

@parulbajaj01 parulbajaj01 merged commit c0755fb into GoogleCloudPlatform:develop Jul 29, 2025
18 of 67 checks passed
Labels

release-improvements Added to release notes under the "Improvements" heading.

5 participants