fix(gke): Missing Pathways Quotas in Kueue#5645
Conversation
There was a problem hiding this comment.
Code Review
This pull request parameterizes the enable_pathways_for_tpus setting across multiple GKE TPU blueprints by introducing a global variable. However, feedback indicates that in DWS-based blueprints, a naming mismatch between the defined dws-cluster-queue and the hardcoded cluster-queue in the kubectl-apply module will cause scheduling failures. Additionally, the current shallow merging logic in the kubectl-apply module risks dropping critical configuration fields such as cohort or preemption, suggesting a need for a deep merge implementation.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request standardizes the enable_pathways_for_tpus configuration across several GKE TPU blueprints by introducing a top-level variable and linking it to the relevant modules. It also renames a ClusterQueue in a template and updates the kubectl-apply module to merge all spec fields for ClusterQueue resources. Feedback was provided to extend this merging logic to the metadata field to ensure user-defined labels and annotations are preserved.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the enable_pathways_for_tpus variable across several GKE TPU blueprints to automate CPU node pool provisioning and Kueue quota configuration. It also updates the kubectl-apply module to support merging ClusterQueue metadata and specifications. Review feedback identifies a critical need to ensure all template files are updated to the new cluster-queue name to avoid scheduling failures. Additionally, the Terraform logic in kubectl-apply should be hardened to handle null values during the merge process.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request standardizes the configuration of Pathways for TPUs across multiple GKE blueprints by introducing a top-level enable_pathways_for_tpus variable and wiring it through the gke-cluster and kubectl-apply modules. It also includes a rename of the cluster queue in the DWS flex-start template and updates the kubectl-apply module to better merge ClusterQueue resources. Review feedback highlights a missing variable wiring in the gke-tpu-v6e example and recommends flattening admissionChecks during the merge process to prevent configuration overwrites.
a879901
into
GoogleCloudPlatform:develop
shubpal07
left a comment
There was a problem hiding this comment.
We may need to audit All Templates to ensure kueue-configuration.yaml.tftpl and all other blueprint-specific templates are updated to the new cluster-queue name. For example https://github.com/GoogleCloudPlatform/cluster-toolkit/blob/main/examples/gke-consumption-options/dws-flex-start-queued-provisioning/nccl-jobset-example.yaml#L21 references dws-local-queue
We may also need documentation updates in some readme which references dws-local-queue. Ex- https://github.com/GoogleCloudPlatform/cluster-toolkit/blob/main/examples/gke-consumption-options/dws-flex-start-queued-provisioning/README.md
Thanks for the review, Shubham. To clarify, the name of the LocalQueue remains dws-local-queue across all templates and documentation. This PR specifically standardizes the name of the underlying ClusterQueue to cluster-queue (the resource referenced by the LocalQueue) to enable the seamless merging of Pathways quota. Since job templates and user workloads interact with the LocalQueue name, existing workflows will continue to function without any modifications. |
Problem Summary:
The Issue: When deploying GKE TPU clusters with Pathways enabled (enable_pathways_for_tpus: true), jobs would fail to schedule and remain in a Pending or Suspended state. While the cluster successfully provisioned the necessary cpu-np node pool for Pathways, the Kueue ClusterQueue was only configured to cover google.com/tpu resources, lacking coverage for the cpu and memory requested by the Pathways head pods.
Root Cause: The problem stemmed from missing variable plumbing between blueprint modules and limitations in the ClusterQueue merging logic in the kubectl-apply module:
Solution:To resolve this comprehensively and seamlessly without resorting to hardcoded values in user templates:
Seamless Variable Plumbing:
This allows Cluster Toolkit to automatically wire the flag from gke-cluster to kubectl-apply via the use mechanism, removing the need to manually pass it in the blueprint for kubectl-apply.
Robust ClusterQueue Merging:
Template Alignment:
This fix ensures that Pathways quotas are automatically and safely injected into the user's ClusterQueue across all 8 relevant TPU blueprints.
Submission Checklist
NOTE: Community submissions can take up to 2 weeks to be reviewed.
Please take the following actions before submitting this pull request.