Add kubectl provider in root module for blueprint with GKE cluster module setup #3406
ankitkinra
left a comment
Thanks for testing this out thoroughly. Can you please add whether you performed the following tests:
Example blueprint with PV (storage)
Case 1
- First deploy of the blueprint on the develop branch (without the changes) with PV
- Switch branch and run make
- Second deploy of the blueprint without any changes, which would typically not result in any diff
Result: Did it cause recreation of kubectl-created modules? Specifically, if the blueprint had storage, did it cause data loss?
Case 2
- First deploy of the blueprint on the develop branch (without the changes) with PV
- Switch branch and run make
- Second deploy of the blueprint with only a nodepool scale-up, which would typically not result in recreation of any module
Result: Did it cause recreation of kubectl-created modules? Specifically, if the blueprint had storage, did it cause data loss?
Case 3
- First deploy of the blueprint on the develop branch (without the changes) with PV
- Switch branch and run make
- Second deploy of the blueprint without any changes, which would typically not result in any diff
- Third deploy of the same blueprint after completion of the second deploy
Result: Did the third deploy of the blueprint cause any updates or loss of data?
Thanks for adding a detailed description of the test cases to consider. Following are the results from the tests, which used the storage-gke.yaml example blueprint.
Case 1: No diff observed (except for local file output of …)
Case 2: Same behavior observed as for Case 1
Case 3: Same behavior observed as for Case 1
sharabiani
left a comment
LGTM
if @tpdownes @ankitkinra agree with the limit of only one gke-cluster per deployment group: #3406 (comment)
…cker default terraform version 1.5.2 -> 1.5.7
Why do we need this change?
When managing Kubernetes manifests with multiple blueprints using the kubectl-apply module, completely removing a kubectl-apply block from a blueprint results in an "Error: Provider configuration not present" error.
A similar issue is observed for PersistentVolume and PersistentVolumeClaim resources when they are provisioned with the gke-persistent-volume module, which internally uses the kubectl-apply module to deploy Kubernetes manifests.
Removing workload_manager_config or persistent_volume from the blueprint, then recreating and redeploying the deployment, reproduces this error.
RCA
The issue stems from how the provider is defined. Unlike other Terraform providers which are typically defined at the root level, the kubectl-apply provider is defined within the modules/management/kubectl-apply/providers.tf file inside the module itself.
This becomes problematic when deleting a blueprint configuration that uses this module. While the module's folder remains, Terraform effectively loses track of it and its associated provider. Consequently, Terraform can't determine the correct provider (kubectl_apply_manifests) to use when attempting to delete the resources associated with the removed module.
This behavior is due to Terraform's handling of child modules and their providers. You can find a more detailed explanation in this Stack Overflow answer: https://stackoverflow.com/a/58403262
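The difference between the two layouts can be sketched as follows. This is an illustrative fragment, not the module's actual code: it assumes the gavinbunney/kubectl provider, and the variable names, module name, and local source path are hypothetical. Only the path modules/management/kubectl-apply/providers.tf comes from the description above.

```hcl
# --- Layout A: provider configured inside the child module ---
# File: modules/management/kubectl-apply/providers.tf (assumed shape)
provider "kubectl" {
  host                   = var.cluster_endpoint       # hypothetical variable
  cluster_ca_certificate = var.cluster_ca_certificate # hypothetical variable
  token                  = var.access_token           # hypothetical variable
  load_config_file       = false
}

# File: main.tf of the deployment group (hypothetical)
module "kubectl_apply" {
  source = "./modules/kubectl-apply" # hypothetical local path
}
# Deleting the module block above also removes the only "kubectl" provider
# configuration from the configuration tree, while the resources the module
# created are still recorded in state. Terraform then has no provider to
# destroy them with and reports "Provider configuration not present".

# --- Layout B: provider configured in the root module ---
# File: main.tf of the deployment group (hypothetical)
provider "kubectl" {
  host                   = var.cluster_endpoint       # hypothetical variable
  cluster_ca_certificate = var.cluster_ca_certificate # hypothetical variable
  token                  = var.access_token           # hypothetical variable
  load_config_file       = false
}

module "kubectl_apply" {
  source = "./modules/kubectl-apply" # hypothetical local path
  # No provider block inside the module: it inherits the root's default
  # "kubectl" configuration, so deleting this module block later still
  # leaves a provider that Terraform can use to destroy its resources.
}
```

Layouts A and B are alternatives and would not appear in the same configuration; they are shown together only for comparison.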
Solution
Move the kubectl provider to the root module (under the TerraformProviders block) of the blueprint group if the group contains a module dependent on the gke-cluster or pre-existing-gke-cluster module.
Backward Compatibility Test
Tested this on the example blueprints (storage-gke.yaml, gke-a3-highgpu.yaml). The steps were as follows:
- … make to update gcluster binary.
- … --force based on user discretion.
- … --force flag and did the deployment using feature branch and it worked as expected.
- … kubectl-apply module related resource were required.
How did we test this out?
- … OK status
Submission Checklist
NOTE: Community submissions can take up to 2 weeks to be reviewed.
Please take the following actions before submitting this pull request.