TagsTagKey and TagsTagValue controllers do not handle ALREADY_EXISTS in multi-cluster setups
Problem Description
When running Config Connector (KCC) across multiple clusters that manage the same GCP project, creating a TagsTagKey or TagsTagValue from a secondary cluster fails with an ALREADY_EXISTS error. This occurs because the resource was already successfully provisioned by the first cluster. Currently, the tag controllers propagate this error instead of acquiring the existing resource.
This is a common friction point in multi-cluster or active-active setups where shared GCP-global resources (like tag keys and values, which are unique per parent project/org) are declared in each cluster's configuration. Cluster 1 succeeds, but clusters 2-N fail with a permanent error:
Error waiting to create TagKey: Error waiting for Creating TagKey:
Error code 6, message: generic::ALREADY_EXISTS: A TagKey with short name 'X'
already exists under parent 'projects/Y'
Current Workaround
The only known workaround is to manually look up the server-generated numeric resourceID from GCP and hardcode it into spec.resourceID across all clusters. This is operationally painful and not easily automated using standard KCC patterns.
Proposed Solution
The TagsTagKey and TagsTagValue controllers should gracefully handle ALREADY_EXISTS errors by acquiring the existing GCP resource instead of failing. This allows KCC to adopt the resource properly if it was already created by another cluster.
When Create operations return an ALREADY_EXISTS error (HTTP 409 or gRPC code 6):
- Catch the error.
- List existing resources under the parent scope.
- Find the existing resource matching the desired shortName (similar to how TagsTagBindingAdapter.Find() works).
- Adopt the resource by setting spec.resourceID and updating the status accordingly.
This follows an established pattern in the KCC codebase (e.g., the AssetFeed controller located at pkg/controller/direct/asset/feed_controller.go), requires no CRD/API changes, and resolves the problem for multi-cluster users transparently.
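The catch, list, match, and adopt steps described above can be sketched roughly as follows. This is a minimal illustration, not the actual KCC controller code: the tagKeyClient interface, the TagKey struct, the ErrAlreadyExists sentinel, and the in-memory fakeClient are all hypothetical stand-ins for the real GCP client and its ALREADY_EXISTS error (gRPC code 6 / HTTP 409). A real implementation would also write the acquired ID back to spec.resourceID and update status.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAlreadyExists is a hypothetical sentinel standing in for the
// ALREADY_EXISTS error (gRPC code 6 / HTTP 409) returned by GCP.
var ErrAlreadyExists = errors.New("ALREADY_EXISTS")

// TagKey is a simplified, hypothetical view of a tag key.
type TagKey struct {
	Name      string // server-generated, e.g. "tagKeys/1"
	Parent    string
	ShortName string
}

// tagKeyClient is a hypothetical interface, not the real KCC or GCP client API.
type tagKeyClient interface {
	Create(parent, shortName string) (*TagKey, error)
	List(parent string) ([]*TagKey, error)
}

// createOrAcquire tries to create the tag key. On ALREADY_EXISTS it lists the
// keys under the parent and returns the one whose short name matches.
func createOrAcquire(c tagKeyClient, parent, shortName string) (*TagKey, error) {
	created, err := c.Create(parent, shortName)
	if err == nil {
		return created, nil
	}
	if !errors.Is(err, ErrAlreadyExists) {
		return nil, err // any other error is still propagated
	}
	existing, listErr := c.List(parent)
	if listErr != nil {
		return nil, fmt.Errorf("listing tag keys under %q: %w", parent, listErr)
	}
	for _, k := range existing {
		if k.ShortName == shortName {
			return k, nil // adopt the resource created elsewhere
		}
	}
	return nil, fmt.Errorf("tag key %q not found under %q after conflict: %w", shortName, parent, err)
}

// fakeClient is an in-memory stand-in for GCP, used only for this sketch.
type fakeClient struct {
	keys   []*TagKey
	nextID int
}

func (f *fakeClient) Create(parent, shortName string) (*TagKey, error) {
	for _, k := range f.keys {
		if k.Parent == parent && k.ShortName == shortName {
			return nil, ErrAlreadyExists
		}
	}
	f.nextID++
	k := &TagKey{Name: fmt.Sprintf("tagKeys/%d", f.nextID), Parent: parent, ShortName: shortName}
	f.keys = append(f.keys, k)
	return k, nil
}

func (f *fakeClient) List(parent string) ([]*TagKey, error) {
	var out []*TagKey
	for _, k := range f.keys {
		if k.Parent == parent {
			out = append(out, k)
		}
	}
	return out, nil
}

func main() {
	gcp := &fakeClient{}
	first, _ := createOrAcquire(gcp, "projects/Y", "X")    // "cluster 1" creates
	second, err := createOrAcquire(gcp, "projects/Y", "X") // "cluster 2" adopts
	fmt.Println(first.Name, second.Name, err)              // prints "tagKeys/1 tagKeys/1 <nil>"
}
```

In this sketch both calls end up with the same server-generated name, which is the value the workaround currently requires users to copy into spec.resourceID by hand. Errors other than ALREADY_EXISTS are still returned unchanged, so the acquire path only triggers on a genuine conflict.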
Context / Related PRs