
cherry-pick: release-1.134: feat: enable VPA in cluster mode #6065

Merged
google-oss-prow[bot] merged 8 commits into GoogleCloudPlatform:release-1.134 from xiaoweim:release-1.134.patch
Jan 20, 2026

Conversation

@xiaoweim
Collaborator

@xiaoweim xiaoweim commented Jan 15, 2026

Brief change description

Cherry-pick VPA (customer request) to the 1.134.4 patch version. This PR includes changes in both #5694 and #5784.

WHY do we need this change?

Special notes for your reviewer:

This feature was added after KCC switched to an updated version of controller routing and CCC/CC, so I manually tested the feature on the 1.134 patch by deploying a locally built 1.134 patch version of KCC onto a GKE cluster.

====== Testing ======

  1. Create a GKE cluster for local testing.
$ gcloud container clusters create test-vpa-local --location us-central1
  2. Deploy the locally built KCC.
$ make deploy-kcc-standard
...
namespace/cnrm-system created
serviceaccount/cnrm-controller-manager created
serviceaccount/cnrm-deletiondefender created
serviceaccount/cnrm-resource-stats-recorder created
serviceaccount/cnrm-webhook-manager created
role.rbac.authorization.k8s.io/cnrm-deletiondefender-cnrm-system-role created
role.rbac.authorization.k8s.io/cnrm-webhook-cnrm-system-role created
clusterrole.rbac.authorization.k8s.io/cnrm-admin created
clusterrole.rbac.authorization.k8s.io/cnrm-deletiondefender-role created
clusterrole.rbac.authorization.k8s.io/cnrm-manager-cluster-role created
clusterrole.rbac.authorization.k8s.io/cnrm-manager-ns-role created
clusterrole.rbac.authorization.k8s.io/cnrm-recorder-role created
clusterrole.rbac.authorization.k8s.io/cnrm-viewer created
clusterrole.rbac.authorization.k8s.io/cnrm-webhook-role created
rolebinding.rbac.authorization.k8s.io/cnrm-deletiondefender-role-binding created
rolebinding.rbac.authorization.k8s.io/cnrm-webhook-role-binding created
clusterrolebinding.rbac.authorization.k8s.io/cnrm-admin-binding created
clusterrolebinding.rbac.authorization.k8s.io/cnrm-deletiondefender-binding created
clusterrolebinding.rbac.authorization.k8s.io/cnrm-manager-binding created
clusterrolebinding.rbac.authorization.k8s.io/cnrm-manager-watcher-binding created
clusterrolebinding.rbac.authorization.k8s.io/cnrm-recorder-binding created
clusterrolebinding.rbac.authorization.k8s.io/cnrm-webhook-binding created
service/cnrm-deletiondefender created
service/cnrm-manager created
service/cnrm-resource-stats-recorder-service created
deployment.apps/cnrm-resource-stats-recorder created
deployment.apps/cnrm-webhook-manager created
statefulset.apps/cnrm-controller-manager created
statefulset.apps/cnrm-deletiondefender created
  3. Enable VPA on the cluster and verify the CRD is installed.
$ gcloud container clusters update test-vpa-134 --zone=us-central1 --enable-vertical-pod-autoscaling
Updating test-vpa-134...done.                                                                                                
Updated [https://container.googleapis.com/v1/projects/xiaoweim-gke-dev/zones/us-central1/clusters/test-vpa-134].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1/test-vpa-134?project=xiaoweim-gke-dev

$ kubectl get crd verticalpodautoscalers.autoscaling.k8s.io
NAME                                        CREATED AT
verticalpodautoscalers.autoscaling.k8s.io   2026-01-16T21:17:51Z

Namespace Mode

  1. Set up CC in namespaced mode, create the namespace, and set up the CCC and workload identity.
$ kubectl apply -f configconnector.yaml 
configconnector.core.cnrm.cloud.google.com/configconnector.core.cnrm.cloud.google.com created

$ cat configconnector.yaml
apiVersion: core.cnrm.cloud.google.com/v1beta1
kind: ConfigConnector
metadata:
  # the name is restricted to ensure that there is only one ConfigConnector resource installed in your cluster
  name: configconnector.core.cnrm.cloud.google.com
spec:
  mode: namespaced
  stateIntoSpec: Absent

$ kubectl create ns vpa-ns-test
namespace/vpa-ns-test created

$ kubectl get ns
NAME                              STATUS   AGE
cnrm-system                       Active   4m7s
configconnector-operator-system   Active   4m11s
default                           Active   24m
gke-managed-cim                   Active   22m
gke-managed-system                Active   22m
gke-managed-volumepopulator       Active   22m
gmp-public                        Active   22m
gmp-system                        Active   22m
kube-node-lease                   Active   24m
kube-public                       Active   24m
kube-system                       Active   24m
vpa-ns-test                       Active   6s

$ cat configconnectorcontext.yaml
apiVersion: core.cnrm.cloud.google.com/v1beta1
kind: ConfigConnectorContext
metadata:
  # you need one ConfigConnectorContext per namespace
  name: configconnectorcontext.core.cnrm.cloud.google.com
  namespace: vpa-ns-test-3
spec:
  googleServiceAccount: "test-kcc@xiaoweim-gke-dev.iam.gserviceaccount.com"
  stateIntoSpec: Absent

$ kubectl apply -f configconnectorcontext.yaml
configconnectorcontext.core.cnrm.cloud.google.com/configconnectorcontext.core.cnrm.cloud.google.com created

$ kubectl get configconnectorcontext  configconnectorcontext.core.cnrm.cloud.google.com -n vpa-ns-test
NAME                                                AGE   HEALTHY
configconnectorcontext.core.cnrm.cloud.google.com   35s   true

$ gcloud iam service-accounts add-iam-policy-binding \
    test-kcc@xiaoweim-gke-dev.iam.gserviceaccount.com \
    --member="serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager-vpa-ns-test]" \
    --role="roles/iam.workloadIdentityUser"
Updated IAM policy for serviceAccount [test-kcc@xiaoweim-gke-dev.iam.gserviceaccount.com].
bindings:
- members:
  - serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager-vpa-ns-test-1]
  - serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager-vpa-ns-test-2]
  - serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager-vpa-ns-test-3]
  - serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager-vpa-ns-test]
  - serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager-vpa-ns]
  - serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager-xw-test-1]
  - serviceAccount:xiaoweim-gke-dev.svc.id.goog[cnrm-system/cnrm-controller-manager]
  role: roles/iam.workloadIdentityUser
etag: BwZIh7hAqDw=
version: 1
  2. Record the CPU and memory of the cnrm-controller-manager pod serving the vpa-ns-test namespace.
$ kubectl get pod -n cnrm-system cnrm-controller-manager-kw5thzkjkhbdupi-0 -o jsonpath='{.spec.containers[?(@.name=="manager")].resources}'
map[limits:map[memory:512Mi] requests:map[cpu:100m memory:512Mi]]
  3. Create the NamespacedControllerResource object.
$ cat namespacedcontrollerresource.yaml
apiVersion: customize.core.cnrm.cloud.google.com/v1beta1
kind: NamespacedControllerResource
metadata:
  name: cnrm-controller-manager
  namespace: vpa-ns-test
spec:
  verticalPodAutoscalerMode: Enabled
  containers: []

$ kubectl apply -f namespacedcontrollerresource.yaml 
namespacedcontrollerresource.customize.core.cnrm.cloud.google.com/cnrm-controller-manager created
  4. Verify that the VPA object is created and the status field of the VerticalPodAutoscaler is populated by the VPA recommender.
$ kubectl get vpa -n cnrm-system cnrm-controller-manager-kw5thzkjkhbdupi -oyaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  creationTimestamp: "2026-01-16T21:20:47Z"
  generation: 5
  managedFields:
  - apiVersion: autoscaling.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:targetRef:
          .: {}
          f:apiVersion: {}
          f:kind: {}
          f:name: {}
        f:updatePolicy:
          .: {}
          f:minReplicas: {}
          f:updateMode: {}
    manager: manager
    operation: Update
    time: "2026-01-16T21:20:47Z"
  - apiVersion: autoscaling.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:conditions: {}
        f:recommendation:
          .: {}
          f:containerRecommendations: {}
    manager: vpa-recommender
    operation: Update
    time: "2026-01-16T21:24:26Z"
  name: cnrm-controller-manager-kw5thzkjkhbdupi
  namespace: cnrm-system
  resourceVersion: "1768598666830735002"
  uid: d1973fba-290b-4997-9c11-4eb4944a95ff
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: cnrm-controller-manager-kw5thzkjkhbdupi
  updatePolicy:
    minReplicas: 1
    updateMode: Auto
status:
  conditions:
  - lastTransitionTime: "2026-01-16T21:21:26Z"
    status: "False"
    type: LowConfidence
  - lastTransitionTime: "2026-01-16T21:21:26Z"
    status: "True"
    type: RecommendationProvided
  recommendation:
    containerRecommendations:
    - containerName: manager
      lowerBound:
        cpu: 55m
        memory: "192937984"
      target:
        cpu: 75m
        memory: "234881024"
      uncappedTarget:
        cpu: 75m
        memory: "234881024"
      upperBound:
        cpu: 32130m
        memory: "24367857664"
    - containerName: prom-to-sd
      lowerBound:
        cpu: 1m
        memory: "9437184"
      target:
        cpu: 2m
        memory: "11534336"
      uncappedTarget:
        cpu: 2m
        memory: "11534336"
      upperBound:
        cpu: 955m
        memory: "1104150528"
  5. Verify that the CPU and memory of the cnrm-controller-manager pod are consistent with the recommendation in the status field of the VerticalPodAutoscaler.
$ kubectl get pod -n cnrm-system cnrm-controller-manager-kw5thzkjkhbdupi-0 -o jsonpath='{.spec.containers[?(@.name=="manager")].resources}'
map[limits:map[cpu:70m memory:234881024] requests:map[cpu:70m memory:234881024]]
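The check above compares the applied pod resources against the VPA target by eye (the recommendation reports raw bytes, while `kubectl` prints Mi quantities). A minimal sketch of how that conversion and extraction could be scripted; the helper names and the `sed` pattern are my own, assuming the Go-map-style `jsonpath` output shown above:

```shell
# Hypothetical helpers for comparing a VPA recommendation with pod resources.

# Convert a raw byte value from the VPA recommendation to Mi.
bytes_to_mi() {
  echo $(( $1 / 1024 / 1024 ))
}

# Pull the requested memory out of the Go-map-style string that
# `kubectl get pod ... -o jsonpath='{...resources}'` prints.
requested_memory() {
  echo "$1" | sed -n 's/.*requests:map\[cpu:[^ ]* memory:\([^]]*\)\].*/\1/p'
}

bytes_to_mi 234881024   # 224 (Mi) -- the recommendation's target above
requested_memory 'map[limits:map[memory:512Mi] requests:map[cpu:100m memory:512Mi]]'  # 512Mi
```

This only covers the value formats that appear in this session log; a production check would parse the JSON output directly instead of the Go-map string.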

Cluster Mode

  1. Set up CC in cluster mode.
$ cat configconnector.yaml
apiVersion: core.cnrm.cloud.google.com/v1beta1
kind: ConfigConnector
metadata:
  # the name is restricted to ensure that there is only one ConfigConnector resource installed in your cluster
  name: configconnector.core.cnrm.cloud.google.com
spec:
  mode: cluster
  googleServiceAccount: "test-kcc@xiaoweim-gke-dev.iam.gserviceaccount.com"
  stateIntoSpec: Absent

$ kubectl apply -f configconnector.yaml 
configconnector.core.cnrm.cloud.google.com/configconnector.core.cnrm.cloud.google.com created
  2. Record the CPU and memory of the cnrm-controller-manager pod.
$ kubectl get pod cnrm-controller-manager-0 -n cnrm-system -o jsonpath='{.spec.containers[?(@.name=="manager")].resources}'
map[limits:map[memory:512Mi] requests:map[cpu:100m memory:512Mi]]
  3. Create the ControllerResource object.
$ cat controllerresource.yaml
apiVersion: customize.core.cnrm.cloud.google.com/v1beta1
kind: ControllerResource
metadata:
  name: cnrm-controller-manager
spec:
  verticalPodAutoscalerMode: Enabled
  containers: []

$ kubectl apply -f controllerresource.yaml 
controllerresource.customize.core.cnrm.cloud.google.com/cnrm-controller-manager created
  4. Verify that the VPA object is created and the status field of the VerticalPodAutoscaler is populated by the VPA recommender.
$ kubectl get vpa -n cnrm-system cnrm-controller-manager -oyaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  creationTimestamp: "2026-01-16T21:45:59Z"
  generation: 2
  managedFields:
  - apiVersion: autoscaling.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:targetRef:
          .: {}
          f:apiVersion: {}
          f:kind: {}
          f:name: {}
        f:updatePolicy:
          .: {}
          f:minReplicas: {}
          f:updateMode: {}
    manager: manager
    operation: Update
    time: "2026-01-16T21:45:59Z"
  - apiVersion: autoscaling.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:conditions: {}
        f:recommendation:
          .: {}
          f:containerRecommendations: {}
    manager: vpa-recommender
    operation: Update
    time: "2026-01-16T21:46:26Z"
  name: cnrm-controller-manager
  namespace: cnrm-system
  resourceVersion: "1768599986844095008"
  uid: f21f5da6-91fd-4959-b87f-d9bff8cc9463
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: cnrm-controller-manager
  updatePolicy:
    minReplicas: 1
    updateMode: Auto
status:
  conditions:
  - lastTransitionTime: "2026-01-16T21:46:26Z"
    message: Some containers have a small number of samples
    reason: prom-to-sd,manager
    status: "True"
    type: LowConfidence
  - lastTransitionTime: "2026-01-16T21:46:26Z"
    status: "True"
    type: RecommendationProvided
  recommendation:
    containerRecommendations:
    - containerName: prom-to-sd
      lowerBound:
        cpu: 1m
        memory: "5242880"
      target:
        cpu: 1m
        memory: "11534336"
      uncappedTarget:
        cpu: 1m
        memory: "11534336"
      upperBound:
        cpu: 595m
        memory: "5481955328"
    - containerName: manager
      lowerBound:
        cpu: 20m
        memory: "108003328"
      target:
        cpu: 50m
        memory: "247463936"
      uncappedTarget:
        cpu: 50m
        memory: "247463936"
      upperBound:
        cpu: 24260m
        memory: 126877696k
  5. Verify that the CPU and memory of the cnrm-controller-manager pod are consistent with the recommendation in the status field of the VerticalPodAutoscaler.
$ kubectl get pod cnrm-controller-manager-0 -n cnrm-system -o jsonpath='{.spec.containers[?(@.name=="manager")].resources}'
map[limits:map[cpu:50m memory:247463936] requests:map[cpu:50m memory:247463936]]
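CPU quantities need the same normalization when comparing the VPA target against the pod's applied request, since Kubernetes may report either millicores ("50m") or whole cores ("2"). A small sketch under that assumption; the helper name is hypothetical and not part of this PR:

```shell
# Hypothetical helper: normalize a Kubernetes CPU quantity ("50m" or "2")
# to an integer number of millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;          # already in millicores
    *)  echo $(( $1 * 1000 )) ;;  # whole cores
  esac
}

cpu_to_millicores 50m   # 50 -- matches both the target and the applied request above
cpu_to_millicores 2     # 2000
```

This handles only the integer forms seen in this log, not fractional quantities like "0.5".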

====== Testing Done ======

Does this PR add something which needs to be 'release noted'?


  • Reviewer reviewed release note.

Additional documentation e.g., references, usage docs, etc.:


Intended Milestone

Please indicate the intended milestone.

  • Reviewer tagged PR with the actual milestone.

Tests you have done

  • Run make ready-pr to ensure this PR is ready for review.
  • Perform necessary E2E testing for changed resources.

@google-cla

google-cla bot commented Jan 15, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@xiaoweim xiaoweim changed the base branch from master to release-1.134 January 15, 2026 22:39
@xiaoweim xiaoweim marked this pull request as ready for review January 15, 2026 22:39
@xiaoweim xiaoweim added this to the 1.134.patch milestone Jan 15, 2026
@xiaoweim
Collaborator Author

xiaoweim commented Jan 15, 2026

@xiaoweim xiaoweim force-pushed the release-1.134.patch branch from e2db00c to d703588 Compare January 15, 2026 22:53
@xiaoweim xiaoweim force-pushed the release-1.134.patch branch from 630aa8a to 8d9e39a Compare January 16, 2026 21:59
@cheftako
Collaborator

/lgtm
/approve

@google-oss-prow
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheftako

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit aeb69db into GoogleCloudPlatform:release-1.134 Jan 20, 2026
310 of 312 checks passed