Description
Is there an existing issue for this?
- I have searched the existing issues
What happened?
After upgrading to prometheus-operator v0.85.0, we see drastically increased CPU usage of prometheus-operator.
Using the metric prometheus_operator_triggered_total{triggered_by="ConfigMap"}, we observe that ConfigMap objects trigger reconciliation in "alertmanager" and "prometheus".
We have about 30 ConfigMap objects in the release namespace, but only a couple of them are related to prometheus-operator. Nevertheless, according to prometheus_operator_triggered_total{triggered_by="ConfigMap"}, the number of triggering ConfigMaps is also about 30.
The attribute watch_referenced_objects_in_all_namespaces is not set (default: false).
The operator is deployed with the following parameters:
- --kubelet-service=kube-system/kube-prometheus-stack-kubelet
- --kubelet-endpoints=true
- --kubelet-endpointslice=false
- --log-format=json
- --log-level=info
- --namespaces=ops,kube-system
- --localhost=127.0.0.1
- --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.85.0
- --config-reloader-cpu-request=5m
- --config-reloader-cpu-limit=0
- --config-reloader-memory-request=50Mi
- --config-reloader-memory-limit=150Mi
- --alertmanager-instance-namespaces=foo
- --alertmanager-config-namespaces=foo
- --prometheus-instance-namespaces=foo
- --thanos-default-base-image=quay.io/thanos/thanos:v0.39.2
- --thanos-ruler-instance-namespaces=foo
- --secret-field-selector=type!=kubernetes.io/dockercfg,type!=kubernetes.io/service-account-token,type!=helm.sh/release.v1
- --web.enable-tls=true
- --web.cert-file=/cert/cert
- --web.key-file=/cert/key
- --web.listen-address=:10250
- --web.tls-min-version=VersionTLS13
Currently, the watched Secret objects can be filtered with secretFieldSelector, as referenced here:
options.FieldSelector = config.SecretListWatchFieldSelector.String()
However there is no such option for the ConfigMaps.
It seems that all the ConfigMap objects in the namespace are watched.
If there is additional configuration that we have overlooked so far and should apply in order to filter the ConfigMap objects, it would be great to get help here.
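To make the kind of filtering we are missing concrete, here is a minimal, standalone sketch of how a selector string like the one passed via --secret-field-selector narrows the set of matching objects. It is an illustration only: it uses the Go standard library, not the operator's code or client-go, and in a real cluster the API server evaluates the selector rather than the client. The helper names (parseSelector, matches) and the Secret names are hypothetical. Note also that ConfigMaps have no type field, so an equivalent option for ConfigMaps would presumably have to select on something else.

```go
package main

import (
	"fmt"
	"strings"
)

// requirement is one term of a field selector, e.g. type!=helm.sh/release.v1.
type requirement struct {
	field  string
	value  string
	negate bool // true for "!=", false for "=" or "=="
}

// parseSelector splits a comma-separated selector into requirements.
// Simplified assumption: no escaping, and values do not contain operators.
func parseSelector(s string) ([]requirement, error) {
	var reqs []requirement
	for _, term := range strings.Split(s, ",") {
		term = strings.TrimSpace(term)
		if term == "" {
			continue
		}
		if f, v, ok := strings.Cut(term, "!="); ok {
			reqs = append(reqs, requirement{f, v, true})
		} else if f, v, ok := strings.Cut(term, "=="); ok {
			reqs = append(reqs, requirement{f, v, false})
		} else if f, v, ok := strings.Cut(term, "="); ok {
			reqs = append(reqs, requirement{f, v, false})
		} else {
			return nil, fmt.Errorf("invalid selector term %q", term)
		}
	}
	return reqs, nil
}

// matches reports whether the object's fields satisfy every requirement.
func matches(reqs []requirement, fields map[string]string) bool {
	for _, r := range reqs {
		equal := fields[r.field] == r.value
		if equal == r.negate {
			return false
		}
	}
	return true
}

func main() {
	// Selector taken from the operator arguments listed above.
	selector := "type!=kubernetes.io/dockercfg,type!=kubernetes.io/service-account-token,type!=helm.sh/release.v1"
	reqs, err := parseSelector(selector)
	if err != nil {
		panic(err)
	}

	// Hypothetical Secrets; only the type field matters for this selector.
	secrets := []struct{ name, typ string }{
		{"alertmanager-main", "Opaque"},
		{"sh.helm.release.v1.stack.v7", "helm.sh/release.v1"},
		{"default-token", "kubernetes.io/service-account-token"},
	}
	for _, s := range secrets {
		watched := matches(reqs, map[string]string{"type": s.typ})
		fmt.Printf("%s (%s): watched=%v\n", s.name, s.typ, watched)
	}
}
```

With a selector like this, Helm release Secrets and service account tokens never reach the operator's informers. Nothing comparable appears to be applied to ConfigMaps, which matches the trigger counts we observe.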
Steps to Reproduce
- deploy prometheus-operator at a version < 0.85.0
- deploy ConfigMaps in the release namespace
- upgrade prometheus-operator to v0.85.0
Expected Result
- reasonable variations in the CPU usage
Actual Result
- drastic increase in the CPU usage (~2m -> ~100m)
Prometheus Operator Version
0.85.0
Kubernetes Version
v1.31.11
Kubernetes Cluster Type
EKS
How did you deploy Prometheus-Operator?
helm chart:prometheus-community/kube-prometheus-stack
Manifests
prometheus-operator log output
no meaningful logs for this issue
Anything else?
No response