fix: use update-in-place for PrometheusRule ConfigMaps#8290
Merged
simonpasquier merged 1 commit intoprometheus-operator:mainfrom Jan 19, 2026
Merged
Conversation
Signed-off-by: Sanchit2662 <sanchit2662@gmail.com>
Contributor
Author
|
Hi @simonpasquier , |
simonpasquier
approved these changes
Jan 19, 2026
Contributor
simonpasquier
left a comment
There was a problem hiding this comment.
Thanks!
The process isn't 100% atomic and safe (e.g. the config-reloader might see a in-progress update of the configmaps or a to-be-deleted configmap) but I don't see a (simple) way to make it better.
alexlebens
pushed a commit
to alexlebens/infrastructure
that referenced
this pull request
Feb 6, 2026
…r to v0.89.0 (#3775) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [prometheus-operator/prometheus-operator](https://github.com/prometheus-operator/prometheus-operator) | minor | `v0.88.1` → `v0.89.0` | --- ### Release Notes <details> <summary>prometheus-operator/prometheus-operator (prometheus-operator/prometheus-operator)</summary> ### [`v0.89.0`](https://github.com/prometheus-operator/prometheus-operator/releases/tag/v0.89.0): 0.89.0 / 2026-02-05 [Compare Source](prometheus-operator/prometheus-operator@v0.88.1...v0.89.0) - \[ENHANCEMENT] Add `hostNetwork` field to the `Alertmanager` CRD. [#​8281](prometheus-operator/prometheus-operator#8281) - \[ENHANCEMENT] Add the `crds` and `full-crds` commands to the operator's binary. [#​8251](prometheus-operator/prometheus-operator#8251) - \[ENHANCEMENT] Report deprecated field usage in the `Reconciled` condition type. [#​8236](prometheus-operator/prometheus-operator#8236) - \[ENHANCEMENT] Avoid unnecessary reconciliation upon creation of the `ThanosRuler` StatefulSet. [#​8347](prometheus-operator/prometheus-operator#8347) - \[ENHANCEMENT] Add `bodySizeLimit` to the ScrapeConfig CRD. [#​8348](prometheus-operator/prometheus-operator#8348) - \[ENHANCEMENT] Support `http_headers` field in the Alertmanager Secret. [#​8357](prometheus-operator/prometheus-operator#8357) - \[ENHANCEMENT] Add the `-kubelet-http-metrics` flag to enable/disable the HTTP metrics port in the Kubelet endpoint (default=enabled). [#​8350](prometheus-operator/prometheus-operator#8350) - \[ENHANCEMENT] Include `operator.prometheus.io/version` annotation in the full version of CRDs. [#​8279](prometheus-operator/prometheus-operator#8279) - \[BUGFIX] Validate VictorOps global configuration in the `Alertmanager` CRD. [#​8020](prometheus-operator/prometheus-operator#8020) - \[BUGFIX] Validate Jira global configuration in the `Alertmanager` CRD. [#​8265](prometheus-operator/prometheus-operator#8265) - \[BUGFIX] Validate VictorOps receiver's URL in the `AlertmanagerConfig` CRD. [#​8258](prometheus-operator/prometheus-operator#8258) - \[BUGFIX] Validate Webex receiver's URL in the `AlertmanagerConfig` CRD. [#​8255](prometheus-operator/prometheus-operator#8255) - \[BUGFIX] Validate Jira receiver's URL configuration in the `AlertmanagerConfig` CRD. [#​8230](prometheus-operator/prometheus-operator#8230) - \[BUGFIX] Validate OpsGenie receiver configuration in the `AlertmanagerConfig` CRD. [#​8267](prometheus-operator/prometheus-operator#8267) - \[BUGFIX] Validate WeChat receiver configuration in the `AlertmanagerConfig` CRD. [#​8271](prometheus-operator/prometheus-operator#8271) - \[BUGFIX] Validate SNS receiver configuration in the `AlertmanagerConfig` CRD. [#​8217](prometheus-operator/prometheus-operator#8217) - \[BUGFIX] Validate Webex global configuration in the `Alertmanager` CRD. [#​7979](prometheus-operator/prometheus-operator#7979) - \[BUGFIX] Validate Telegram global configuration in the `Alertmanager` CRD. [#​8268](prometheus-operator/prometheus-operator#8268) - \[BUGFIX] Restore statefulset's labels if the creation fails with AlreadyExists. [#​8343](prometheus-operator/prometheus-operator#8343) - \[BUGFIX] Fix potential panic due to informer cache races. [#​8310](prometheus-operator/prometheus-operator#8310) - \[BUGFIX] Support probers defined with IPv6 addresses in the `Probe` CRD. [#​8354](prometheus-operator/prometheus-operator#8354) - \[BUGFIX] Prevent group and repeat intervals with zero duration from breaking Alertmanager. [#​8126](prometheus-operator/prometheus-operator#8126) - \[BUGFIX] Propagate all supported RocketChat attributes for `AlertmanagerConfig` CRD. [#​8016](prometheus-operator/prometheus-operator#8016) - \[BUGFIX] Add URL validation for WeChat receiver. [#​8256](prometheus-operator/prometheus-operator#8256) - \[BUGFIX] Add URL validation for SNS receiver. [#​8259](prometheus-operator/prometheus-operator#8259) - \[BUGFIX] Fix GCE service discovery for the `ScrapeConfig` CRD. [#​8284](prometheus-operator/prometheus-operator#8284) - \[BUGFIX] Avoid stale conditions in `Alertmanager`, `ThanosRuler`, `Prometheus` and `PrometheusAgent` resources. [#​8304](prometheus-operator/prometheus-operator#8304) - \[BUGFIX] Fix race condition when updating rule ConfigMaps. [#​8290](prometheus-operator/prometheus-operator#8290) - \[BUGFIX] Fix race condition when patching finalizers. [#​8323](prometheus-operator/prometheus-operator#8323) - \[BUGFIX] Reconcile `ScrapeConfig` resources when namespace selection changes. [#​8334](prometheus-operator/prometheus-operator#8334) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4zLjYiLCJ1cGRhdGVkSW5WZXIiOiI0My4zLjYiLCJ0YXJnZXRCcmFuY2giOiJtYWluIiwibGFiZWxzIjpbImltYWdlIl19--> Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/3775 Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net> Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
renovate bot
added a commit
to sdwilsh/ansible-playbooks
that referenced
this pull request
Feb 21, 2026
…r to v0.89.0 ##### [\`v0.89.0\`](https://github.com/prometheus-operator/prometheus-operator/releases/tag/v0.89.0) - \[ENHANCEMENT] Add `hostNetwork` field to the `Alertmanager` CRD. [#8281](prometheus-operator/prometheus-operator#8281) - \[ENHANCEMENT] Add the `crds` and `full-crds` commands to the operator's binary. [#8251](prometheus-operator/prometheus-operator#8251) - \[ENHANCEMENT] Report deprecated field usage in the `Reconciled` condition type. [#8236](prometheus-operator/prometheus-operator#8236) - \[ENHANCEMENT] Avoid unnecessary reconciliation upon creation of the `ThanosRuler` StatefulSet. [#8347](prometheus-operator/prometheus-operator#8347) - \[ENHANCEMENT] Add `bodySizeLimit` to the ScrapeConfig CRD. [#8348](prometheus-operator/prometheus-operator#8348) - \[ENHANCEMENT] Support `http_headers` field in the Alertmanager Secret. [#8357](prometheus-operator/prometheus-operator#8357) - \[ENHANCEMENT] Add the `-kubelet-http-metrics` flag to enable/disable the HTTP metrics port in the Kubelet endpoint (default=enabled). [#8350](prometheus-operator/prometheus-operator#8350) - \[ENHANCEMENT] Include `operator.prometheus.io/version` annotation in the full version of CRDs. [#8279](prometheus-operator/prometheus-operator#8279) - \[BUGFIX] Validate VictorOps global configuration in the `Alertmanager` CRD. [#8020](prometheus-operator/prometheus-operator#8020) - \[BUGFIX] Validate Jira global configuration in the `Alertmanager` CRD. [#8265](prometheus-operator/prometheus-operator#8265) - \[BUGFIX] Validate VictorOps receiver's URL in the `AlertmanagerConfig` CRD. [#8258](prometheus-operator/prometheus-operator#8258) - \[BUGFIX] Validate Webex receiver's URL in the `AlertmanagerConfig` CRD. [#8255](prometheus-operator/prometheus-operator#8255) - \[BUGFIX] Validate Jira receiver's URL configuration in the `AlertmanagerConfig` CRD. [#8230](prometheus-operator/prometheus-operator#8230) - \[BUGFIX] Validate OpsGenie receiver configuration in the `AlertmanagerConfig` CRD. [#8267](prometheus-operator/prometheus-operator#8267) - \[BUGFIX] Validate WeChat receiver configuration in the `AlertmanagerConfig` CRD. [#8271](prometheus-operator/prometheus-operator#8271) - \[BUGFIX] Validate SNS receiver configuration in the `AlertmanagerConfig` CRD. [#8217](prometheus-operator/prometheus-operator#8217) - \[BUGFIX] Validate Webex global configuration in the `Alertmanager` CRD. [#7979](prometheus-operator/prometheus-operator#7979) - \[BUGFIX] Validate Telegram global configuration in the `Alertmanager` CRD. [#8268](prometheus-operator/prometheus-operator#8268) - \[BUGFIX] Restore statefulset's labels if the creation fails with AlreadyExists. [#8343](prometheus-operator/prometheus-operator#8343) - \[BUGFIX] Fix potential panic due to informer cache races. [#8310](prometheus-operator/prometheus-operator#8310) - \[BUGFIX] Support probers defined with IPv6 addresses in the `Probe` CRD. [#8354](prometheus-operator/prometheus-operator#8354) - \[BUGFIX] Prevent group and repeat intervals with zero duration from breaking Alertmanager. [#8126](prometheus-operator/prometheus-operator#8126) - \[BUGFIX] Propagate all supported RocketChat attributes for `AlertmanagerConfig` CRD. [#8016](prometheus-operator/prometheus-operator#8016) - \[BUGFIX] Add URL validation for WeChat receiver. [#8256](prometheus-operator/prometheus-operator#8256) - \[BUGFIX] Add URL validation for SNS receiver. [#8259](prometheus-operator/prometheus-operator#8259) - \[BUGFIX] Fix GCE service discovery for the `ScrapeConfig` CRD. [#8284](prometheus-operator/prometheus-operator#8284) - \[BUGFIX] Avoid stale conditions in `Alertmanager`, `ThanosRuler`, `Prometheus` and `PrometheusAgent` resources. [#8304](prometheus-operator/prometheus-operator#8304) - \[BUGFIX] Fix race condition when updating rule ConfigMaps. [#8290](prometheus-operator/prometheus-operator#8290) - \[BUGFIX] Fix race condition when patching finalizers. [#8323](prometheus-operator/prometheus-operator#8323) - \[BUGFIX] Reconcile `ScrapeConfig` resources when namespace selection changes. [#8334](prometheus-operator/prometheus-operator#8334) --- ##### [\`v0.88.1\`](https://github.com/prometheus-operator/prometheus-operator/releases/tag/v0.88.1) - \[BUGFIX] Validate `webhookURL` secret for `MSTeams` receiver in `AlertmanagerConfig` CRD. [#8294](prometheus-operator/prometheus-operator#8294) - \[BUGFIX] Revert maximum version check for `EC2/Lightsail` SD in `ScrapeConfig` CRD. [#8308](prometheus-operator/prometheus-operator#8308) - \[BUGFIX] Relax URL validation in `Slack` receiver in AlertmanagerConfig CRD to support Go templates. [#8299](prometheus-operator/prometheus-operator#8299) [#8331](prometheus-operator/prometheus-operator#8331) - \[BUGFIX] Relax URL validation in `PagerDuty` in AlertmanagerConfig CRD to support Go templates. [#8319](prometheus-operator/prometheus-operator#8319) - \[BUGFIX] Relax URL validation in `WebhookConfig` in AlertmanagerConfig CRD to support Go templates. [#8307](prometheus-operator/prometheus-operator#8307) [#8317](prometheus-operator/prometheus-operator#8317) - \[BUGFIX] Relax URL validation in `RocketChat` receiver in AlertmanagerConfig CRD to support Go templates. [#8318](prometheus-operator/prometheus-operator#8318) - \[BUGFIX] Relax URL validation in `Pushover` receiver in AlertmanagerConfig CRD to support Go templates. [#8307](prometheus-operator/prometheus-operator#8307) [#8316](prometheus-operator/prometheus-operator#8316)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This PR fixes a non-atomic reconciliation pattern in
PrometheusRuleSyncer.Sync()that could cause temporary and silent loss of alerting and recording rules during PrometheusRule updates.Previously, the operator deleted all existing rule ConfigMaps before recreating them. This introduced a time window where no rule ConfigMaps existed, allowing the config-reloader sidecar to trigger a Prometheus reload with missing or partial rules. The issue is triggered during normal reconciliation and can impact production alerting without any visible errors.
This change makes PrometheusRule reconciliation continuous and safe by switching to update-in-place semantics.
Type of change
BUGFIXENHANCEMENTWhat was wrong (root cause)
PrometheusRuleSyncer.Sync()used a delete-then-create pattern for rule ConfigMaps:During this deletion window:
This happens during any PrometheusRule update, including label or selector changes.
Impact
Before this fix:
After this fix:
Fix summary
1. Added
CreateOrUpdateConfigMaphelperA new utility function was added to atomically create or update ConfigMaps:
Key properties:
retry.RetryOnConflict2. Rewrote
PrometheusRuleSyncer.Sync()The reconciliation order was changed to avoid rule disappearance:
Before
After
This mirrors the existing
CreateOrUpdateSecretpattern used for Alertmanager configuration.Verification