fix: validate nonzero duration in AlertmanagerConfig#8126
fix: validate nonzero duration in AlertmanagerConfig#8126simonpasquier merged 20 commits intoprometheus-operator:mainfrom
Conversation
|
Thank you for the PR I think this would be better to handle at API level http://github.com/prometheus-operator/prometheus-operator/blob/a198a5ff2c4b99ec614d5f28463cc0b19d272430/pkg/apis/monitoring/v1alpha1/alertmanager_config_types.go#L110-L120 |
|
@ikartiksh Can you address #8126 (comment)? |
@slashpai i think it has been already addressed to handle at the api level. |
|
You would need to add validation prometheus-operator/pkg/apis/monitoring/v1alpha1/alertmanager_config_types.go Lines 110 to 120 in a198a5f ref: https://book.kubebuilder.io/reference/markers/crd-validation |
|
You would need to run Also lets create a new test in e2e test after You can create something similar to prometheus-operator/test/e2e/prometheus_test.go Line 4340 in ca5bf4a |
sure, i am onto the tests for the same. |
|
@slashpai i have added the e2e test for the code change. |
simonpasquier
left a comment
There was a problem hiding this comment.
Sorry for misleading you but it looks like group and repeat intervals with zero value are always rejected (not only for the top-level route).
One option could be that the operator "hides" the problem by not setting the field in the generated configuration if it detects a zero value but I'm not sure that we want this.
@simonpasquier should i just then apply the same changes to v1beta1 along with the positive duration function and the the operator hides the problem by not setting the field. |
simonpasquier
left a comment
There was a problem hiding this comment.
Ok I think that I've settled my mind :)
First think to note: zero durations are indeed forbidden for repeat/group intervals. So my plan would:
- Change the GroupInterval and RepeatInterval fields to
*monv1.Durationtype. It will avoid maintaining another regexp only for duration > 0. - As a consequence when we generate the final configuration,
*route.sanitize()should check the duration values and reset to the empty string if the input value is 0.
Bonus question: I suspect that the Alertmanager would also crash if we provision an 0 interval via the Kubernetes secret method?
|
thanks @simonpasquier for the approach to follow up and it is good to go. for the bonus question, yes the alertmanager would crash when provisioned with 0 internal via the k8s secret due to the misconfiguration. |
@simonpasquier while checking out I see that using |
I'm not sure to understand... Eventually route.sanitize should always do its best to verify that the assembled configuration is valid. |
79a5b43 to
f2eef2d
Compare
6798fa5 to
61cf5eb
Compare
|
@simonpasquier made all the required changes. |
|
Conclustion from our discussion after today's office hours:
|
|
@simonpasquier I think everything mentioned is already covered up and implemented in the code. |
Zero durations are converted to empty strings which makes use of the default value defined at a higher level in the routing tree. Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
…r to v0.89.0 (#3775) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [prometheus-operator/prometheus-operator](https://github.com/prometheus-operator/prometheus-operator) | minor | `v0.88.1` → `v0.89.0` | --- ### Release Notes <details> <summary>prometheus-operator/prometheus-operator (prometheus-operator/prometheus-operator)</summary> ### [`v0.89.0`](https://github.com/prometheus-operator/prometheus-operator/releases/tag/v0.89.0): 0.89.0 / 2026-02-05 [Compare Source](prometheus-operator/prometheus-operator@v0.88.1...v0.89.0) - \[ENHANCEMENT] Add `hostNetwork` field to the `Alertmanager` CRD. [#​8281](prometheus-operator/prometheus-operator#8281) - \[ENHANCEMENT] Add the `crds` and `full-crds` commands to the operator's binary. [#​8251](prometheus-operator/prometheus-operator#8251) - \[ENHANCEMENT] Report deprecated field usage in the `Reconciled` condition type. [#​8236](prometheus-operator/prometheus-operator#8236) - \[ENHANCEMENT] Avoid unnecessary reconciliation upon creation of the `ThanosRuler` StatefulSet. [#​8347](prometheus-operator/prometheus-operator#8347) - \[ENHANCEMENT] Add `bodySizeLimit` to the ScrapeConfig CRD. [#​8348](prometheus-operator/prometheus-operator#8348) - \[ENHANCEMENT] Support `http_headers` field in the Alertmanager Secret. [#​8357](prometheus-operator/prometheus-operator#8357) - \[ENHANCEMENT] Add the `-kubelet-http-metrics` flag to enable/disable the HTTP metrics port in the Kubelet endpoint (default=enabled). [#​8350](prometheus-operator/prometheus-operator#8350) - \[ENHANCEMENT] Include `operator.prometheus.io/version` annotation in the full version of CRDs. [#​8279](prometheus-operator/prometheus-operator#8279) - \[BUGFIX] Validate VictorOps global configuration in the `Alertmanager` CRD. [#​8020](prometheus-operator/prometheus-operator#8020) - \[BUGFIX] Validate Jira global configuration in the `Alertmanager` CRD. [#​8265](prometheus-operator/prometheus-operator#8265) - \[BUGFIX] Validate VictorOps receiver's URL in the `AlertmanagerConfig` CRD. [#​8258](prometheus-operator/prometheus-operator#8258) - \[BUGFIX] Validate Webex receiver's URL in the `AlertmanagerConfig` CRD. [#​8255](prometheus-operator/prometheus-operator#8255) - \[BUGFIX] Validate Jira receiver's URL configuration in the `AlertmanagerConfig` CRD. [#​8230](prometheus-operator/prometheus-operator#8230) - \[BUGFIX] Validate OpsGenie receiver configuration in the `AlertmanagerConfig` CRD. [#​8267](prometheus-operator/prometheus-operator#8267) - \[BUGFIX] Validate WeChat receiver configuration in the `AlertmanagerConfig` CRD. [#​8271](prometheus-operator/prometheus-operator#8271) - \[BUGFIX] Validate SNS receiver configuration in the `AlertmanagerConfig` CRD. [#​8217](prometheus-operator/prometheus-operator#8217) - \[BUGFIX] Validate Webex global configuration in the `Alertmanager` CRD. [#​7979](prometheus-operator/prometheus-operator#7979) - \[BUGFIX] Validate Telegram global configuration in the `Alertmanager` CRD. [#​8268](prometheus-operator/prometheus-operator#8268) - \[BUGFIX] Restore statefulset's labels if the creation fails with AlreadyExists. [#​8343](prometheus-operator/prometheus-operator#8343) - \[BUGFIX] Fix potential panic due to informer cache races. [#​8310](prometheus-operator/prometheus-operator#8310) - \[BUGFIX] Support probers defined with IPv6 addresses in the `Probe` CRD. [#​8354](prometheus-operator/prometheus-operator#8354) - \[BUGFIX] Prevent group and repeat intervals with zero duration from breaking Alertmanager. [#​8126](prometheus-operator/prometheus-operator#8126) - \[BUGFIX] Propagate all supported RocketChat attributes for `AlertmanagerConfig` CRD. [#​8016](prometheus-operator/prometheus-operator#8016) - \[BUGFIX] Add URL validation for WeChat receiver. [#​8256](prometheus-operator/prometheus-operator#8256) - \[BUGFIX] Add URL validation for SNS receiver. [#​8259](prometheus-operator/prometheus-operator#8259) - \[BUGFIX] Fix GCE service discovery for the `ScrapeConfig` CRD. [#​8284](prometheus-operator/prometheus-operator#8284) - \[BUGFIX] Avoid stale conditions in `Alertmanager`, `ThanosRuler`, `Prometheus` and `PrometheusAgent` resources. [#​8304](prometheus-operator/prometheus-operator#8304) - \[BUGFIX] Fix race condition when updating rule ConfigMaps. [#​8290](prometheus-operator/prometheus-operator#8290) - \[BUGFIX] Fix race condition when patching finalizers. [#​8323](prometheus-operator/prometheus-operator#8323) - \[BUGFIX] Reconcile `ScrapeConfig` resources when namespace selection changes. [#​8334](prometheus-operator/prometheus-operator#8334) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4zLjYiLCJ1cGRhdGVkSW5WZXIiOiI0My4zLjYiLCJ0YXJnZXRCcmFuY2giOiJtYWluIiwibGFiZWxzIjpbImltYWdlIl19--> Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/3775 Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net> Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
…ator#8126) * add valid alertmanagerconfig * update validation check for global alertmanagerconfig object * add validation marker * add e2e test for amcfg * fix depedency issues * add sanitization logic & monitorv1duration field * update tests with monitorv1duration field logic * validate and update fields * update import dependencies * update unnecessary groupwait conversion * update generated files and formatting * update error handling && refine methods * nonemptyduration validation for groupwait * retrigger CI * update validation for groupwait * remove toplevel route validation for intervals * simplify handling of zero durations Zero durations are converted to empty strings which makes use of the default value defined at a higher level in the routing tree. Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Fix generated file Signed-off-by: Simon Pasquier <spasquie@redhat.com> * Fix test compilation Signed-off-by: Simon Pasquier <spasquie@redhat.com> --------- Signed-off-by: Simon Pasquier <spasquie@redhat.com> Co-authored-by: Simon Pasquier <spasquie@redhat.com>
…r to v0.89.0 ##### [\`v0.89.0\`](https://github.com/prometheus-operator/prometheus-operator/releases/tag/v0.89.0) - \[ENHANCEMENT] Add `hostNetwork` field to the `Alertmanager` CRD. [#8281](prometheus-operator/prometheus-operator#8281) - \[ENHANCEMENT] Add the `crds` and `full-crds` commands to the operator's binary. [#8251](prometheus-operator/prometheus-operator#8251) - \[ENHANCEMENT] Report deprecated field usage in the `Reconciled` condition type. [#8236](prometheus-operator/prometheus-operator#8236) - \[ENHANCEMENT] Avoid unnecessary reconciliation upon creation of the `ThanosRuler` StatefulSet. [#8347](prometheus-operator/prometheus-operator#8347) - \[ENHANCEMENT] Add `bodySizeLimit` to the ScrapeConfig CRD. [#8348](prometheus-operator/prometheus-operator#8348) - \[ENHANCEMENT] Support `http_headers` field in the Alertmanager Secret. [#8357](prometheus-operator/prometheus-operator#8357) - \[ENHANCEMENT] Add the `-kubelet-http-metrics` flag to enable/disable the HTTP metrics port in the Kubelet endpoint (default=enabled). [#8350](prometheus-operator/prometheus-operator#8350) - \[ENHANCEMENT] Include `operator.prometheus.io/version` annotation in the full version of CRDs. [#8279](prometheus-operator/prometheus-operator#8279) - \[BUGFIX] Validate VictorOps global configuration in the `Alertmanager` CRD. [#8020](prometheus-operator/prometheus-operator#8020) - \[BUGFIX] Validate Jira global configuration in the `Alertmanager` CRD. [#8265](prometheus-operator/prometheus-operator#8265) - \[BUGFIX] Validate VictorOps receiver's URL in the `AlertmanagerConfig` CRD. [#8258](prometheus-operator/prometheus-operator#8258) - \[BUGFIX] Validate Webex receiver's URL in the `AlertmanagerConfig` CRD. [#8255](prometheus-operator/prometheus-operator#8255) - \[BUGFIX] Validate Jira receiver's URL configuration in the `AlertmanagerConfig` CRD. [#8230](prometheus-operator/prometheus-operator#8230) - \[BUGFIX] Validate OpsGenie receiver configuration in the `AlertmanagerConfig` CRD. [#8267](prometheus-operator/prometheus-operator#8267) - \[BUGFIX] Validate WeChat receiver configuration in the `AlertmanagerConfig` CRD. [#8271](prometheus-operator/prometheus-operator#8271) - \[BUGFIX] Validate SNS receiver configuration in the `AlertmanagerConfig` CRD. [#8217](prometheus-operator/prometheus-operator#8217) - \[BUGFIX] Validate Webex global configuration in the `Alertmanager` CRD. [#7979](prometheus-operator/prometheus-operator#7979) - \[BUGFIX] Validate Telegram global configuration in the `Alertmanager` CRD. [#8268](prometheus-operator/prometheus-operator#8268) - \[BUGFIX] Restore statefulset's labels if the creation fails with AlreadyExists. [#8343](prometheus-operator/prometheus-operator#8343) - \[BUGFIX] Fix potential panic due to informer cache races. [#8310](prometheus-operator/prometheus-operator#8310) - \[BUGFIX] Support probers defined with IPv6 addresses in the `Probe` CRD. [#8354](prometheus-operator/prometheus-operator#8354) - \[BUGFIX] Prevent group and repeat intervals with zero duration from breaking Alertmanager. [#8126](prometheus-operator/prometheus-operator#8126) - \[BUGFIX] Propagate all supported RocketChat attributes for `AlertmanagerConfig` CRD. [#8016](prometheus-operator/prometheus-operator#8016) - \[BUGFIX] Add URL validation for WeChat receiver. [#8256](prometheus-operator/prometheus-operator#8256) - \[BUGFIX] Add URL validation for SNS receiver. [#8259](prometheus-operator/prometheus-operator#8259) - \[BUGFIX] Fix GCE service discovery for the `ScrapeConfig` CRD. [#8284](prometheus-operator/prometheus-operator#8284) - \[BUGFIX] Avoid stale conditions in `Alertmanager`, `ThanosRuler`, `Prometheus` and `PrometheusAgent` resources. [#8304](prometheus-operator/prometheus-operator#8304) - \[BUGFIX] Fix race condition when updating rule ConfigMaps. [#8290](prometheus-operator/prometheus-operator#8290) - \[BUGFIX] Fix race condition when patching finalizers. [#8323](prometheus-operator/prometheus-operator#8323) - \[BUGFIX] Reconcile `ScrapeConfig` resources when namespace selection changes. [#8334](prometheus-operator/prometheus-operator#8334) --- ##### [\`v0.88.1\`](https://github.com/prometheus-operator/prometheus-operator/releases/tag/v0.88.1) - \[BUGFIX] Validate `webhookURL` secret for `MSTeams` receiver in `AlertmanagerConfig` CRD. [#8294](prometheus-operator/prometheus-operator#8294) - \[BUGFIX] Revert maximum version check for `EC2/Lightsail` SD in `ScrapeConfig` CRD. [#8308](prometheus-operator/prometheus-operator#8308) - \[BUGFIX] Relax URL validation in `Slack` receiver in AlertmanagerConfig CRD to support Go templates. [#8299](prometheus-operator/prometheus-operator#8299) [#8331](prometheus-operator/prometheus-operator#8331) - \[BUGFIX] Relax URL validation in `PagerDuty` in AlertmanagerConfig CRD to support Go templates. [#8319](prometheus-operator/prometheus-operator#8319) - \[BUGFIX] Relax URL validation in `WebhookConfig` in AlertmanagerConfig CRD to support Go templates. [#8307](prometheus-operator/prometheus-operator#8307) [#8317](prometheus-operator/prometheus-operator#8317) - \[BUGFIX] Relax URL validation in `RocketChat` receiver in AlertmanagerConfig CRD to support Go templates. [#8318](prometheus-operator/prometheus-operator#8318) - \[BUGFIX] Relax URL validation in `Pushover` receiver in AlertmanagerConfig CRD to support Go templates. [#8307](prometheus-operator/prometheus-operator#8307) [#8316](prometheus-operator/prometheus-operator#8316)
Description
added a nonzero duration function for group_interval and repeat_interval in alertmanagerconfig for incorrect validation
Closes: #6594
Type of change
What type of changes does your code introduce to the Prometheus operator? Put an
xin the box that apply.CHANGE(fix or feature that would cause existing functionality to not work as expected)FEATURE(non-breaking change which adds functionality)BUGFIX(non-breaking change which fixes an issue)ENHANCEMENT(non-breaking change which improves existing functionality)NONE(if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)Verification
Please check the Prometheus-Operator testing guidelines for recommendations about automated tests.
Changelog entry
Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.