feat(metrics): adding certmanager_certificate_challenge_status metric#7736
|
Hi @hjoshi123. Thanks for your PR. I'm waiting for a cert-manager member to verify that this patch is reasonable to test. |
|
/kind feature |
|
/ok-to-test |
force-pushed from 2812dfc to 501a401
|
@ThatsMrTalbot there was also talk of adding the age of the challenge as a label. I was thinking of a separate metric for that, since this metric is only supposed to report the status. What's your take on that? |
Age should not be a label. Age is a constantly changing value, so having it as a label would introduce high cardinality.
Ah yes, I didn't think of that. So how would we solve it if we wanted to? I'm guessing it doesn't have to be in this PR, but in general. |
The way I have seen time metrics work in the past is having a metric that contains a unix timestamp, for example |
|
The code looks good, I want to run it locally, make sure it behaves as the code reads. |
@ThatsMrTalbot looks like I messed up while squashing commits through |
|
Should be no need to create a new PR. You just need some git-fu. 😉 This recipe usually works well for me, assuming your fork is the origin remote and upstream is the original cert-manager remote.
git fetch --all
Now check that any conflicts are resolved and the changeset in your working area looks good. Then just commit and force push.
git commit -am "your-commit-message"
Before the last step, ensure your commit looks good. Good luck! 👍 |
force-pushed from d159660 to bc2bbe3
force-pushed from 381b889 to 35841d5
force-pushed from 35841d5 to f467393
|
@hjoshi123: The following test failed, say
Please help us cut down on flakes by linking to an open issue when you hit one in your PR. |
|
Thank you @erikgb @ThatsMrTalbot for the git-fu suggestions. The PR should be ready to merge now. Let me know if there are some final changes needed. |
 func TestMetricsController(t *testing.T) {
 	config, stopFn := framework.RunControlPlane(t)
-	t.Cleanup(stopFn)
+	defer stopFn()
Why has t.Cleanup been changed to defer?
Oh, good point. I think this was one of the merge errors I had and forgot to clean up (my base branch predated this code on master, I guess). Will quickly amend it now.
force-pushed from f467393 to 002e2bd
		}
	}()
	defer func() {
		shutdownCtx, cancel := context.WithTimeout(context.WithoutCancel(t.Context()), time.Second*5)
Another thing that looks to have changed - was this intentional or part of the merge fun?
oh missed this.. yup seems to be part of the merge fun.. fixing it.. sorry for the minor mistakes 😅
Signed-off-by: hjoshi123 <hemant.joshi@vizio.com>
force-pushed from 002e2bd to 2558e46
|
Thanks for your persistence on this! /lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ThatsMrTalbot |
|
Thank you @ThatsMrTalbot for helping me out and briefing me on things. Also thank you @erikgb |
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [cert-manager](https://cert-manager.io) ([source](https://github.com/cert-manager/cert-manager)) | minor | `v1.18.2` -> `v1.19.0` |

---

### Release Notes

<details>
<summary>cert-manager/cert-manager (cert-manager)</summary>

### [`v1.19.0`](https://github.com/cert-manager/cert-manager/releases/tag/v1.19.0)

[Compare Source](cert-manager/cert-manager@v1.18.2...v1.19.0)

cert-manager is the easiest way to automatically manage certificates in Kubernetes and OpenShift clusters.

This release focuses on expanding platform compatibility, improving deployment flexibility, enhancing observability, and addressing key reliability issues.

> 📖 Read the full release notes at cert-manager.io: <https://cert-manager.io/docs/releases/release-notes/release-notes-1.19>

Changes since `v1.18.0`:

#### Feature

- Add IPv6 rules to the default network policy ([#7726](cert-manager/cert-manager#7726), [@jcpunk](https://github.com/jcpunk))
- Add `global.nodeSelector` to helm chart to allow for a single `nodeSelector` to be set across all services. ([#7818](cert-manager/cert-manager#7818), [@StingRayZA](https://github.com/StingRayZA))
- Add a feature gate to default to Ingress `pathType` `Exact` in ACME HTTP01 Ingress challenge solvers. ([#7795](cert-manager/cert-manager#7795), [@sspreitzer](https://github.com/sspreitzer))
- Add generated `applyconfigurations` allowing clients to make type-safe server-side apply requests for cert-manager resources. ([#7866](cert-manager/cert-manager#7866), [@erikgb](https://github.com/erikgb))
- Added API defaults to issuer references group (cert-manager.io) and kind (Issuer). ([#7414](cert-manager/cert-manager#7414), [@erikgb](https://github.com/erikgb))
- Added `certmanager_certificate_challenge_status` Prometheus metric. ([#7736](cert-manager/cert-manager#7736), [@hjoshi123](https://github.com/hjoshi123))
- Added `protocol` field for `rfc2136` DNS01 provider ([#7881](cert-manager/cert-manager#7881), [@hjoshi123](https://github.com/hjoshi123))
- Added experimental field `hostUsers` flag to all pods. Not set by default. ([#7973](cert-manager/cert-manager#7973), [@hjoshi123](https://github.com/hjoshi123))
- Support configurable resource requests and limits for ACME HTTP01 solver pods through ClusterIssuer and Issuer specifications, allowing granular resource management that overrides global `--acme-http01-solver-resource-*` settings. ([#7972](cert-manager/cert-manager#7972), [@lunarwhite](https://github.com/lunarwhite))
- The `CAInjectorMerging` feature has been promoted to BETA and is now enabled by default ([#8017](cert-manager/cert-manager#8017), [@ThatsMrTalbot](https://github.com/ThatsMrTalbot))
- The controller, webhook and ca-injector now log their version and git commit on startup for easier debugging and support. ([#8072](cert-manager/cert-manager#8072), [@prasad89](https://github.com/prasad89))
- Updated `certificate` metrics to the collector approach. ([#7856](cert-manager/cert-manager#7856), [@hjoshi123](https://github.com/hjoshi123))

#### Bug or Regression

- ACME: Increased challenge authorization timeout to 2 minutes to fix `error waiting for authorization` ([#7796](cert-manager/cert-manager#7796), [@hjoshi123](https://github.com/hjoshi123))
- BUGFIX: permitted URI domains were incorrectly used to set the excluded URI domains in the CSR's name constraints ([#7816](cert-manager/cert-manager#7816), [@kinolaev](https://github.com/kinolaev))
- Enforced ACME HTTP-01 solver validation to properly reject configurations when multiple ingress options (`class`, `ingressClassName`, `name`) are specified simultaneously ([#8021](cert-manager/cert-manager#8021), [@lunarwhite](https://github.com/lunarwhite))
- Increase maximum sizes of PEM certificates and chains which can be parsed in cert-manager, to handle leaf certificates with large numbers of DNS names or other identities ([#7961](cert-manager/cert-manager#7961), [@SgtCoDFish](https://github.com/SgtCoDFish))
- Reverted adding the `global.rbac.disableHTTPChallengesRole` Helm option. ([#7836](cert-manager/cert-manager#7836), [@inteon](https://github.com/inteon))
- This change removes the `path` label of core ACME client metrics and will require users to update their monitoring dashboards and alerting rules if using those metrics. ([#8109](cert-manager/cert-manager#8109), [@mladen-rusev-cyberark](https://github.com/mladen-rusev-cyberark))
- Use the latest version of `ingress-nginx` in E2E tests to ensure compatibility ([#7792](cert-manager/cert-manager#7792), [@wallrj](https://github.com/wallrj))

#### Other (Cleanup or Flake)

- Helm: Fix naming template of `tokenrequest` RoleBinding resource to improve consistency ([#7761](cert-manager/cert-manager#7761), [@lunarwhite](https://github.com/lunarwhite))
- Improve error messages when certificates, CRLs or private keys fail admission due to malformed or missing PEM data ([#7928](cert-manager/cert-manager#7928), [@SgtCoDFish](https://github.com/SgtCoDFish))
- Major upgrade of Akamai SDK. NOTE: The new version has not been fully tested end-to-end due to the lack of cloud infrastructure. ([#8003](cert-manager/cert-manager#8003), [@hjoshi123](https://github.com/hjoshi123))
- Update kind images to include the Kubernetes 1.33 node image ([#7786](cert-manager/cert-manager#7786), [@wallrj](https://github.com/wallrj))
- Use `maps.Copy` for cleaner map handling ([#8092](cert-manager/cert-manager#8092), [@quantpoet](https://github.com/quantpoet))
- Vault: Migrate Vault E2E add-on tests from deprecated `vault-client-go` to the new `vault/api` client. ([#8059](cert-manager/cert-manager#8059), [@armagankaratosun](https://github.com/armagankaratosun))

</details>

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/1711
Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net>
Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
|
@hjoshi123 We have released this. Please test it and give feedback: https://github.com/cert-manager/cert-manager/releases/tag/v1.19.1 |
Pull Request Motivation
Fixes #7700. This PR adds the new certmanager_certificate_challenge_status metric, along with a controller that monitors the challenges.
/kind feature
Release Note