
Design: reduce cert-manager controller's memory consumption #5639

Merged

jetstack-bot merged 3 commits into cert-manager:master from irbekrm:memory_design on Feb 20, 2023

Conversation


@irbekrm irbekrm commented Dec 14, 2022

Related to #5220

A design to limit cert-manager controller's memory consumption.

@jetstack-bot

@irbekrm: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jetstack-bot jetstack-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/design Categorizes issue or PR as related to design. approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Dec 14, 2022
@irbekrm irbekrm force-pushed the memory_design branch 6 times, most recently from 23dd0f8 to 3e44410 Compare December 16, 2022 10:28
@irbekrm irbekrm changed the title WIP: initial design for reducing memory consumption WIP: reducing controller's memory consumption Dec 16, 2022
@jetstack-bot jetstack-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 16, 2022

See an implementation for cainjector in https://github.com/cert-manager/cert-manager/pull/5174; thanks to @aubm for working on this.

#### Comparison
Contributor:

It would be nice to add another parameter to this comparison: average certificate issuance time. It will help us to quantify the increase in certificate issuance time (if any).

Another bonus point would be to plot the same metric for different issuer types.

Contributor Author (@irbekrm):

Thanks @sathyanarays, that's a good idea. I've added a rough comparison of issuance times: https://github.com/irbekrm/cert-manager/blob/2ddd8c2413a26a6108093ba24c42615fcbd11476/design/20221205-memory-management.md#issuance-of-a-large-number-of-certificates.

What will make the difference is whether the users have labelled any Secrets referenced by the issuer; the issuer type should not really matter.

It'd be interesting to hear from users with larger deployments or shorter-lived certs how big an increase in issuance time is 'too big'.

Contributor:

What will make the difference will be whether the users have labelled any Secrets referenced by the issuer

Is this being handled by PR #5660?

Contributor:

I had a look at the script used. Polling is done to check whether all the certificates are issued: https://gist.github.com/irbekrm/bc56a917a164b1a3a097bda483def0b8#file-measure-issuance-time-sh-L76-L98. Will that not affect the performance parameters? Can we confirm this by replacing `sleep 1` with `sleep 60`?

Contributor Author (@irbekrm), Jan 11, 2023:

Is this being handled by the PR #5660?

No, we will not apply labels to user-created issuer Secrets. We had a brief discussion about this; the main opinion was that it would be wrong to start applying cert-manager labels to secrets containing e.g. Vault credentials, which may have been created by a user's GitOps system or other automation that might also remove the new labels.

Contributor Author (@irbekrm):

I had a look at the script used. Polling is done to check whether all the certificates are issued: https://gist.github.com/irbekrm/bc56a917a164b1a3a097bda483def0b8#file-measure-issuance-time-sh-L76-L98. Will that not affect the performance parameters? Can we confirm this by replacing `sleep 1` with `sleep 60`?

That's a good point, but in general I would like folks not to treat those measurements as the 'average issuance time', because that would vary hugely depending on other factors. I ran the tests on a local kind cluster; it might be different on managed or 'busy' Kubernetes clusters, etc. The script should only be used to get a rough estimate of the diff when run against different cert-manager versions, so it shouldn't matter whether the apiserver is slowed down a bit by some additional requests?

@irbekrm irbekrm force-pushed the memory_design branch 2 times, most recently from 75530bb to 8df2473 Compare December 20, 2022 13:56
@jetstack-bot jetstack-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 20, 2022
Comment on lines +237 to +501
The downside to having metadata only in cache is that if the reconcile loop needs the whole object, it needs to make another call to the kube apiserver to get the actual object. We have a number of reconcile loops that retrieve and parse secret data numerous times, for example [readiness controller](https://github.com/cert-manager/cert-manager/blob/v1.10.1/pkg/controller/certificates/readiness/readiness_controller.go) retrieves and parses `spec.SecretName` secret for a `Certificate` on any event associated with the `Certificate`, any of its `CertificateRequest`s or the `spec.secretName` secret.
TODO: add which projects have adopted metadata-only watches, especially with client-go informers
Member:

What are the consequences of "retrieving and parsing secret data numerous times"?

With API Priority and Fairness enabled by default in modern Kubernetes, there is no longer a risk of cert-manager overwhelming the K8S API server.

In the meeting today you explained that there is a risk that making too many requests will result in client-side throttling in the K8S client, which may cause slow issuance of certificates, but what if we increased the client-side throttling defaults or even disabled client-side rate-limiting when AP&F is enabled?

In any case, I do think it would be worth writing a paragraph about why you think it is a problem to "retrieve and parse secret data numerous times" and, if possible, adding some evidence to show what happens when cert-manager makes too many requests per second.
Random thought: Would it be possible to compile a version of cert-manager without any caching, where the listers in the various controllers make API requests rather than cache lookups? And see how it behaves


Contributor Author (@irbekrm):

What are the consequences of "retrieving and parsing secret data numerous times"?

Additional load on the kube apiserver from the large number of extra requests. I think the number of calls in this case is too large, and there also isn't much of a gain from not caching certificate secrets, which usually aren't that big anyway.
In my opinion it does not matter whether the additional calls result in client-side throttling or AP&F throttling, or even whether users can configure things so that cert-manager is 'allowed' to make more requests.
These calls aren't free for users in any case: either issuance slows down, or there is less availability for other cluster tenants. And there isn't much of a gain memory-wise, because certificate secrets aren't that big, so caching them isn't that expensive. So I think we should not stop caching secrets that we know we need.

Random thought: Would it be possible to compile a version of cert-manager without any caching, where the listers in the various controllers make API requests rather than cache lookups? And see how it behaves

I've modified the prototype for this design to do that (https://github.com/irbekrm/cert-manager/tree/just_partial) and added a section describing this approach to 'Alternatives': https://github.com/irbekrm/cert-manager/blob/memory_design/design/20221205-memory-management.md#use-partialmetadata-only

Comment on lines +99 to +100
[Here](https://github.com/irbekrm/cert-manager/tree/partial_metadata) is an experimental 'partial_metadata' branch that implements a prototype:
[Two informer factories](https://github.com/irbekrm/cert-manager/blob/partial_metadata/pkg/controller/context.go#L293-L297) have been set up for this controller: a partial-metadata one with a selector to watch only unlabelled resources, and a typed one to watch Secrets with a label filter.
Member:

The design details don't give enough information about how the two informers will be used.
Paste the code from your prototype here so that we can comment on it directly.

Member:

I mean the code of `SecretGetter`.

Contributor Author (@irbekrm):

Thanks Richard, that's a good idea.

I've added an implementation section here; let me know if that makes it clearer: https://github.com/irbekrm/cert-manager/blob/memory_design/design/20221205-memory-management.md#implementation

@jetstack-bot jetstack-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 4, 2023
@irbekrm irbekrm force-pushed the memory_design branch 5 times, most recently from b974ebb to 4ae2c2b Compare January 10, 2023 18:21
Signed-off-by: irbekrm <irbekrm@gmail.com>
@irbekrm irbekrm changed the title WIP: reducing controller's memory consumption reducing controller's memory consumption Jan 11, 2023
@jetstack-bot jetstack-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 11, 2023
@irbekrm irbekrm changed the title reducing controller's memory consumption Design: reduce cert-manager controller's memory consumption Jan 11, 2023

### Cons

- All cluster `Secret`s are still listed
Member:

So I am a bit concerned that this exposes a mechanism for end users of cert-manager's APIs to intentionally or unintentionally cause cert-manager's controller to create an excessive number of requests to the Kubernetes apiserver (which could either degrade apiserver performance or act as an effective DoS for other clients, as our clientset could get rate-limited, whether client-side, via SSA, or by any other means).

One case I could see where this is unintentional is say, a flapping GitOps system.

If I am running the controller as a managed offering for many tenants, how can I protect myself and other users from this?

Contributor Author (@irbekrm), Jan 11, 2023:

Yes, I've looked at this; the metrics are here: https://github.com/cert-manager/cert-manager/blob/65be2caaaeadbfb8dff63c1afeb01522cff52805/design/20221205-memory-management.md#issuance-of-a-large-number-of-certificates

In practice it would be one additional GET call per issuance for issuers that are set up with some secret, and one additional call per issuer setup.
Regarding unintentionally overwhelming the API server: I think the metrics show that the additional number of calls and the increase in issuance time are acceptable. Users could then improve on that by labelling issuer secrets. And given that this would start as an alpha feature, we should be able to gather feedback from larger setups before it becomes enabled by default.
Regarding intentionally overwhelming: again, I think one GET call per issuance and per issuer setup does not make cert-manager much more usable for this purpose. I don't know if there is much more we can do in general to prevent this; as I understand it, the way to prevent this kind of problem is API Priority and Fairness server-side.

Contributor Author (@irbekrm):

I've added overwhelming the kube apiserver to the risks and mitigations section: https://github.com/cert-manager/cert-manager/blob/47dcce7c70f2e33be78f5e9d56d22ea887b60370/design/20221205-memory-management.md#risks-and-mitigations
I've also read through the issuers code to check how often the secrets get retrieved, and updated the issuer secrets section: https://github.com/cert-manager/cert-manager/blob/47c3c4c5f4e5effbb32a5f156efe2140fa67961a/design/20221205-memory-management.md#secrets-for-clusterissuers

One case I could see where this is unintentional is say, a flapping GitOps system.

The system would then need to make actual changes to issuer config, issuer secrets, or certificates for this to happen. With such changes to certificates (no AP&F, and cert-manager configured with very high QPS and burst) we could already overwhelm the apiserver today (as we'd be issuing in a tight loop). I think this is unavoidable; it would be up to users to limit who can modify the cert-manager installation and to have AP&F enabled.

Signed-off-by: irbekrm <irbekrm@gmail.com>
Signed-off-by: irbekrm <irbekrm@gmail.com>
@irbekrm

irbekrm commented Feb 10, 2023

/milestone Next

@jetstack-bot

@irbekrm: The provided milestone is not valid for this repository. Milestones in this repository: [v1.11, v1.12]

Use /milestone clear to clear the milestone.


In response to this:

/milestone Next


@irbekrm

irbekrm commented Feb 10, 2023

/milestone v1.12

@jetstack-bot jetstack-bot added this to the v1.12 milestone Feb 10, 2023

@wallrj wallrj left a comment


Thanks for doing all the measurements and prototypes @irbekrm and for measuring the performance of each.
I think the chosen approach makes sense and I can now understand how it will significantly reduce the size of the cache while minimizing extra calls to the Kubernetes API server.

Will we also use the transform technique to remove large annotations such as `last-applied-configuration`?
(And incidentally, does `last-applied-configuration` also include the `data` in the context of a Secret or ConfigMap?)

The "Use a standalone typed cache populated from different sources" idea sounds quite interesting to me, because I started imagining how it might allow cert-manager to be deployed without having read/list permission on every Secret in the cluster. It could be deployed with read permission in only certain namespaces; we could then start a namespace-constrained lister-watcher in each of those permitted namespaces and feed the Secrets (or other resources) into the standalone cache, which would contain all the resources that cert-manager is permitted to read. But that should not delay this effort.

/lgtm
/hold in case you want more feedback on the design.

@jetstack-bot jetstack-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 14, 2023
@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Feb 14, 2023
@jetstack-bot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: irbekrm, wallrj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@irbekrm

irbekrm commented Feb 14, 2023

Yes, I think we definitely should. The prototype implementation uses this cleanup function as a transform, i.e. here.

(and incidentally, does last-applied-configuration also include the data in the context of a Secret or ConfigMap?)

My understanding (and what I could observe) is that, yes, it does.

The "Use a standalone typed cache populated from different sources" idea sounds quite interesting to me, because I started imagining how it might allow cert-manager to be deployed without having read / list permission to every Secret in the cluster

That could be interesting to explore, yes. c/r already has a multi-namespace cache. Perhaps that could be a reason for us to adopt c/r in the controller? It would be interesting to see whether it's possible to have a cache that watches partial metadata for a specific type and, when a specific resource of that type is retrieved via GET in a control loop, routes the GET through the cache (like the c/r caching client); the caching mechanism would then know that this particular resource (filtered by name and namespace) should be excluded from the partial-metadata watch list and instead watched via a typed watch to retrieve the whole object.

Thanks for the review Richard. I will leave this open for a day or so more to give folks a chance to comment and will then merge.

@irbekrm

irbekrm commented Feb 20, 2023

Thanks for the review Richard. I will leave this open for a day or so more to give folks a chance to comment and will then merge.

Will merge now as a few days have passed.

/hold cancel

@jetstack-bot jetstack-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 20, 2023
@jetstack-bot jetstack-bot merged commit 2f2f620 into cert-manager:master Feb 20, 2023
## Design details
### Implementation

Ensure that the `certificate.Spec.SecretName` `Secret`, as well as the `Secret` holding the temporary private key, is labelled with a `controller.cert-manager.io/fao: true` label.
Member:

What does fao stand for?

Contributor Author (@irbekrm):

'For attention of'. It comes from Richard's idea here, which I quite liked; I couldn't come up with a better name myself: #5660 (comment)

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/design Categorizes issue or PR as related to design. lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

6 participants