Skip to content

Increase maximum PEM sizes for leaf certificates and chains#7961

Merged
cert-manager-prow[bot] merged 1 commit intocert-manager:masterfrom
SgtCoDFish:pem-limit-increase
Aug 15, 2025
Merged

Increase maximum PEM sizes for leaf certificates and chains#7961
cert-manager-prow[bot] merged 1 commit intocert-manager:masterfrom
SgtCoDFish:pem-limit-increase

Conversation

@SgtCoDFish
Copy link
Copy Markdown
Member

@SgtCoDFish SgtCoDFish commented Aug 14, 2025

We've seen in #7506 that the PEM size maxima we have can be too restrictive for some users who have large numbers of DNS identities in their leaf certificates, preventing them from using cert-manager for those certificates (or stopping them from upgrading).

This change adds a large estimate for extra data which might be present in a leaf certificate, based on an experimentally observed increase in the size of a certificate request object by adding new DNS names.

We assume 250 large DNS names, totalling about 30kB of size.

We shouldn't expect to see a performance change in our existing benchmark with these changes, because the maxBundleSize is used for those tests and that size remains larger than the new maxCertificateChainSize. That seems to be the case from testing:

# master at time of writing
$ go test -bench=. ./internal/pem/...
goos: darwin
goarch: arm64
pkg: github.com/cert-manager/cert-manager/internal/pem
cpu: Apple M3 Max
BenchmarkPathologicalInput-16                 52          22606764 ns/op
PASS
ok      github.com/cert-manager/cert-manager/internal/pem       2.485s

# this branch
$ go test -bench=. ./internal/pem/...
goos: darwin
goarch: arm64
pkg: github.com/cert-manager/cert-manager/internal/pem
cpu: Apple M3 Max
BenchmarkPathologicalInput-16                 52          22455893 ns/op
PASS
ok      github.com/cert-manager/cert-manager/internal/pem       2.408s

Kind

/kind bug

Release Note

Increase maximum sizes of PEM certificates and chains which can be parsed in cert-manager, to handle leaf certificates with large numbers of DNS names or other identities

We've seen in cert-manager#7506 that the PEM size maxima we have can be too
restrictive for some users who have large numbers of DNS identities in
their leaf certificates, preventing them from using cert-manager for
those certificates (or stopping them from upgrading).

This change adds a large estimate for extra data which might be present
in a leaf certificate, based on an experimentally observed increase in
the size of a certificate request object by adding new DNS names.

We assume 250 large DNS names, totalling about 30kB of size.

We shouldn't expect to see a performance change in our existing
benchmark with these changes, because the maxBundleSize is used for
those tests and that size remains larger than the new
maxCertificateChainSize. That seems to be the case from testing:

```console
$ go test -bench=. ./internal/pem/...
goos: darwin
goarch: arm64
pkg: github.com/cert-manager/cert-manager/internal/pem
cpu: Apple M3 Max
BenchmarkPathologicalInput-16                 52          22606764 ns/op
PASS
ok      github.com/cert-manager/cert-manager/internal/pem       2.485s

$ go test -bench=. ./internal/pem/...
goos: darwin
goarch: arm64
pkg: github.com/cert-manager/cert-manager/internal/pem
cpu: Apple M3 Max
BenchmarkPathologicalInput-16                 52          22455893 ns/op
PASS
ok      github.com/cert-manager/cert-manager/internal/pem       2.408s
```

Signed-off-by: Ashley Davis <ashley.davis@cyberark.com>
@cert-manager-prow cert-manager-prow bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 14, 2025
@SgtCoDFish
Copy link
Copy Markdown
Member Author

By way of testing, I created a chain in cert-manager using 4096-bit RSA keys exclusively and with a leaf certificate with 350 identities in (attached).

First I tested with 200 identities, and the size of the whole chain (2x CA certs, 1x leaf cert) was 16320B (16.3kB).

The size of the full chain in PEM format with 350 identities was 24.4kB:

$ kubectl get -n sandbox secrets myleaf -ojson | jq -r '.data."tls.crt"' | base64 -d | wc -c
   24445

The leaf certificate on its own was 20757B (21kB) and the two CA certs in the chain together were 3688B (3.7kB).

This PR currently raises the size limit for the whole chain to 30000B + 10 * 6500B = 95000B - so there's plenty of headroom here. Our "pathological" test shows no terrible slowdown at this size.

ca_bootstrap_large_chain_350.yaml.tar.gz

@SgtCoDFish
Copy link
Copy Markdown
Member Author

/cherry-pick release-1.17

@cert-manager-bot
Copy link
Copy Markdown
Contributor

@SgtCoDFish: once the present PR merges, I will cherry-pick it on top of release-1.17 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-1.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@SgtCoDFish
Copy link
Copy Markdown
Member Author

/cherry-pick release-1.18

@cert-manager-bot
Copy link
Copy Markdown
Contributor

@SgtCoDFish: once the present PR merges, I will cherry-pick it on top of release-1.18 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-1.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

// testing the worst case with pathologicalFuzzFile. This guards against future changes making these tests invalid;
// e.g. if, maxCertificateChainSize actually became the largest we accept, we'd want to test against that instead.
if largestLimit < maxPrivateKeyPEMSize || largestLimit < maxCertificateChainSize {
panic(fmt.Errorf("invalid test: expected max cert bundle size %d to be larger than maxPrivateKeyPEMSize %d and maxCertificateChainSize %d", maxBundleSize, maxPrivateKeyPEMSize, maxCertificateChainSize))
Copy link
Copy Markdown
Member

@erikgb erikgb Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that this was not changed in this PR, but what does the term "invalid test" refer to in this case? Seems like a code assert to me.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid test means that the assumption the test is built on is no longer valid - so, the test is meant to try decoding using the largest max size we have available, to see how the worst possible SafeDecode* function performs. Before and after this PR, the worst possible is trust bundles.

This check guards against the max chain size growing larger than the max trust bundle size due to an unrelated change. In that case we'd want to make the test against the max chain size instead of the max bundle size.

largestLimit is used for the benchmark tests, which can be used to check for regressions in the fix we made for PEM decoding.

Copy link
Copy Markdown
Member

@erikgb erikgb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve
/hold

Adding a hold in case you want to rephrase the validation test. Thanks!

@cert-manager-prow cert-manager-prow bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Aug 15, 2025
@cert-manager-prow
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: erikgb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cert-manager-prow cert-manager-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 15, 2025
@SgtCoDFish
Copy link
Copy Markdown
Member Author

/unhold

Thank you for the review! I think I could probably reword this stuff better - these files are very wordy - but I think for now this is an improvement and I'm going to get stuck into the cherry picks.

@cert-manager-prow cert-manager-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 15, 2025
@cert-manager-prow cert-manager-prow bot merged commit 2662f58 into cert-manager:master Aug 15, 2025
6 checks passed
@cert-manager-bot
Copy link
Copy Markdown
Contributor

@SgtCoDFish: #7961 failed to apply on top of branch "release-1.17":

Applying: increase maximum PEM sizes for leaf certificates and chains
Using index info to reconstruct a base tree...
M	internal/pem/decode.go
M	internal/pem/decode_test.go
M	pkg/util/pki/parse_certificate_chain_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/util/pki/parse_certificate_chain_test.go
Auto-merging internal/pem/decode_test.go
Auto-merging internal/pem/decode.go
CONFLICT (content): Merge conflict in internal/pem/decode.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 increase maximum PEM sizes for leaf certificates and chains

Details

In response to this:

/cherry-pick release-1.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@cert-manager-bot
Copy link
Copy Markdown
Contributor

@SgtCoDFish: new pull request created: #7966

Details

In response to this:

/cherry-pick release-1.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@SgtCoDFish SgtCoDFish deleted the pem-limit-increase branch August 15, 2025 14:10
alexlebens pushed a commit to alexlebens/infrastructure that referenced this pull request Oct 8, 2025
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [cert-manager](https://cert-manager.io) ([source](https://github.com/cert-manager/cert-manager)) | minor | `v1.18.2` -> `v1.19.0` |

---

### Release Notes

<details>
<summary>cert-manager/cert-manager (cert-manager)</summary>

### [`v1.19.0`](https://github.com/cert-manager/cert-manager/releases/tag/v1.19.0)

[Compare Source](cert-manager/cert-manager@v1.18.2...v1.19.0)

cert-manager is the easiest way to automatically manage certificates in Kubernetes and OpenShift clusters.

This release focuses on expanding platform compatibility, improving deployment flexibility, enhancing observability, and addressing key reliability issues.

> 📖  Read the full release notes at cert-manager.io: <https://cert-manager.io/docs/releases/release-notes/release-notes-1.19>

Changes since `v1.18.0`:

#### Feature

- Add IPv6 rules to the default network policy ([#&#8203;7726](cert-manager/cert-manager#7726), [@&#8203;jcpunk](https://github.com/jcpunk))
- Add `global.nodeSelector` to helm chart to allow for a single `nodeSelector` to be set across all services. ([#&#8203;7818](cert-manager/cert-manager#7818), [@&#8203;StingRayZA](https://github.com/StingRayZA))
- Add a feature gate to default to Ingress `pathType` `Exact` in ACME HTTP01 Ingress challenge solvers. ([#&#8203;7795](cert-manager/cert-manager#7795), [@&#8203;sspreitzer](https://github.com/sspreitzer))
- Add generated `applyconfigurations` allowing clients to make type-safe server-side apply requests for cert-manager resources. ([#&#8203;7866](cert-manager/cert-manager#7866), [@&#8203;erikgb](https://github.com/erikgb))
- Added API defaults to issuer references group (cert-manager.io) and kind (Issuer). ([#&#8203;7414](cert-manager/cert-manager#7414), [@&#8203;erikgb](https://github.com/erikgb))
- Added `certmanager_certificate_challenge_status` Prometheus metric. ([#&#8203;7736](cert-manager/cert-manager#7736), [@&#8203;hjoshi123](https://github.com/hjoshi123))
- Added `protocol` field for `rfc2136` DNS01 provider ([#&#8203;7881](cert-manager/cert-manager#7881), [@&#8203;hjoshi123](https://github.com/hjoshi123))
- Added experimental field `hostUsers` flag to all pods. Not set by default. ([#&#8203;7973](cert-manager/cert-manager#7973), [@&#8203;hjoshi123](https://github.com/hjoshi123))
- Support configurable resource requests and limits for ACME HTTP01 solver pods through ClusterIssuer and Issuer specifications, allowing granular resource management that overrides global `--acme-http01-solver-resource-*` settings. ([#&#8203;7972](cert-manager/cert-manager#7972), [@&#8203;lunarwhite](https://github.com/lunarwhite))
- The `CAInjectorMerging` feature has been promoted to BETA and is now enabled by default ([#&#8203;8017](cert-manager/cert-manager#8017), [@&#8203;ThatsMrTalbot](https://github.com/ThatsMrTalbot))
- The controller, webhook and ca-injector now log their version and git commit on startup for easier debugging and support. ([#&#8203;8072](cert-manager/cert-manager#8072), [@&#8203;prasad89](https://github.com/prasad89))
- Updated `certificate` metrics to the collector approach. ([#&#8203;7856](cert-manager/cert-manager#7856), [@&#8203;hjoshi123](https://github.com/hjoshi123))

#### Bug or Regression

- ACME: Increased challenge authorization timeout to 2 minutes to fix `error waiting for authorization` ([#&#8203;7796](cert-manager/cert-manager#7796), [@&#8203;hjoshi123](https://github.com/hjoshi123))
- BUGFIX: permitted URI domains were incorrectly used to set the excluded URI domains in the CSR's name constraints ([#&#8203;7816](cert-manager/cert-manager#7816), [@&#8203;kinolaev](https://github.com/kinolaev))
- Enforced ACME HTTP-01 solver validation to properly reject configurations when multiple ingress options (`class`, `ingressClassName`, `name`) are specified simultaneously ([#&#8203;8021](cert-manager/cert-manager#8021), [@&#8203;lunarwhite](https://github.com/lunarwhite))
- Increase maximum sizes of PEM certificates and chains which can be parsed in cert-manager, to handle leaf certificates with large numbers of DNS names or other identities ([#&#8203;7961](cert-manager/cert-manager#7961), [@&#8203;SgtCoDFish](https://github.com/SgtCoDFish))
- Reverted adding the `global.rbac.disableHTTPChallengesRole` Helm option. ([#&#8203;7836](cert-manager/cert-manager#7836), [@&#8203;inteon](https://github.com/inteon))
- This change removes the `path` label of core ACME client metrics and will require users to update their monitoring dashboards and alerting rules if using those metrics. ([#&#8203;8109](cert-manager/cert-manager#8109), [@&#8203;mladen-rusev-cyberark](https://github.com/mladen-rusev-cyberark))
- Use the latest version of `ingress-nginx` in E2E tests to ensure compatibility ([#&#8203;7792](cert-manager/cert-manager#7792), [@&#8203;wallrj](https://github.com/wallrj))

#### Other (Cleanup or Flake)

- Helm: Fix naming template of `tokenrequest` RoleBinding resource to improve consistency ([#&#8203;7761](cert-manager/cert-manager#7761), [@&#8203;lunarwhite](https://github.com/lunarwhite))
- Improve error messages when certificates, CRLs or private keys fail admission due to malformed or missing PEM data ([#&#8203;7928](cert-manager/cert-manager#7928), [@&#8203;SgtCoDFish](https://github.com/SgtCoDFish))
- Major upgrade of Akamai SDK. NOTE: The new version has not been fully tested end-to-end due to the lack of cloud infrastructure. ([#&#8203;8003](cert-manager/cert-manager#8003), [@&#8203;hjoshi123](https://github.com/hjoshi123))
- Update kind images to include the Kubernetes 1.33 node image ([#&#8203;7786](cert-manager/cert-manager#7786), [@&#8203;wallrj](https://github.com/wallrj))
- Use `maps.Copy` for cleaner map handling ([#&#8203;8092](cert-manager/cert-manager#8092), [@&#8203;quantpoet](https://github.com/quantpoet))
- Vault: Migrate Vault E2E add-on tests from deprecated `vault-client-go` to the new `vault/api` client. ([#&#8203;8059](cert-manager/cert-manager#8059), [@&#8203;armagankaratosun](https://github.com/armagankaratosun))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0MS4xMzUuNCIsInVwZGF0ZWRJblZlciI6IjQxLjEzNS40IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJjaGFydCJdfQ==-->

Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/1711
Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net>
Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
@wallrj-cyberark
Copy link
Copy Markdown
Member

@SgtCoDFish We have released this. Please test and feedback: https://github.com/cert-manager/cert-manager/releases/tag/v1.19.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants