Skip to content

Add server side TLS certificate expiry tracking and INFO telemetry#2913

Merged
madolson merged 17 commits into
valkey-io:unstablefrom
YiwenZhang12:feature/server_cert
Feb 4, 2026
Merged

Add server side TLS certificate expiry tracking and INFO telemetry#2913
madolson merged 17 commits into
valkey-io:unstablefrom
YiwenZhang12:feature/server_cert

Conversation

@YiwenZhang12

@YiwenZhang12 YiwenZhang12 commented Dec 5, 2025

Copy link
Copy Markdown
Contributor

regarding issue: #2784

  • Parse TLS certificate expiry/serials and refresh cached values at startup and on TLS reconfigure;

  • expose INFO # TLS fields so monitoring can alert on impending expiry.

INFO:

# TLS
tls_server_cert_serial:826F01147AC63276
tls_server_cert_expires_in_seconds:123456
tls_client_cert_serial:826F01147AC63276
tls_client_cert_expires_in_seconds:123456
tls_ca_cert_serial:11A566...
tls_ca_cert_expires_in_seconds:987654

@YiwenZhang12 YiwenZhang12 changed the title Add Server Side ejbgcbbunritikuevujfudbcjujijkdricrjhcbetnrkTLS certificate expiry tracking and INFO telemetry Add Server Side TLS certificate expiry tracking and INFO telemetry Dec 5, 2025
@YiwenZhang12 YiwenZhang12 marked this pull request as draft December 5, 2025 00:34
@YiwenZhang12 YiwenZhang12 changed the title Add Server Side TLS certificate expiry tracking and INFO telemetry Add server side TLS certificate expiry tracking and INFO telemetry Dec 5, 2025
@YiwenZhang12

YiwenZhang12 commented Dec 5, 2025

Copy link
Copy Markdown
Contributor Author

Hi @zuiderkwast , since we discussed this in the issue, I wanted to share the PR here. This PR is still missing tests, before I add them, I wanted to check whether the overall approach looks correct to you. No rush at all, and happy to iterate based on your feedback. Thank you!

@YiwenZhang12 YiwenZhang12 marked this pull request as ready for review December 5, 2025 00:55
@zuiderkwast zuiderkwast added the major-decision-pending Major decision pending by TSC team label Dec 11, 2025
@codecov

codecov Bot commented Dec 11, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 43.24324% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.86%. Comparing base (fd14380) to head (fdee221).
⚠️ Report is 12 commits behind head on unstable.

Files with missing lines Patch % Lines
src/tls.c 0.00% 13 Missing ⚠️
src/server.c 72.72% 6 Missing ⚠️
src/config.c 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2913      +/-   ##
============================================
- Coverage     74.86%   74.86%   -0.01%     
============================================
  Files           129      129              
  Lines         71201    71247      +46     
============================================
+ Hits          53308    53336      +28     
- Misses        17893    17911      +18     
Files with missing lines Coverage Δ
src/server.h 100.00% <ø> (ø)
src/config.c 78.56% <0.00%> (-0.21%) ⬇️
src/server.c 89.35% <72.72%> (-0.14%) ⬇️
src/tls.c 18.75% <0.00%> (-81.25%) ⬇️

... and 21 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@soloestoy

Copy link
Copy Markdown
Member

good feature

Comment thread src/server.c Outdated
Comment thread src/config.c Outdated
Comment thread src/server.c Outdated
Yiwen Zhang added 2 commits January 11, 2026 22:19
Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
@YiwenZhang12 YiwenZhang12 force-pushed the feature/server_cert branch 2 times, most recently from 0f933ca to d59827b Compare January 12, 2026 06:42
@YiwenZhang12

YiwenZhang12 commented Jan 12, 2026

Copy link
Copy Markdown
Contributor Author

@zuiderkwast Thanks for the detailed feedback! I've made the changes you suggested:

  • Expiry tracking moved into tls.c
  • INFO now reports the actual remaining seconds
  • Cron job just emits the logging. also changed it to run once per day to reduce noise.
  • Added tests to cover the new behavior

Let me know if this looks better!

@YiwenZhang12 YiwenZhang12 force-pushed the feature/server_cert branch 3 times, most recently from 2582512 to 1c234ee Compare January 12, 2026 08:07
@zuiderkwast zuiderkwast linked an issue Jan 12, 2026 that may be closed by this pull request

@madolson madolson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly looks good. We don't have a majority of TSC in the meeting, but madolson, zuiderkwast, and Ran conditionally approved it. If no one else comments in two weeks we can mark it major decision approved.

Comment thread src/server.c Outdated
Comment thread src/server.c Outdated
Comment thread src/tls.c Outdated
@YiwenZhang12 YiwenZhang12 force-pushed the feature/server_cert branch 3 times, most recently from a25b612 to f7014a1 Compare January 12, 2026 19:25
@YiwenZhang12

Copy link
Copy Markdown
Contributor Author

Hi @madolson Thanks for the review and the helpful comments!

I removed the redundant tls_enabled flag from INFO tls, so the section now focuses only on certificate metadata. For each loaded TLS cert (server, client, and CA), it includes both serial number and an “expires in seconds” value.

I also added a configurable tls-cert-warning-days (default 7) to control when daily expiry log start. If a cert is already expired, we still log a warning regardless of the threshold. All three certs are included.

example:
INFO

# TLS
tls_server_cert_serial:826F01147AC63276
tls_server_cert_expires_in_seconds:123456
tls_client_cert_serial:826F01147AC63276
tls_client_cert_expires_in_seconds:123456
tls_ca_cert_serial:11A566...
tls_ca_cert_expires_in_seconds:987654

LOG:

[NOTICE] TLS server certificate (serial 826F01147AC63276) expiring in 5 days
[WARNING] TLS CA certificate (serial 11A566...) has EXPIRED

Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
@YiwenZhang12

Copy link
Copy Markdown
Contributor Author

Hi @madolson @zuiderkwast, thanks for the review! I’ve verified the TLS tests locally and everything is passing. It’s been about two weeks since your note, so I wanted to check in and see if you’re comfortable moving forward with these changes.

Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
Yiwen Zhang added 2 commits January 29, 2026 10:48
Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds server-side TLS certificate expiry tracking, logging, and INFO telemetry so operators can monitor impending TLS certificate expirations and act before outages occur.

Changes:

  • Track absolute expiration times and serial numbers for server, client, and CA certificates in the core TLS layer, and expose remaining lifetime in a new INFO tls section.
  • Add configuration for pre-expiry logging (tls-cert-expiry-warning-days) plus periodic logging that distinguishes between “expiring soon” (NOTICE) and “already expired” (WARNING).
  • Extend the TLS test certificates and unit tests to cover INFO TLS telemetry, CA bundles/directories, client-cert presence/absence, and state transitions when TLS is disabled and re-enabled.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
valkey.conf Documents the new tls-cert-expiry-warning-days config option and how pre-expiry vs post-expiry logging behaves.
utils/gen-test-certs.sh Extends TLS test cert generation to build a multi-cert CA bundle and hashed CA directory used to validate INFO tls CA counting and earliest-expiry selection.
tests/unit/tls.tcl Adds comprehensive tests for INFO tls behavior (serials, expiry countdown, CA count), behavior when TLS is disabled, CA bundle/directory semantics, and TLS disable/enable transitions.
src/tls.c Implements ASN.1 time parsing and conversion to epoch, tracks expiry times and serials for server/client/CA certificates, aggregates CA bundle/dir info, and adds the tlsLogServerCertExpiry logging path.
src/server.h Extends the valkeyServer struct with TLS certificate expiry/serial fields and declares the new TLS helper functions used by config and cron.
src/server.c Initializes the new TLS expiry state, adds the INFO tls section that exposes serials, seconds-to-expiry, and CA count, and wires tlsLogServerCertExpiry into serverCron.
src/config.c Wires in the tls-cert-expiry-warning-days config and ensures TLS expiry state is reset appropriately when TLS configs or ports change.

Comment thread src/tls.c Outdated
Comment thread src/server.c Outdated
Comment thread src/server.c Outdated
Comment thread src/config.c

@madolson madolson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, only high level thing is I'm still not very convinced the log for the warning is a great config, since I think people won't really watch for those and I'm also not clear about needing a metric for number of parsed certificates. Other than that the code mostly looks good, mostly just suggestions for simplication. Looks close to be being merged.

Comment thread tests/unit/tls.tcl Outdated
Comment thread tests/unit/tls.tcl Outdated
Comment thread src/tls.c
Comment thread tests/unit/tls.tcl Outdated
Comment thread src/config.c
Comment thread src/server.c Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c

#else /* USE_OPENSSL */

void tlsResetCertInfo(void) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this almost the exact same as the TLS version?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. We already pulled the shared clearing logic into tlsClearAllCertInfo(), so the duplicated code is minimal now. The remaining tlsResetCertInfo() implementations live in different compile paths and have slightly different semantics (TLS build has the guard; non‑TLS always clears).

Yiwen Zhang added 2 commits January 31, 2026 20:07
Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
@YiwenZhang12

Copy link
Copy Markdown
Contributor Author

Thanks for the feedback @madolson . I’ve addressed the comments and pushed updates. could you please take another look when you have a moment?

summary:

  • Removed TLS cert‑expiry warning logging, cron hook, and related configs.
  • Dropped tls_ca_cert_count from INFO/tls state and cleaned up related tests.
  • Simplified TLS cert INFO tests: use getInfoProperty, remove unused client creation, and replaced fixed sleep with polling.
  • Refactored cert info helpers (tlsClearAllCertInfo, tlsRefreshAllCertInfo) and unified TLS/non‑TLS paths.
  • Simplified ASN1 time handling by relying on ASN1_TIME_to_tm and removing old fallback parsing.

@madolson madolson added the needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. label Feb 2, 2026
Comment thread src/server.c Outdated
@madolson madolson added major-decision-approved Major decision approved by TSC team release-notes This issue should get a line item in the release notes and removed major-decision-pending Major decision pending by TSC team labels Feb 2, 2026
Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>

@madolson madolson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with this. We talked during todays meeting about adding a log when the cert is actually expired, but I'm fine with or without that.

@YiwenZhang12

Copy link
Copy Markdown
Contributor Author

I'm ok with this. We talked during todays meeting about adding a log when the cert is actually expired, but I'm fine with or without that.

Thanks for the approval @madolson, really appreciate it.

I didn’t include the expiry log here because I noticed there are multiple places where it should be handled (startup/reload as well as runtime). To keep things focused, I started a separate PR to address logging more consistently:
#3153

Happy to continue the discussion there.

@madolson madolson merged commit e4804d5 into valkey-io:unstable Feb 4, 2026
24 checks passed
sachinvmurthy pushed a commit to sachinvmurthy/valkey that referenced this pull request Feb 5, 2026
…alkey-io#2913)

- Parse TLS certificate expiry/serials and refresh cached values at
startup and on TLS reconfigure;

- expose INFO # TLS fields so monitoring can alert on impending expiry.

INFO:
```
# TLS
tls_server_cert_serial:826F01147AC63276
tls_server_cert_expires_in_seconds:123456
tls_client_cert_serial:826F01147AC63276
tls_client_cert_expires_in_seconds:123456
tls_ca_cert_serial:11A566...
tls_ca_cert_expires_in_seconds:987654
```

---------

Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
Co-authored-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: sachinvmurthy <sachin.murthy97@gmail.com>
sarthakaggarwal97 pushed a commit to sarthakaggarwal97/valkey that referenced this pull request Feb 6, 2026
…alkey-io#2913)

- Parse TLS certificate expiry/serials and refresh cached values at
startup and on TLS reconfigure;

- expose INFO # TLS fields so monitoring can alert on impending expiry.


INFO:
```
# TLS
tls_server_cert_serial:826F01147AC63276
tls_server_cert_expires_in_seconds:123456
tls_client_cert_serial:826F01147AC63276
tls_client_cert_expires_in_seconds:123456
tls_ca_cert_serial:11A566...
tls_ca_cert_expires_in_seconds:987654
```

---------

Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
Co-authored-by: Yiwen Zhang <yiwen_zhang@apple.com>
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
…alkey-io#2913)

- Parse TLS certificate expiry/serials and refresh cached values at
startup and on TLS reconfigure;

- expose INFO # TLS fields so monitoring can alert on impending expiry.


INFO:
```
# TLS
tls_server_cert_serial:826F01147AC63276
tls_server_cert_expires_in_seconds:123456
tls_client_cert_serial:826F01147AC63276
tls_client_cert_expires_in_seconds:123456
tls_ca_cert_serial:11A566...
tls_ca_cert_expires_in_seconds:987654
```

---------

Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
Co-authored-by: Yiwen Zhang <yiwen_zhang@apple.com>
hpatro pushed a commit to hpatro/valkey that referenced this pull request Mar 5, 2026
…alkey-io#2913)

- Parse TLS certificate expiry/serials and refresh cached values at
startup and on TLS reconfigure;

- expose INFO # TLS fields so monitoring can alert on impending expiry.

INFO:
```
# TLS
tls_server_cert_serial:826F01147AC63276
tls_server_cert_expires_in_seconds:123456
tls_client_cert_serial:826F01147AC63276
tls_client_cert_expires_in_seconds:123456
tls_ca_cert_serial:11A566...
tls_ca_cert_expires_in_seconds:987654
```

---------

Signed-off-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
Co-authored-by: Yiwen Zhang <yiwen_zhang@apple.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
madolson pushed a commit that referenced this pull request Apr 27, 2026
…#3572)

This reverts commit eb3f465.

As discussed in the TSC Core Meeting, this PR reverts #2999 from Valkey
9.1. We plan to do in next major version as it's a breaking change.

**Note**: Merge conflicts in `src/tls.c` includes were resolved due to
overlapping changes from PR #2913, which added
`tlsUpdateCertInfoFromDir()` with shared header dependencies.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

major-decision-approved Major decision approved by TSC team needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. release-notes This issue should get a line item in the release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[NEW] Add certificate expiration tracking (in days/hours)

6 participants