Automatic TLS reload #3020

Merged
madolson merged 7 commits into valkey-io:unstable from yang-z-o:tls_cert_auto_reload
Jan 21, 2026

Conversation

@yang-z-o

yang-z-o commented Jan 7, 2026

Contributor

Overview

This PR adds support for automatic background TLS reloading and closes #2649.
TLS validity checks and fail-fast behavior on invalid certificates are handled separately in #2999.

  • New configuration
    • tls-auto-reload-interval <seconds>
    • 0: disabled (default, backward compatible)
    • >0: check interval in seconds
  • TLS materials change detection in background
    • SHA-256 fingerprint checking for certificate files
    • inode + mtime checking for CA certificate directories and key files
    • Skips the reload if the materials haven't changed (tlsCheckMaterialsAndUpdateCache)
  • TLS contexts reload
    • CPU-intensive certificate parsing happens in a dedicated BIO worker thread (BIO_TLS_RELOAD)
    • The main thread never blocks; it atomically swaps the SSL contexts
    • Two-phase reload: background preparation (tlsConfigureAsync) followed by main-thread application (tlsApplyPendingReload)
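
The change-detection idea in the list above can be illustrated with a minimal Python sketch. It is not the PR's C code: the helper names (`file_sha256`, `inode_mtime`, `snapshot`, `materials_changed`) are hypothetical, and the split between hashed paths and stat-checked paths simply mirrors the description (content hash for certificate files, inode + mtime for key files and CA directories).

```python
import hashlib
import os


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inode_mtime(path):
    """Return (inode, mtime in nanoseconds) for a file or directory."""
    st = os.stat(path)
    return (st.st_ino, st.st_mtime_ns)


def snapshot(cert_files, stat_paths):
    """Collect a comparable fingerprint of all TLS materials.

    cert_files: paths whose contents are hashed (certificates).
    stat_paths: paths checked by inode + mtime only (key files, CA dirs).
    """
    return {
        "certs": {p: file_sha256(p) for p in cert_files},
        "stats": {p: inode_mtime(p) for p in stat_paths},
    }


def materials_changed(cached, current):
    """A reload is only worth scheduling when the two snapshots differ."""
    return cached != current
```

A periodic check would take a fresh `snapshot`, compare it with the cached one, and skip the reload when `materials_changed` returns `False`.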

Note: The original TLS load and reload paths still run on the main thread via tlsConfigureSync, including:

  • Initial TLS load (server startup)
  • Runtime reload via CONFIG SET
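
To make the two-phase split of the background path concrete, here is a rough Python sketch, assuming a worker thread that builds a new context and a main loop that later picks it up. The class `TlsReloader` and its methods are hypothetical illustrations, not the PR's `tlsConfigureAsync` / `tlsApplyPendingReload` implementation, and a plain string stands in for an SSL context.

```python
import threading


class TlsReloader:
    def __init__(self, initial_ctx):
        self.active_ctx = initial_ctx  # only touched by the main thread
        self._pending = None           # handed over by the worker thread
        self._lock = threading.Lock()  # held only for the hand-over itself

    def prepare_in_background(self, build_ctx):
        """Phase 1: run the expensive build on a worker thread."""
        def job():
            try:
                ctx = build_ctx()
            except Exception:
                return  # build failed: leave the active context alone
            with self._lock:
                self._pending = ctx

        worker = threading.Thread(target=job)
        worker.start()
        return worker

    def apply_pending(self):
        """Phase 2: called from the main loop; swap if a context is ready."""
        with self._lock:
            ctx, self._pending = self._pending, None
        if ctx is None:
            return False
        self.active_ctx = ctx
        return True
```

In this sketch the main loop would call `apply_pending()` on each tick. The expensive work stays inside `build_ctx` on the worker, so the main thread only ever performs the cheap swap.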

Test results

# Server starts with TLS enabled and accepts connections
43123:M 08 Jan 2026 01:41:01.184 * Ready to accept connections tls
43123:M 08 Jan 2026 01:41:01.184 . Total: 0 clients connected (0 replicas), 1131840 (1.08M) bytes in use
43123:M 08 Jan 2026 01:41:03.196 - Accepted 127.0.0.1:65152

#  Certificate change detected, background reload triggered, reload completed and applied
43123:M 08 Jan 2026 01:41:06.236 * TLS materials changed, triggering background reload
43123:M 08 Jan 2026 01:41:06.236 . Background TLS reconfiguration started
43123:M 08 Jan 2026 01:41:06.237 . Background TLS reload completed successfully
43123:M 08 Jan 2026 01:41:07.249 * TLS materials reloaded successfully

# Connection after reload: Connections continue working after the reload
43123:M 08 Jan 2026 01:41:08.244 - Accepted 127.0.0.1:65156

@dvkashapov dvkashapov added the major-decision-pending Major decision pending by TSC team label Jan 7, 2026
@yang-z-o

yang-z-o commented Jan 8, 2026

Contributor Author

Hi @madolson @zuiderkwast,
I’ve implemented TLS auto-reload following the approach discussed in #1870 and #2650.
Would appreciate any feedback when you have a chance. Thanks!

@yang-z-o

yang-z-o commented Jan 8, 2026

Contributor Author

On a side note:
The code-coverage job succeeded in my fork but failed here today because of the benchmark timeout - related issue #2843

@yang-z-o yang-z-o changed the title from "Automatic TLS certificate reload" to "Automatic TLS reload" Jan 8, 2026
@yang-z-o yang-z-o force-pushed the tls_cert_auto_reload branch from 5b53e48 to 55bac7c on January 8, 2026 10:04
Signed-off-by: Yang Zhao <zymy701@gmail.com>
@yang-z-o yang-z-o force-pushed the tls_cert_auto_reload branch from 55bac7c to bdadaf0 on January 8, 2026 22:55
@codecov

codecov Bot commented Jan 9, 2026


Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.27%. Comparing base (1d23866) to head (26bff7c).
⚠️ Report is 23 commits behind head on unstable.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/bio.c | 0.00% | 5 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3020      +/-   ##
============================================
+ Coverage     74.26%   74.27%   +0.01%     
============================================
  Files           129      129              
  Lines         70975    71014      +39     
============================================
+ Hits          52710    52747      +37     
- Misses        18265    18267       +2     

| Files with missing lines | Coverage Δ |
|---|---|
| src/config.c | 78.76% <ø> (ø) |
| src/server.c | 89.48% <ø> (+0.05%) ⬆️ |
| src/server.h | 100.00% <ø> (ø) |
| src/tls.c | 100.00% <ø> (ø) |
| src/bio.c | 78.81% <0.00%> (-3.49%) ⬇️ |

... and 18 files with indirect coverage changes

Copilot AI left a comment

Contributor

Pull request overview

This PR implements automatic background TLS certificate reloading to allow seamless certificate rotation without server downtime. The feature adds a new tls-auto-reload-interval configuration option that enables periodic checking and reloading of TLS materials (certificates, keys, and CA files).

Key changes:

  • New tls-auto-reload-interval configuration parameter (default: 0/disabled) to control reload frequency
  • Change detection using SHA-256 fingerprints for certificate files and inode+mtime for directories and key files
  • Background reload architecture using dedicated BIO worker thread (BIO_TLS_RELOAD) to avoid blocking the main thread during certificate parsing

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 9 comments.

| File | Description |
|---|---|
| valkey.conf | Added documentation for the new tls-auto-reload-interval configuration option |
| src/config.c | Registered the new tls-auto-reload-interval configuration parameter |
| src/tls.c | Refactored TLS configuration into sync/async variants, added change detection logic, and implemented background reload mechanism |
| src/server.h | Added function declarations for TLS reload functions with appropriate build guards |
| src/server.c | Integrated TLS auto-reload checks into serverCron |
| src/bio.h | Added BIO_TLS_RELOAD job type and bioCreateTlsReloadJob function declaration |
| src/bio.c | Implemented BIO worker support for TLS reload jobs |
| tests/unit/tls.tcl | Added comprehensive test coverage for auto-reload functionality including change detection, validation, and CA directory handling |

Comment thread tests/unit/tls.tcl Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c Outdated
Comment thread tests/unit/tls.tcl Outdated
Comment thread tests/unit/tls.tcl Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c
Comment thread src/tls.c Outdated
Comment thread src/tls.c Outdated

@madolson madolson left a comment

Member

Thanks, most of it looks really good. Two more significant comments and minor stylistic comments.

Comment thread src/bio.c
Comment thread src/server.h Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c Outdated
Comment thread tests/unit/tls.tcl Outdated
Comment thread src/tls.c Outdated
Comment thread src/tls.c Outdated
@madolson

madolson commented Jan 9, 2026

Member

> Would appreciate any feedback when you have a chance. Thanks!

Thanks! Will take a look at the rest shortly. It's been a busy start to the year, but we should definitely get these merged for 9.1!

@madolson madolson added the release-notes This issue should get a line item in the release notes label Jan 9, 2026
@yang-z-o

Contributor Author

Thanks Madelyn for the review! Really appreciate you taking the time. I’ll address the comments soon.

Signed-off-by: Yang Zhao <zymy701@gmail.com>
@yang-z-o

yang-z-o commented Jan 12, 2026

Contributor Author

Planning changes based on the review comments:

  • Introduce a new src/tls.h
  • Move TLS materials checking to the background thread
  • Introduce a structure for pending SSL contexts and their TLS materials metadata
  • Update the cached metadata after the actual parsed contexts are applied in tlsApplyPendingReload

@madolson madolson left a comment

Member

Major decision approved in weekly meeting.

Comment thread valkey.conf
@madolson madolson added major-decision-approved Major decision approved by TSC team and removed major-decision-pending Major decision pending by TSC team labels Jan 12, 2026
@zuiderkwast

Contributor

Just one question: Key file and cert file need to be updated together, because the private key in the key file needs to match the public key in the certificate. Is there a race here? I mean if you replace the cert file and the key file on disk and the automatic reload kicks in and reloads only one of the files, is that a problem?

I remember we made CONFIG SET accept multiple parameters (in Redis 7.0) and the main benefit of that change was to be able to set tls-cert-file and tls-key-file together atomically.

@yang-z-o

Contributor Author

> I mean if you replace the cert file and the key file on disk and the automatic reload kicks in and reloads only one of the files, is that a problem?

I don’t think this should be a problem in practice. If only one of the cert or key files is updated and an automatic reload is triggered, createSSLContext will fail when the certificate and key don’t match. In that case, we keep using the existing SSL contexts.

With the proposed changes:

  • Introduce a structure for pending SSL contexts and their TLS materials metadata
  • Update the cached metadata after the actual parsed contexts are applied in tlsApplyPendingReload

a subsequent reload attempt will still be triggered once both files are updated, so the system can converge to a consistent state.

> I remember we made CONFIG SET accept multiple parameters (in Redis 7.0) and the main benefit of that change was to be able to set tls-cert-file and tls-key-file together atomically.

Yes this is a great feature. The mismatch would be problematic for CONFIG SET since that path reloads synchronously and fails fast, but for background auto-reload it should be handled safely.
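
The convergence argument in this reply can be sketched as a small Python simulation. Everything here is a hypothetical stand-in: `build_context` plays the role of context creation failing on a cert/key mismatch, a `pair_id` field stands in for the real public/private key check, and `AutoReloader.tick` models one periodic check. The one behavior it is meant to show is that the cached metadata is updated only after a successful apply, so a half-updated pair is retried on later checks.

```python
def build_context(cert, key):
    """Stand-in for context creation: fails when cert and key do not match."""
    if cert["pair_id"] != key["pair_id"]:
        raise ValueError("certificate and private key do not match")
    return {"cert": cert, "key": key}


class AutoReloader:
    def __init__(self, cert, key):
        self.active = build_context(cert, key)
        self.applied = (cert, key)  # metadata of the materials in use

    def tick(self, cert_on_disk, key_on_disk):
        """One periodic check; returns a short status string."""
        if (cert_on_disk, key_on_disk) == self.applied:
            return "unchanged"
        try:
            ctx = build_context(cert_on_disk, key_on_disk)
        except ValueError:
            # Keep the old context and do NOT update the cache, so the
            # next tick still sees a difference and tries again.
            return "failed"
        self.active = ctx
        self.applied = (cert_on_disk, key_on_disk)
        return "reloaded"
```

If only the certificate has been replaced so far, each `tick` reports `"failed"` and the old context stays active. Once the matching key lands on disk, the next `tick` reports `"reloaded"`.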

Signed-off-by: Yang Zhao <zymy701@gmail.com>
… load latest

Signed-off-by: Yang Zhao <zymy701@gmail.com>
Signed-off-by: Yang Zhao <zymy701@gmail.com>
@yang-z-o

Contributor Author

Hi @madolson @zuiderkwast, thanks again for the review! I’ve addressed the feedback and pushed the updates. Happy to make further changes if needed.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.


Comment thread src/tls.c
Comment thread src/tls.c
Comment thread tests/unit/tls.tcl
Comment thread src/tls.c
Comment thread src/bio.c
Signed-off-by: Yang Zhao <zymy701@gmail.com>

@zuiderkwast zuiderkwast left a comment

Contributor

LGTM.

I only added a minor suggestion about the documentation of the config.

We should also add some information in https://github.com/valkey-io/valkey-doc/blob/main/topics/encryption.md (which btw could use some more improvements. We should mention how to use it first. How to build it can be moved further down or just be in the README file of the valkey repo instead.)

Comment thread valkey.conf Outdated
Signed-off-by: Yang Zhao <zymy701@gmail.com>
@yang-z-o

Contributor Author

Hi @zuiderkwast, thank you for the review!

> We should also add some information in https://github.com/valkey-io/valkey-doc/blob/main/topics/encryption.md (which btw could use some more improvements. We should mention how to use it first. How to build it can be moved further down or just be in the README file of the valkey repo instead.)

Agree the "how to build" part could stay in the valkey repo README, and the doc might be better focused on “how to use”.
I’ll raise a PR to update this page once the feature is released. Just wondering how we normally track things like this; shall I raise an issue in valkey-doc for now?

@zuiderkwast zuiderkwast added the needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. label Jan 20, 2026
@zuiderkwast

Contributor

> Just wondering how do we normally track things like this, shall I raise an issue in valkey-doc for now?

We don't really need an issue but you can open one if you want to keep track.

We often use the needs-doc-pr label to keep track. When you open a doc PR, include a link to this PR and then the links will be visible in both directions.

You could also include any changes for your other TLS related PRs in the same doc PR.

> I’ll raise a PR to update this page once the feature is released.

Please don't wait until after Valkey 9.1 is released. We would like the docs to be ready before the GA release date. It's OK to do the docs after this PR is merged though.

@madolson madolson moved this to Todo in Valkey 9.1 Jan 21, 2026

@madolson madolson left a comment

Member

Looks good to me, thanks for the work!

@madolson madolson merged commit 3c092ca into valkey-io:unstable Jan 21, 2026
57 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in Valkey 9.1 Jan 21, 2026
@madolson

Member

> I’ll raise a PR to update this page once the feature is released.

Agree with Viktor, open the PR whenever for TLS.

arshidkv12 pushed a commit to arshidkv12/valkey that referenced this pull request Jan 23, 2026
Signed-off-by: Yang Zhao <zymy701@gmail.com>
Signed-off-by: arshidkv12 <arshidkv12@gmail.com>
@yang-z-o

Contributor Author

Opened the doc PR for TLS: valkey-io/valkey-doc#402

zuiderkwast pushed a commit to valkey-io/valkey-doc that referenced this pull request Feb 3, 2026
Changes include:
- Unify TLS topic naming: previously some references used “encryption”
while others used “tls”
- Remove source code repo specific information: instructions on building
or running unit tests
- Add information on new TLS feature and behavior:
   - valkey-io/valkey#2999
   - valkey-io/valkey#3020

---------

Signed-off-by: Yang Zhao <zymy701@gmail.com>
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
Signed-off-by: Yang Zhao <zymy701@gmail.com>
hpatro pushed a commit to hpatro/valkey that referenced this pull request Mar 5, 2026
Signed-off-by: Yang Zhao <zymy701@gmail.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
SpaceFarmer added a commit to SUNET/puppet-sunet that referenced this pull request Apr 10, 2026
Automatic reload might replace this need in the future:
valkey-io/valkey#3020

Labels

  • major-decision-approved: Major decision approved by TSC team
  • needs-doc-pr: This change needs to update a documentation page. Remove label once doc PR is open.
  • release-notes: This issue should get a line item in the release notes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[NEW] Automatic TLS Certificate Reload

5 participants