rgw: beast add ssl hot-reload#65842
Conversation
|
jenkins test make check arm64 |
cbodley
left a comment
There was a problem hiding this comment.
thanks, i like the use of shared_ptr to support multiple ssl contexts so requests don't have to block during the reload 👍
cbodley
left a comment
There was a problem hiding this comment.
minor style/formatting feedback based on https://github.com/ceph/ceph/blob/main/CodingStyle
|
jenkins test make check |
|
jenkins test make check arm64 |
mkogan1
left a comment
There was a problem hiding this comment.
Tested working OK - LGTM
@cbodley a note about keep-alive -
configured to rotate every 30 seconds:
./bin/radosgw ... --rgw_frontends="beast port=8000 tcp_nodelay=1 request_timeout_ms=0 ssl_port=8443 ssl_certificate=./rgw.crt ssl_private_key=./rgw.key ssl_reload=30" -f
an hsbench workload that takes longer than 30 seconds to complete (like the example below) was not interrupted while running - it was able to complete successfully even that the cert was rotated during it's execution, the next execution iteration failed because of a certificate error as expected
❯ time (nice numactl -N 1 -m 1 -- env SSL_CERT_FILE=./rgw.pem ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u https://127.0.0.1:8443 -z 4K -d -1 -t $(( $(numactl -N 0 -- nproc) / 2 )) -b $(( $(numactl -N 0 -- nproc) * 2 )) -n 100000 -m cxip -bp b01b- -op 'folder01/stage01_' |& tee hsbench.log | stdbuf -o0 colrm $((${COLUMNS} - 1)) | ccze -Aonolookups)
2025/10/16 12:05:50 Hotsauce S3 Benchmark Version 0.1
2025/10/16 12:05:50 Parameters:
2025/10/16 12:05:50 url=https://127.0.0.1:8443
2025/10/16 12:05:50 object_prefix=folder01/stage01_
2025/10/16 12:05:50 bucket_prefix=b01b-
2025/10/16 12:05:50 region=us-east-1
2025/10/16 12:05:50 modes=cxip
2025/10/16 12:05:50 output=
2025/10/16 12:05:50 json_output=
2025/10/16 12:05:50 max_keys=1000
2025/10/16 12:05:50 object_count=100000
2025/10/16 12:05:50 bucket_count=32
2025/10/16 12:05:50 duration=-1
2025/10/16 12:05:50 threads=8
2025/10/16 12:05:50 loops=1
2025/10/16 12:05:50 size=4K
2025/10/16 12:05:50 interval=1.000000
2025/10/16 12:05:50 Running Loop 0 BUCKET CLEAR TEST
2025/10/16 12:05:50 Loop: 0, Int: TOTAL, Dur(s): 0.5, Mode: BCLR, Ops: 0, MB/s: 0.00, IO/s: 0, Lat(ms): [ min: 0.0, avg: 0.0, 99%: 0.0, max: 0.0 ], Slowdowns: 0
2025/10/16 12:05:50 Running Loop 0 BUCKET DELETE TEST
2025/10/16 12:05:51 Loop: 0, Int: TOTAL, Dur(s): 0.4, Mode: BDEL, Ops: 0, MB/s: 0.00, IO/s: 0, Lat(ms): [ min: 0.0, avg: 0.0, 99%: 0.0, max: 0.0 ], Slowdowns: 0
2025/10/16 12:05:51 Running Loop 0 BUCKET INIT TEST
2025/10/16 12:05:51 FATAL: Unable to create bucket b01b-000000000001 (is your access and secret correct?): RequestError: send request failed
caused by: Put https://127.0.0.1:8443/b01b-000000000001: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate au
( nice numactl -N 1 -m 1 -- env SSL_CERT_FILE=./rgw.pem ~/go/bin/hsbench -a ) 0.22s user 0.10s system 25% cpu 1.255 total
thanks @mkogan1, that's an interesting point. we could potentially disable keepalive on any connections using the old cert to force them to reconnect with the new one. but that would require some additional synchronization that i'm not sure would be worth the complexity |
Adds the `ssl_reload` config option to the beast frontend. This sets an interval in seconds to periodically reload the ssl context to pick up changes without restarting. It can be disabled (default) be setting it to `0`. Fixes: 65470 Signed-off-by: Henry Richter <henry.richter@cloudandheat.com>
|
PR testing completed and approved by @ivancich. Details : https://tracker.ceph.com/issues/73623 |
|
This is an automated message by src/script/redmine-upkeep.py. I have resolved the following tracker ticket due to the merge of this PR: No backports are pending for the ticket. If this is incorrect, please update the tracker Update Log: https://github.com/ceph/ceph/actions/runs/18941318909 |
|
thanks @h3nryc0ding. does rook want this backported? |
| :Description: Optional interval in seconds to periodically recreate the SSL | ||
| context, which reloads the SSL certificate and private key from | ||
| their specified paths. A value of ``0`` disables this feature. | ||
| The reload is non-disruptive to existing connections. If the re- |
There was a problem hiding this comment.
Note for future PRs, please don’t hyphenate words for line breaks, leave them whole.
There was a problem hiding this comment.
Ah, I see what happened. Apparently I forget the -s for fold. Sorry about that!
There was a problem hiding this comment.
I'd rather see a line go over 80 chars than hardcoded hyphenation.
There was a problem hiding this comment.
#66112 entered to fix, as this PR was merged before it could be fixed.
There was a problem hiding this comment.
Thanks for fixing! I'll definitely keep it in mind for future PRs.
| #ifdef __cpp_lib_atomic_shared_ptr | ||
| const auto ssl_ctx = ssl_context.load(std::memory_order_acquire); | ||
| #else | ||
| const auto ssl_ctx = std::atomic_load_excplicit(&ssl_context, std::memory_order_acquire); |
There was a problem hiding this comment.
atomic_load_excplicit
this typo somehow made it through qa, and ubuntu jammy builds are broken on main as a result. i've opened #66113 with a fix
There was a problem hiding this comment.
Yeah, sorry about that. I think #66107 already addresses this.
Adds the
ssl_reloadconfig option to the beast frontend.This sets an interval in seconds to periodically reload the ssl context to pick up changes without restarting. It can be disabled (default) be setting it to
0.Its implemented slightly different the the previous
ssl_short_trustto have less performance impact.Resolves:
Unblocks:
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins test classic perfJenkins Job | Jenkins Job Definitionjenkins test crimson perfJenkins Job | Jenkins Job Definitionjenkins test signedJenkins Job | Jenkins Job Definitionjenkins test make checkJenkins Job | Jenkins Job Definitionjenkins test make check arm64Jenkins Job | Jenkins Job Definitionjenkins test submodulesJenkins Job | Jenkins Job Definitionjenkins test dashboardJenkins Job | Jenkins Job Definitionjenkins test dashboard cephadmJenkins Job | Jenkins Job Definitionjenkins test apiJenkins Job | Jenkins Job Definitionjenkins test docsReadTheDocs | Github Workflow Definitionjenkins test ceph-volume allJenkins Jobs | Jenkins Jobs Definitionjenkins test windowsJenkins Job | Jenkins Job Definitionjenkins test rook e2eJenkins Job | Jenkins Job DefinitionYou must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.