@Bouncheck Bouncheck commented Sep 5, 2025

Adds a new option for throttling `addMissingChannels()` in `ChannelPool.java`.
Instead of creating all missing channels for a particular channel pool at once,
the driver creates them in batches of limited size and handles the batches
sequentially.
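
To illustrate the idea of sequential batches, here is a minimal sketch. It is not the driver's code: `ThrottledFillSketch`, `openInBatches`, and the `openChannel` supplier are hypothetical names, and failure handling is reduced to propagating the first error.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration only; not the actual ChannelPool implementation.
final class ThrottledFillSketch {
    private ThrottledFillSketch() {}

    /**
     * Opens `missing` channels in batches of at most `batchLimit`.
     * A batch is started only after every channel of the previous batch has completed.
     */
    static CompletableFuture<Void> openInBatches(
            int missing, int batchLimit, Supplier<CompletableFuture<Void>> openChannel) {
        if (batchLimit < 1) {
            throw new IllegalArgumentException("batchLimit must be >= 1");
        }
        if (missing <= 0) {
            return CompletableFuture.completedFuture(null);
        }
        int size = Math.min(missing, batchLimit);
        CompletableFuture<?>[] batch = new CompletableFuture<?>[size];
        for (int i = 0; i < size; i++) {
            batch[i] = openChannel.get(); // all attempts of one batch start together
        }
        // If any attempt fails, allOf fails and no further batch is started (an assumption
        // of this sketch; a real pool would likely schedule a reconnection instead).
        return CompletableFuture.allOf(batch)
                .thenCompose(ignored -> openInBatches(missing - size, batchLimit, openChannel));
    }

    public static void main(String[] args) {
        AtomicInteger opened = new AtomicInteger();
        Supplier<CompletableFuture<Void>> fakeOpen = () -> {
            opened.incrementAndGet();
            return CompletableFuture.completedFuture(null);
        };
        openInBatches(5, 2, fakeOpen).join();
        System.out.println("opened " + opened.get() + " channels");
    }
}
```

The recursion depth equals the number of batches, which stays small for realistic pool sizes.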

Setting the size to 0 means batches are unlimited, which results in the
usual behavior of creating all channels at once.
The default value is 0.

Setting any other size `N` makes the driver limit each batch to
either `N` or `ceil(target_total_number_of_pool_connections / 2)`,
whichever is smaller.
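
The sizing rule can be sketched as follows. This is a sketch under stated assumptions: `BatchLimitSketch` and `batchLimit` are hypothetical names, and representing "unlimited" as `Integer.MAX_VALUE` is a choice made here, not something taken from the driver.

```java
// Hypothetical helper; only the min(N, ceil(target / 2)) rule comes from the description.
final class BatchLimitSketch {
    private BatchLimitSketch() {}

    /**
     * Returns the maximum number of channels one batch may open.
     *
     * @param configuredSize the option value; 0 means unlimited
     * @param targetPoolSize the target total number of connections in the pool
     */
    static int batchLimit(int configuredSize, int targetPoolSize) {
        if (configuredSize < 0 || targetPoolSize < 1) {
            throw new IllegalArgumentException(
                    "configuredSize must be >= 0 and targetPoolSize must be >= 1");
        }
        if (configuredSize == 0) {
            return Integer.MAX_VALUE; // unlimited: all missing channels go into one batch
        }
        // Integer form of ceil(targetPoolSize / 2) that cannot overflow.
        int halfPoolRoundedUp = targetPoolSize / 2 + targetPoolSize % 2;
        return Math.min(configuredSize, halfPoolRoundedUp);
    }

    public static void main(String[] args) {
        System.out.println(batchLimit(1, 8));  // N is smaller than half the pool
        System.out.println(batchLimit(10, 8)); // half the pool is smaller than N
        System.out.println(batchLimit(3, 5));  // ceil(5 / 2) equals N
    }
}
```

With a pool of 8 connections, for example, `N = 10` is capped at 4, so even a large `N` splits the fill into at least two batches.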

The main motivation for this change is to make use of TLSv1.3
stateless session resumption with session tickets.
Most of the time the driver ends up without usable tickets when connecting,
mainly because of the way Java itself manages those tickets
(see the linked issue for details).
Opening connections in batches lets the driver obtain some tickets
for the following batches, so at least some of the connections can avoid
a full negotiation.
Note that this works best only with OpenJDK 24 and above, since previous versions
do not cache multiple sessions per host to allow for simultaneous resumption.
Oracle OpenJDK 24 stores at most 10 sessions per host.
I did not check other Java vendors.

This feature can also be used outside of the NST (NewSessionTicket) session
resumption scenario. Throttling by itself should also bring some relief in case
of mass reconnections.

Fixes #444.

@Bouncheck Bouncheck self-assigned this Sep 5, 2025
@Bouncheck Bouncheck force-pushed the scylla-4.x-throttle-filling-channelPools branch from 8d468f2 to 4680d2d Compare September 5, 2025 15:18
@Bouncheck Bouncheck requested a review from dkropachev September 8, 2025 13:14
Modifies the current implementation of `addMissingChannels` in `ChannelPool`
to make use of the newly added option.

The default behavior remains unchanged.

Adds a method to get contact points with the default shard-aware port
from `ccmBridge`.

Adds `all_reconnections_but_one_should_use_tickets_when_throttled_TLSv13`
to `SessionTicketsIT`.

This test checks whether throttling reconnections to a batch size of 1
helps with session resumption. Since the Java 11 cache can store
information for only 1 session resumption, this should in theory allow resumptions
to happen if the connections are established sequentially.
The test relies on the server sending NewSessionTicket messages in a timely
manner.
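
The reasoning behind the test can be illustrated with a toy model. It does not touch JSSE at all, and its rules are assumptions made for illustration: a ticket is consumed when used, every completed handshake yields one new ticket before the next batch starts, and the cache keeps at most `cacheCapacity` tickets. `TicketCacheToyModel` and `countResumed` are hypothetical names.

```java
// Toy arithmetic model only; the real behavior depends on the JDK's session cache.
final class TicketCacheToyModel {
    private TicketCacheToyModel() {}

    /** Counts how many of `connections` handshakes could resume under the model's rules. */
    static int countResumed(int connections, int batchSize, int cacheCapacity) {
        if (batchSize < 1 || cacheCapacity < 0 || connections < 0) {
            throw new IllegalArgumentException("invalid model parameters");
        }
        int cachedTickets = 0; // the model starts with an empty cache
        int resumed = 0;
        int remaining = connections;
        while (remaining > 0) {
            int batch = Math.min(batchSize, remaining);
            // Connections in a batch start together and compete for the cached tickets.
            int resumedNow = Math.min(batch, cachedTickets);
            cachedTickets -= resumedNow;
            resumed += resumedNow;
            // Assumption: each finished handshake delivers one ticket before the next batch.
            cachedTickets = Math.min(cacheCapacity, cachedTickets + batch);
            remaining -= batch;
        }
        return resumed;
    }

    public static void main(String[] args) {
        System.out.println(countResumed(8, 1, 1)); // sequential, single-entry cache
        System.out.println(countResumed(8, 8, 1)); // everything at once
    }
}
```

Under these assumptions, 8 sequential connections with a single-entry cache resume 7 times (all but the first), while opening all 8 at once resumes none, which matches the expectation encoded in the test name.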
@Bouncheck Bouncheck force-pushed the scylla-4.x-throttle-filling-channelPools branch from 4680d2d to d0c61a0 Compare September 9, 2025 17:34

@dkropachev dkropachev left a comment

Looks good. Can you please add `Fixes:` to the PR description and add some information to the issue on why we solved it in such a weird way?

@Bouncheck

Added "Fixes".
I'm not sure what else to write. I think the description already explains why this approach works and that other solutions, like reimplementing Java SSL internals, look too big to maintain.

@dkropachev dkropachev merged commit 6932f11 into scylladb:scylla-4.x Sep 16, 2025
10 of 11 checks passed

Development

Successfully merging this pull request may close these issues.

4.x: Support TLS tickets for quick TLS renegotiation
