Skip to content

Offload reading the replication stream to IO threads#1449

Merged
zuiderkwast merged 2 commits into
valkey-io:unstablefrom
uriyage:offload-primary-io
Jan 2, 2025
Merged

Offload reading the replication stream to IO threads#1449
zuiderkwast merged 2 commits into
valkey-io:unstablefrom
uriyage:offload-primary-io

Conversation

@uriyage

@uriyage uriyage commented Dec 17, 2024

Copy link
Copy Markdown
Contributor

Support Primary client IO offload.

Related issue: #761

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
@codecov

codecov Bot commented Dec 17, 2024

Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 68.75000% with 5 lines in your changes missing coverage. Please review.

Project coverage is 70.78%. Comparing base (980a801) to head (39f0a48).
Report is 29 commits behind head on unstable.

Files with missing lines Patch % Lines
src/io_threads.c 0.00% 3 Missing ⚠️
src/networking.c 81.81% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##           unstable    #1449   +/-   ##
=========================================
  Coverage     70.78%   70.78%           
=========================================
  Files           119      119           
  Lines         64691    64740   +49     
=========================================
+ Hits          45790    45829   +39     
- Misses        18901    18911   +10     
Files with missing lines Coverage Δ
src/replication.c 87.43% <100.00%> (-0.08%) ⬇️
src/server.h 100.00% <ø> (ø)
src/networking.c 88.35% <81.81%> (+0.06%) ⬆️
src/io_threads.c 6.94% <0.00%> (-0.58%) ⬇️

... and 20 files with indirect coverage changes

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a small diff! Nice.

Comment thread src/io_threads.c Outdated
@zuiderkwast

Copy link
Copy Markdown
Contributor

The PR title isn't very clear for someone who doesn't know the internals.

I believe a user could better understand a title like "Offload reading the replication stream to IO threads".

@zuiderkwast zuiderkwast added the release-notes This issue should get a line item in the release notes label Dec 17, 2024
Comment thread src/networking.c
@enjoy-binbin enjoy-binbin added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Dec 18, 2024
Comment thread src/replication.c Outdated
@uriyage uriyage changed the title Support primary IO offload to IO threads Offload reading the replication stream to IO threads Dec 18, 2024
@uriyage

uriyage commented Dec 18, 2024

Copy link
Copy Markdown
Contributor Author

The PR title isn't very clear for someone who doesn't know the internals.

I believe a user could better understand a title like "Offload reading the replication stream to IO threads".

Thanks, changed.

Signed-off-by: Uri Yagelnik <uriy@amazon.com>

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I will not merge it yet in case others want to review it a bit more.

@jdheyburn

Copy link
Copy Markdown
Contributor

To the layman such as myself, what benefit does this introduce WRT replication?

@zuiderkwast

Copy link
Copy Markdown
Contributor

To the layman such as myself, what benefit does this introduce WRT replication?

Replicas can serve more readonly traffic when reading the replication stream can be offloaded to an IO thread instead of being handled by the bottleneck main thread. It's useful in setups where replicas are used for readonly traffic.

It seems more beneficial to let primaries offload the writes to replicas to IO threads. Hopefully, that will come in another PR. @uriyage WDYT?

@uriyage

uriyage commented Dec 22, 2024

Copy link
Copy Markdown
Contributor Author

To the layman such as myself, what benefit does this introduce WRT replication?

Replicas can serve more readonly traffic when reading the replication stream can be offloaded to an IO thread instead of being handled by the bottleneck main thread. It's useful in setups where replicas are used for readonly traffic.

It seems more beneficial to let primaries offload the writes to replicas to IO threads. Hopefully, that will come in another PR. @uriyage WDYT?

@zuiderkwast Yes, I will open a new PR in the coming days for offloading the writes to replicas. It is a bit more complicated, which is why I started with the primary writes offload.

@ranshid

ranshid commented Jan 1, 2025

Copy link
Copy Markdown
Member

@uriyage I think we need to check the failed test in TLS which I can see:

Error writing to client: error:0A00010F:SSL routines::bad length

Maybe it is related (is it the use of writev?)

@ranshid

ranshid commented Jan 1, 2025

Copy link
Copy Markdown
Member

I rerun the test to see if this is a stable fail

@uriyage

uriyage commented Jan 2, 2025

Copy link
Copy Markdown
Contributor Author

I rerun the test to see if this is a stable fail

The test runs without IO threads, in this case, the PR changes have no effect, meaning the failure is unrelated to this PR changes

@zuiderkwast zuiderkwast merged commit 35abb68 into valkey-io:unstable Jan 2, 2025
@zuiderkwast

Copy link
Copy Markdown
Contributor

I don't want to block PRs that are ready to merge. We should really focus on the failing tests instead of starting too many new features though.

kronwerk pushed a commit to kronwerk/valkey that referenced this pull request Jan 27, 2025
Support Primary client IO offload.

Related issue: valkey-io#761

---------

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes This issue should get a line item in the release notes run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants