Offload replication writes to IO threads#1485
Conversation
3aee49b to
846c816
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #1485 +/- ##
============================================
- Coverage 71.06% 71.06% -0.01%
============================================
Files 121 121
Lines 65237 65311 +74
============================================
+ Hits 46360 46411 +51
- Misses 18877 18900 +23
|
There was a problem hiding this comment.
fyi - I tested this code and I got this error. I did not see it with HEAD~1
2576783:S 06 Jan 2025 16:06:12.381 # Protocol error (Master using the inline protocol. Desync?) from client: id=6 addr=127.0.0.1:6379 laddr=127.0.0.1:56284 fd=11 name=*redacted* age=43 idle=0 flags=M db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=22870 qbuf-free=18084 argv-mem=0 multi-mem=0 rbs=1024 rbp=42 obl=0 oll=0 omem=0 tot-mem=42880 events=r cmd=set user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=218001645 tot-net-out=1809 tot-cmds=1267450. Query buffer during protocol error: 'SET..$16..key:__rand_int__..$128..VXKeHogKgJ=[5V9_X^b?48OKF2jGA<' (... more 896 bytes ...) 'mcS2^N1J?ELSX@CfKQ7cM5aea\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\Rm\..'
I did run: src/valkey-benchmark -t set -d 128 -n 5000000 --threads 10
valkey with 4 io threads
Looks like data is corrupted, look at the second message:
*3\r $3\r SET\r $16\r key:000000630274\r $128\r VXKeHogKgJ=[5V9_X^b?48OKF2jGA<f:iR@50o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r *3\r $3\r SET\r $16\r key:000000420097\r $128\r VXKeHogKgJ=[5V9_X^b?48OKF2jGA<f:iR@50o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r
The payload o7dS3JV4Q6L68lC[GTA]0DaMg?_oSmcS2^N1J?ELSX@CfKQ7cM5aea\\ngY8a3LGgNVa9eRA46XS8>7ABe1>Jl9O\\Rm\\\r is duplicated.
Many thanks @xbasel for finding it. |
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
76576ac to
2fb5d11
Compare
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2fb5d11 to
46202aa
Compare
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
|
Use-after-free in unit test (ASAN & Valgrind), in freeReplicaReferencedReplBuffer. https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927928217#step:10:147 |
Fixed in: https://github.com/valkey-io/valkey/pull/1697/files |
This PR offloads the write to replica clients to IO threads. ## Main Changes * Replica writes will be offloaded but only after the replica is in online mode.. * Replica reads will still be done in the main thread to reduce complexity and because read traffic from replicas is negligible. ### Implementation Details In order to offload the writes, `writeToReplica` has been split into 2 parts: 1. The write itself made by the IO thread or by the main thread 2. The post write where we update the replication buffers refcount will be done in the main-thread after the write-job is done in the IO thread (similar to what we do with a regular client) ### Additional Changes * In `writeToReplica` we now use `writev` in case more than 1 buffer exists. * Changed client `nwritten` field to `ssize_t` since with a replica the `nwritten` can theoretically exceed `int` size (not subject to `NET_MAX_WRITES_PER_EVENT` limit). * Changed parsing code to use `memchr` instead of `strchr`: * During parsing command, ASAN got stuck for unknown reason when called to `strchr` to look for the next `\r` * Adding assert for null-terminated querybuf didn't resolve the issue. * Switched to `memchr` as it's more secure and resolves the issue ### Testing * Added integration tests * Added unit tests **Related issue:** valkey-io#761 --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com>
This PR offloads the write to replica clients to IO threads. ## Main Changes * Replica writes will be offloaded but only after the replica is in online mode.. * Replica reads will still be done in the main thread to reduce complexity and because read traffic from replicas is negligible. ### Implementation Details In order to offload the writes, `writeToReplica` has been split into 2 parts: 1. The write itself made by the IO thread or by the main thread 2. The post write where we update the replication buffers refcount will be done in the main-thread after the write-job is done in the IO thread (similar to what we do with a regular client) ### Additional Changes * In `writeToReplica` we now use `writev` in case more than 1 buffer exists. * Changed client `nwritten` field to `ssize_t` since with a replica the `nwritten` can theoretically exceed `int` size (not subject to `NET_MAX_WRITES_PER_EVENT` limit). * Changed parsing code to use `memchr` instead of `strchr`: * During parsing command, ASAN got stuck for unknown reason when called to `strchr` to look for the next `\r` * Adding assert for null-terminated querybuf didn't resolve the issue. * Switched to `memchr` as it's more secure and resolves the issue ### Testing * Added integration tests * Added unit tests **Related issue:** valkey-io#761 --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com>
This PR offloads the write to replica clients to IO threads. ## Main Changes * Replica writes will be offloaded but only after the replica is in online mode.. * Replica reads will still be done in the main thread to reduce complexity and because read traffic from replicas is negligible. ### Implementation Details In order to offload the writes, `writeToReplica` has been split into 2 parts: 1. The write itself made by the IO thread or by the main thread 2. The post write where we update the replication buffers refcount will be done in the main-thread after the write-job is done in the IO thread (similar to what we do with a regular client) ### Additional Changes * In `writeToReplica` we now use `writev` in case more than 1 buffer exists. * Changed client `nwritten` field to `ssize_t` since with a replica the `nwritten` can theoretically exceed `int` size (not subject to `NET_MAX_WRITES_PER_EVENT` limit). * Changed parsing code to use `memchr` instead of `strchr`: * During parsing command, ASAN got stuck for unknown reason when called to `strchr` to look for the next `\r` * Adding assert for null-terminated querybuf didn't resolve the issue. * Switched to `memchr` as it's more secure and resolves the issue ### Testing * Added integration tests * Added unit tests **Related issue:** valkey-io#761 --------- Signed-off-by: Uri Yagelnik <uriy@amazon.com>
This PR offloads the write to replica clients to IO threads.
Main Changes
Implementation Details
In order to offload the writes,
writeToReplicahas been split into 2 parts:Additional Changes
writeToReplicawe now usewritevin case more than 1 buffer exists.nwrittenfield tossize_tsince with a replica thenwrittencan theoretically exceedintsize (not subject toNET_MAX_WRITES_PER_EVENTlimit).memchrinstead ofstrchr:strchrto look for the next\rmemchras it's more secure and resolves the issueTesting
Related issue: #761