Describe the bug
When a key appears in both the .base.rdb and the .incr.aof files (inside a MULTI/EXEC block and after another command that manipulates a different key), the server duplicates it when the AOF is loaded. After that, the primary cannot replicate and all replicas fail with a "duplicate key" error. In addition, if the primary saves its memory contents (via SAVE or BGREWRITEAOF) and restarts, it crashes at startup. This is serious because, once the key is duplicated, the server can no longer restart and stay online (e.g., for an update).
To reproduce
Below is an example of the data folder contents that reproduces the error. I used the latest Valkey image from Docker Hub (9.0.1-trixie; it also happens on a build compiled from the latest commit on the unstable branch) with the flags --cluster-enabled yes --appendonly yes --enable-debug-command yes.
data.zip
If you start the server with this data folder and run KEYS * via valkey-cli it displays
If you run GET k0, it displays
Another way to simulate this behavior is to start a clean cluster (with the previous flags enabled), add the initial slots via CLUSTER ADDSLOTSRANGE 0 16383 (wait for the cluster state to be "ok") and run the following commands on it:
SET k0 a
BGREWRITEAOF
WAITAOF 1 0 60
SET k1 b
MULTI
SET k0 c
APPEND k0 d
EXEC
This creates a key k0 with "a" as its value, rewrites the AOF in the background, waits for the rewrite to complete, then inserts a key k1 with "b" as its value, and finally runs a transaction that resets k0 to "c" and then appends "d" to it. The AOF must not be rewritten again afterwards, so that the first SET comes from the .base.rdb file and the following commands come from the .incr.aof file.
Then, either restart the server without saving or run DEBUG LOADAOF, and run KEYS * on it. The output should be the same as with the data folder above.
You can simulate the crash by either:
- Restarting the server without rewriting the AOF, followed by running DEBUG RELOAD or DEBUG LOADAOF;
- Running DEBUG LOADAOF followed by DEBUG RELOAD;
- Running DEBUG LOADAOF + BGREWRITEAOF + WAITAOF 1 0 60 followed by running DEBUG LOADAOF.
Also, if you start the server with the flag --enable-debug-assert yes, it crashes instantly with a failed assertion on restart or when you run DEBUG LOADAOF.
Note that, until you restart the server or run DEBUG LOADAOF, it will display the expected result (KEYS * == [k0, k1] and GET k0 == "cd").
Expected behavior
It must not duplicate k0:
- After running KEYS *, it must return only k0 and k1, without duplication;
- After running GET k0, it must return "cd".
Additional information
I started debugging the valkey-server code to understand this issue and found that, inside the MULTI/EXEC block, k0 uses the same keyslot as k1 (probably because of a cached keyslot that is not overwritten during MULTI/EXEC) and is therefore inserted again at the wrong location.
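To check that k0 and k1 really belong to different slots, here is a small sketch of the keyslot computation as I understand it from the cluster specification (CRC16/XMODEM of the key, or of its hash tag, modulo 16384). The function name key_slot is mine, not something from the valkey source, and this is only meant as a cross-check, not as the server's implementation.

```python
import binascii

NUM_SLOTS = 16384  # number of hash slots in cluster mode


def key_slot(key: bytes) -> int:
    """Hash slot of a key: CRC16 (XMODEM) of the key or its hash tag, mod 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {...} section is treated as a hash tag.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC-CCITT/XMODEM variant.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


if __name__ == "__main__":
    for name in (b"k0", b"k1"):
        print(name.decode(), key_slot(name))
```

The two printed slots differ, so a k0 that ends up in the slot of k1 is stored somewhere a normal lookup for k0 would not find it.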
If you comment out this conditional block, the server behaves correctly, but it then always recomputes the keyslot.
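To make the suspicion concrete, here is a toy model of how a stale cached slot could produce a duplicate. Everything in it (ToyKeyspace, reuse_cached_slot, replay) is hypothetical and has nothing to do with the actual valkey data structures. It also does not try to explain why only the AOF load path is affected. It only shows that reusing the slot of the previous command inside a transaction is enough to store k0 twice, and that always recomputing the slot avoids it.

```python
import binascii


def key_slot(key: str) -> int:
    # CRC16/XMODEM mod 16384; hash tags are ignored in this toy model.
    return binascii.crc_hqx(key.encode(), 0) % 16384


class ToyKeyspace:
    """Hypothetical per-slot keyspace, NOT valkey's real implementation."""

    def __init__(self, reuse_cached_slot: bool):
        self.tables = {}          # slot -> {key: value}
        self.cached_slot = None   # slot computed for the last top-level command
        self.reuse_cached_slot = reuse_cached_slot

    def _slot_for(self, key, in_transaction):
        # Assumed bug: inside a transaction the cached slot is reused as-is.
        if in_transaction and self.reuse_cached_slot and self.cached_slot is not None:
            return self.cached_slot
        self.cached_slot = key_slot(key)
        return self.cached_slot

    def set(self, key, value, in_transaction=False):
        slot = self._slot_for(key, in_transaction)
        self.tables.setdefault(slot, {})[key] = value

    def append(self, key, suffix, in_transaction=False):
        slot = self._slot_for(key, in_transaction)
        table = self.tables.setdefault(slot, {})
        table[key] = table.get(key, "") + suffix

    def keys(self):
        # Like KEYS *: walk every slot table and list what is stored there.
        return sorted(k for table in self.tables.values() for k in table)


def replay(reuse_cached_slot: bool):
    """Replay the commands from the report against the toy keyspace."""
    ks = ToyKeyspace(reuse_cached_slot)
    ks.set("k0", "a")                          # comes from .base.rdb
    ks.set("k1", "b")                          # caches the slot of k1
    ks.set("k0", "c", in_transaction=True)     # MULTI ... SET k0 c
    ks.append("k0", "d", in_transaction=True)  # APPEND k0 d ... EXEC
    return ks.keys()


if __name__ == "__main__":
    print(replay(reuse_cached_slot=True))   # ['k0', 'k0', 'k1']
    print(replay(reuse_cached_slot=False))  # ['k0', 'k1']
```

With the stale slot, k0 is written a second time into the table that belongs to k1, which matches the duplicated KEYS * output. With the slot recomputed for every command, only one k0 exists.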