
[BUG] Valkey cluster is duplicating keys when loading AOF #2995

Description

@marcoffee

Describe the bug

When a key appears in both .base.rdb and .incr.aof (in the latter, inside a MULTI/EXEC block and after another command that manipulates a different key), the server duplicates it when loading the AOF. After that, the primary cannot replicate and all replicas fail with a "duplicate key" error. In addition, if the primary saves its memory contents (via SAVE or BGREWRITEAOF) and restarts, it crashes at startup. This is serious because, once the key is duplicated, the server can no longer be restarted and kept online (e.g., for an update).

To reproduce

The attached data folder reproduces the error. I used the latest Valkey image from Docker Hub (9.0.1-trixie; it also happens on a build of the latest commit on the unstable branch) with the flags --cluster-enabled yes --appendonly yes --enable-debug-command yes.

data.zip

If you start the server with this data folder and run KEYS * via valkey-cli, it displays:

1) "k0"
2) "k1"
3) "k0"

If you run GET k0, it displays:

"a"

Another way to reproduce this behavior is to start a clean cluster (with the same flags), assign all slots via CLUSTER ADDSLOTSRANGE 0 16383, wait for the cluster state to become "ok", and run the following commands on it:

SET k0 a
BGREWRITEAOF
WAITAOF 1 0 60
SET k1 b
MULTI
SET k0 c
APPEND k0 d
EXEC

This creates a key k0 with the value "a", rewrites the AOF in the background and waits for the rewrite to complete, then inserts a key k1 with the value "b", and finally runs a transaction that resets k0 to "c" and appends "d" to it. The AOF must not be rewritten again afterwards, so that the first SET is loaded from the .base.rdb file and the remaining commands are loaded from the .incr.aof file.

Then either restart the server without saving or run DEBUG LOADAOF, and run KEYS * on it. The output should be the same duplicated listing as with the attached data folder (k0, k1, k0).

You can trigger the crash by any of the following:

  • Restarting the server without rewriting AOF followed by running DEBUG RELOAD or DEBUG LOADAOF;
  • Running DEBUG LOADAOF followed by DEBUG RELOAD;
  • Running DEBUG LOADAOF + BGREWRITEAOF + WAITAOF 1 0 60 followed by running DEBUG LOADAOF.

Also, if you start the server with the flag --enable-debug-assert yes, it crashes instantly with a failed assertion on restart or when you run DEBUG LOADAOF.

Note that, until you restart the server or run DEBUG LOADAOF, it will display the expected result (KEYS * == [k0, k1] and GET k0 == "cd").

Expected behavior

It must not duplicate k0:

  • After running KEYS *, it must return only k0 and k1, without duplication;
  • After running GET k0, it must return "cd".
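
The expected state can be checked against a small replay model. The sketch below is only a toy illustration, not Valkey's AOF loader: `replay`, `base`, and `incr` are hypothetical names, and only SET, APPEND, MULTI, and EXEC are modelled. It applies the command from .base.rdb first and then the commands from .incr.aof, as described in the reproduction steps.

```python
def replay(commands, store=None):
    """Apply a list of command tuples to a dict-backed store (toy model)."""
    store = {} if store is None else store
    queued = None  # commands collected between MULTI and EXEC
    for cmd in commands:
        name, *args = cmd
        if name == "MULTI":
            queued = []
        elif name == "EXEC":
            pending, queued = queued, None
            replay(pending or [], store)  # run the queued commands in order
        elif queued is not None:
            queued.append(cmd)
        elif name == "SET":
            key, value = args
            store[key] = value
        elif name == "APPEND":
            key, value = args
            store[key] = store.get(key, "") + value
        else:
            raise ValueError(f"unsupported command: {name}")
    return store


# Content assumed to come from .base.rdb (written by BGREWRITEAOF).
base = [("SET", "k0", "a")]
# Content assumed to come from .incr.aof (written after the rewrite).
incr = [
    ("SET", "k1", "b"),
    ("MULTI",),
    ("SET", "k0", "c"),
    ("APPEND", "k0", "d"),
    ("EXEC",),
]

state = replay(incr, replay(base))
print(sorted(state))  # ['k0', 'k1']
print(state["k0"])    # cd
```

A dict cannot hold the same key twice, so this model always yields the expected listing; the point is only to spell out what a correct load of the two files should produce.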

Additional information

I started debugging the valkey-server code to understand this issue and found that, inside the MULTI/EXEC block, k0 is using the same keyslot as k1 (probably because of keyslot caching that is not overwritten during MULTI/EXEC) and is therefore inserted again at the wrong location.
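
For reference, k0 and k1 should not share a slot. The sketch below follows the documented cluster key-to-slot mapping (CRC16/XMODEM of the key, or of its hash tag if present, modulo 16384). The function names `crc16_xmodem` and `key_slot` are illustrative and are not taken from the Valkey source.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 cluster slots, honouring hash tags."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


print(key_slot("k0"), key_slot("k1"))
print(key_slot("k0") != key_slot("k1"))  # True
```

The values can be compared with CLUSTER KEYSLOT k0 and CLUSTER KEYSLOT k1 on a running node.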

If you comment out this conditional block, the server behaves correctly, but it then always recomputes the keyslot.
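
To make the hypothesis concrete, here is a toy model of how a stale cached slot could produce the observed symptoms. It is an assumption-laden sketch, not Valkey's data structures: `SlottedStore`, its methods, and the `cached_slot` parameter are all hypothetical, and the slot function uses Python's `binascii.crc_hqx` as a stand-in for the server's CRC16.

```python
import binascii

SLOTS = 16384


def key_slot(key: str) -> int:
    # CRC16 (polynomial 0x1021, initial value 0) modulo the slot count.
    return binascii.crc_hqx(key.encode(), 0) % SLOTS


class SlottedStore:
    """Hypothetical store with one dict per slot, as a stand-in for a per-slot keyspace."""

    def __init__(self):
        self.slots = {}

    def set(self, key, value, cached_slot=None):
        # If a cached slot is supplied, it is trusted without recomputation.
        slot = key_slot(key) if cached_slot is None else cached_slot
        self.slots.setdefault(slot, {})[key] = value

    def get(self, key):
        return self.slots.get(key_slot(key), {}).get(key)

    def keys(self):
        return [k for slot in self.slots.values() for k in slot]


store = SlottedStore()
store.set("k0", "a")                     # loaded from .base.rdb
store.set("k1", "b")                     # first command in .incr.aof
stale = key_slot("k1")                   # slot left over from the previous command
store.set("k0", "c", cached_slot=stale)  # SET inside MULTI/EXEC with the stale slot

print(store.keys())     # ['k0', 'k1', 'k0']
print(store.get("k0"))  # a
```

In this model the second k0 lands in k1's slot, so the key listing shows k0 twice while a lookup through the correct slot still returns the old value "a", which matches the KEYS * and GET k0 output reported above.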
