Reuse RDB file as AOF preamble on replica after disk-based full sync#1901
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #1901 +/- ##
================================
================================
🚀 New features to boost your workflow:
|
|
relying on slow failure-prone components like disks is not a good choice for a high-speed kv engine. For example, Redis 7.0 has been using diskless replication by default since then, which is exactly why they went with this approach |
96eaf5a to
0be9fb9
Compare
I understand your point about diskless replication. However, this PR only optimizes a specific configuration where disk-based replication is used with aof-use-rdb-preamble enabled. In this scenario, we're currently deleting an RDB file only to recreate an identical one immediately after. This PR simply eliminates this redundancy without changing any default behaviors. |
5cf6232 to
0be9fb9
Compare
thx for your kind reply. I have checked the code and get what this pr want |
soloestoy
left a comment
There was a problem hiding this comment.
generally LGTM, just need some minor changes.
|
reuse the sync file as the base AOF file can save lots of resource usage, @valkey-io/core-team PTAL. |
efa9a9f to
18f1bf9
Compare
|
With this change the RDB file that is re-used as the base snapshot of the AOF file, will have it's |
|
This looks like a good idea, but unfortunately it's too late now to make it to 9.0 so I'm punting it to 9.1. @soloestoy do you plan to follow up on this review? |
…ration (#2635) # Problem In the current slot migration design, replicas are completely unaware of the slot migration. Because of this, they do not know to hide importing keys, which results in exposure of these keys to commands like KEYS, SCAN, RANDOMKEY, and DBSIZE. # Design The main part of the design is that we will now listen for and process the `SYNCSLOTS ESTABLISH` command on the replica. When a `SYNCSLOTS ESTABLISH` command is received from the primary, we begin tracking a new slot import in a special `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. Replicas use this state to track the import, and await for a future `SYNCSLOTS FINISH` message that tells them the import is successful/failed. ## Success Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | | | | |~~~~~~~~~~~~~~ snapshot as AOF ~~~~~~~~~~~~~~~~>| | | |~~~~~~ forward snapshot ~~~~~~~~>| |----------- SYNCSLOTS SNAPSHOT-EOF ------------>| | | | | |<----------- SYNCSLOTS REQUEST-PAUSE -----------| | | | | |~~~~~~~~~~~~ incremental changes ~~~~~~~~~~~~~~>| | | |~~~~~~ forward changes ~~~~~~~~~>| |--------------- SYNCSLOTS PAUSED -------------->| | | | | |<---------- SYNCSLOTS REQUEST-FAILOVER ---------| | | | | |---------- SYNCSLOTS FAILOVER-GRANTED --------->| | | | | | (performs takeover & | | propagates topology) | | | | | |------- SYNCSLOTS FINISH ------->| (finds out about topology | | change & marks migration done) | | | | | ``` ## Failure Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | ... ... ... | | | | <FAILURE> | | | | | (performs cleanup) | | | ~~~~~~ UNLINK <key> ... ~~~~~~~>| | | | | | ------ SYNCSLOTS FINISH ------->| | | | ``` ## Full Sync, Partial Sync, and RDB In order to ensure replicas that resync during the import are still aware of the import, the slot import is serialized to a new `cluster-slot-imports` aux field. The encoding includes the job name, the source node name, and the slot ranges being imported. Upon loading an RDB with the `cluster-slot-imports` aux field, replicas will add a new migration in the `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. It's important to note that a previously saved RDB file can be used as the basis for partial sync with a primary. Because of this, whenever we load an RDB file with the `cluster-slot-imports` aux field, even from disk, we will still add a new migration to track the import. If after loading the RDB, the Valkey node is a primary, it will cancel the slot migration. Having this tracking state loaded on primaries will ensure that replicas partial syncing to a restarted primary still get their `SYNCSLOTS FINISH` message in the replication stream. ## AOF Since AOF cannot be used as the basis for a partial sync, we don't necessarily need to persist the `SYNCSLOTS ESTABLISH` and `FINISH` commands to the AOF. However, considering there is work to change this (#59 #1901) this design doesn't make any assumptions about this. We will propagate the `ESTABLISH` and `FINISH` commands to the AOF, and ensure that they can be properly replayed on AOF load to get to the right state. Similar to RDB, if there are any pending "ESTABLISH" commands that don't have a "FINISH" afterwards upon becoming primary, we will make sure to fail those in `verifyClusterConfigWithData`. Additionally, there was a bug in the existing slot migration where slot import clients were not having their commands persisted to AOF. This has been fixed by ensuring we still propagate to AOF even for slot import clients. ## Promotion & Demotion Since the primary is solely responsible for cleaning up unowned slots, primaries that are demoted will not clean up previously active slot imports. The promoted replica will be responsible for both cleaning up the slot (`verifyClusterConifgWithData`) and sending a `SYNCSLOTS FINISH`. # Other Options Considered I also considered tracking "dirty" slots rather than using the slot import state machine. In this setup, primaries and replicas would simply mark each slot's hashtable in the kvstore as dirty when something is written to it and we do not currently own that slot. This approach is simpler, but has a problem in that modules loaded on the replica would still not get slot migration start/end notifications. If the modules on the replica do not get such notifications, they will not be able to properly contain these dirty keys during slot migration events. --------- Signed-off-by: Jacob Murphy <jkmurphy@google.com>
…ration (valkey-io#2635) # Problem In the current slot migration design, replicas are completely unaware of the slot migration. Because of this, they do not know to hide importing keys, which results in exposure of these keys to commands like KEYS, SCAN, RANDOMKEY, and DBSIZE. # Design The main part of the design is that we will now listen for and process the `SYNCSLOTS ESTABLISH` command on the replica. When a `SYNCSLOTS ESTABLISH` command is received from the primary, we begin tracking a new slot import in a special `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. Replicas use this state to track the import, and await for a future `SYNCSLOTS FINISH` message that tells them the import is successful/failed. ## Success Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | | | | |~~~~~~~~~~~~~~ snapshot as AOF ~~~~~~~~~~~~~~~~>| | | |~~~~~~ forward snapshot ~~~~~~~~>| |----------- SYNCSLOTS SNAPSHOT-EOF ------------>| | | | | |<----------- SYNCSLOTS REQUEST-PAUSE -----------| | | | | |~~~~~~~~~~~~ incremental changes ~~~~~~~~~~~~~~>| | | |~~~~~~ forward changes ~~~~~~~~~>| |--------------- SYNCSLOTS PAUSED -------------->| | | | | |<---------- SYNCSLOTS REQUEST-FAILOVER ---------| | | | | |---------- SYNCSLOTS FAILOVER-GRANTED --------->| | | | | | (performs takeover & | | propagates topology) | | | | | |------- SYNCSLOTS FINISH ------->| (finds out about topology | | change & marks migration done) | | | | | ``` ## Failure Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | ... ... ... | | | | <FAILURE> | | | | | (performs cleanup) | | | ~~~~~~ UNLINK <key> ... ~~~~~~~>| | | | | | ------ SYNCSLOTS FINISH ------->| | | | ``` ## Full Sync, Partial Sync, and RDB In order to ensure replicas that resync during the import are still aware of the import, the slot import is serialized to a new `cluster-slot-imports` aux field. The encoding includes the job name, the source node name, and the slot ranges being imported. Upon loading an RDB with the `cluster-slot-imports` aux field, replicas will add a new migration in the `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. It's important to note that a previously saved RDB file can be used as the basis for partial sync with a primary. Because of this, whenever we load an RDB file with the `cluster-slot-imports` aux field, even from disk, we will still add a new migration to track the import. If after loading the RDB, the Valkey node is a primary, it will cancel the slot migration. Having this tracking state loaded on primaries will ensure that replicas partial syncing to a restarted primary still get their `SYNCSLOTS FINISH` message in the replication stream. ## AOF Since AOF cannot be used as the basis for a partial sync, we don't necessarily need to persist the `SYNCSLOTS ESTABLISH` and `FINISH` commands to the AOF. However, considering there is work to change this (valkey-io#59 valkey-io#1901) this design doesn't make any assumptions about this. We will propagate the `ESTABLISH` and `FINISH` commands to the AOF, and ensure that they can be properly replayed on AOF load to get to the right state. Similar to RDB, if there are any pending "ESTABLISH" commands that don't have a "FINISH" afterwards upon becoming primary, we will make sure to fail those in `verifyClusterConfigWithData`. Additionally, there was a bug in the existing slot migration where slot import clients were not having their commands persisted to AOF. This has been fixed by ensuring we still propagate to AOF even for slot import clients. ## Promotion & Demotion Since the primary is solely responsible for cleaning up unowned slots, primaries that are demoted will not clean up previously active slot imports. The promoted replica will be responsible for both cleaning up the slot (`verifyClusterConifgWithData`) and sending a `SYNCSLOTS FINISH`. # Other Options Considered I also considered tracking "dirty" slots rather than using the slot import state machine. In this setup, primaries and replicas would simply mark each slot's hashtable in the kvstore as dirty when something is written to it and we do not currently own that slot. This approach is simpler, but has a problem in that modules loaded on the replica would still not get slot migration start/end notifications. If the modules on the replica do not get such notifications, they will not be able to properly contain these dirty keys during slot migration events. --------- Signed-off-by: Jacob Murphy <jkmurphy@google.com>
…ration (valkey-io#2635) # Problem In the current slot migration design, replicas are completely unaware of the slot migration. Because of this, they do not know to hide importing keys, which results in exposure of these keys to commands like KEYS, SCAN, RANDOMKEY, and DBSIZE. # Design The main part of the design is that we will now listen for and process the `SYNCSLOTS ESTABLISH` command on the replica. When a `SYNCSLOTS ESTABLISH` command is received from the primary, we begin tracking a new slot import in a special `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. Replicas use this state to track the import, and await for a future `SYNCSLOTS FINISH` message that tells them the import is successful/failed. ## Success Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | | | | |~~~~~~~~~~~~~~ snapshot as AOF ~~~~~~~~~~~~~~~~>| | | |~~~~~~ forward snapshot ~~~~~~~~>| |----------- SYNCSLOTS SNAPSHOT-EOF ------------>| | | | | |<----------- SYNCSLOTS REQUEST-PAUSE -----------| | | | | |~~~~~~~~~~~~ incremental changes ~~~~~~~~~~~~~~>| | | |~~~~~~ forward changes ~~~~~~~~~>| |--------------- SYNCSLOTS PAUSED -------------->| | | | | |<---------- SYNCSLOTS REQUEST-FAILOVER ---------| | | | | |---------- SYNCSLOTS FAILOVER-GRANTED --------->| | | | | | (performs takeover & | | propagates topology) | | | | | |------- SYNCSLOTS FINISH ------->| (finds out about topology | | change & marks migration done) | | | | | ``` ## Failure Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | ... ... ... | | | | <FAILURE> | | | | | (performs cleanup) | | | ~~~~~~ UNLINK <key> ... ~~~~~~~>| | | | | | ------ SYNCSLOTS FINISH ------->| | | | ``` ## Full Sync, Partial Sync, and RDB In order to ensure replicas that resync during the import are still aware of the import, the slot import is serialized to a new `cluster-slot-imports` aux field. The encoding includes the job name, the source node name, and the slot ranges being imported. Upon loading an RDB with the `cluster-slot-imports` aux field, replicas will add a new migration in the `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. It's important to note that a previously saved RDB file can be used as the basis for partial sync with a primary. Because of this, whenever we load an RDB file with the `cluster-slot-imports` aux field, even from disk, we will still add a new migration to track the import. If after loading the RDB, the Valkey node is a primary, it will cancel the slot migration. Having this tracking state loaded on primaries will ensure that replicas partial syncing to a restarted primary still get their `SYNCSLOTS FINISH` message in the replication stream. ## AOF Since AOF cannot be used as the basis for a partial sync, we don't necessarily need to persist the `SYNCSLOTS ESTABLISH` and `FINISH` commands to the AOF. However, considering there is work to change this (valkey-io#59 valkey-io#1901) this design doesn't make any assumptions about this. We will propagate the `ESTABLISH` and `FINISH` commands to the AOF, and ensure that they can be properly replayed on AOF load to get to the right state. Similar to RDB, if there are any pending "ESTABLISH" commands that don't have a "FINISH" afterwards upon becoming primary, we will make sure to fail those in `verifyClusterConfigWithData`. Additionally, there was a bug in the existing slot migration where slot import clients were not having their commands persisted to AOF. This has been fixed by ensuring we still propagate to AOF even for slot import clients. ## Promotion & Demotion Since the primary is solely responsible for cleaning up unowned slots, primaries that are demoted will not clean up previously active slot imports. The promoted replica will be responsible for both cleaning up the slot (`verifyClusterConifgWithData`) and sending a `SYNCSLOTS FINISH`. # Other Options Considered I also considered tracking "dirty" slots rather than using the slot import state machine. In this setup, primaries and replicas would simply mark each slot's hashtable in the kvstore as dirty when something is written to it and we do not currently own that slot. This approach is simpler, but has a problem in that modules loaded on the replica would still not get slot migration start/end notifications. If the modules on the replica do not get such notifications, they will not be able to properly contain these dirty keys during slot migration events. --------- Signed-off-by: Jacob Murphy <jkmurphy@google.com>
…ration (#2635) # Problem In the current slot migration design, replicas are completely unaware of the slot migration. Because of this, they do not know to hide importing keys, which results in exposure of these keys to commands like KEYS, SCAN, RANDOMKEY, and DBSIZE. # Design The main part of the design is that we will now listen for and process the `SYNCSLOTS ESTABLISH` command on the replica. When a `SYNCSLOTS ESTABLISH` command is received from the primary, we begin tracking a new slot import in a special `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. Replicas use this state to track the import, and await for a future `SYNCSLOTS FINISH` message that tells them the import is successful/failed. ## Success Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | | | | |~~~~~~~~~~~~~~ snapshot as AOF ~~~~~~~~~~~~~~~~>| | | |~~~~~~ forward snapshot ~~~~~~~~>| |----------- SYNCSLOTS SNAPSHOT-EOF ------------>| | | | | |<----------- SYNCSLOTS REQUEST-PAUSE -----------| | | | | |~~~~~~~~~~~~ incremental changes ~~~~~~~~~~~~~~>| | | |~~~~~~ forward changes ~~~~~~~~~>| |--------------- SYNCSLOTS PAUSED -------------->| | | | | |<---------- SYNCSLOTS REQUEST-FAILOVER ---------| | | | | |---------- SYNCSLOTS FAILOVER-GRANTED --------->| | | | | | (performs takeover & | | propagates topology) | | | | | |------- SYNCSLOTS FINISH ------->| (finds out about topology | | change & marks migration done) | | | | | ``` ## Failure Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | ... ... ... | | | | <FAILURE> | | | | | (performs cleanup) | | | ~~~~~~ UNLINK <key> ... ~~~~~~~>| | | | | | ------ SYNCSLOTS FINISH ------->| | | | ``` ## Full Sync, Partial Sync, and RDB In order to ensure replicas that resync during the import are still aware of the import, the slot import is serialized to a new `cluster-slot-imports` aux field. The encoding includes the job name, the source node name, and the slot ranges being imported. Upon loading an RDB with the `cluster-slot-imports` aux field, replicas will add a new migration in the `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. It's important to note that a previously saved RDB file can be used as the basis for partial sync with a primary. Because of this, whenever we load an RDB file with the `cluster-slot-imports` aux field, even from disk, we will still add a new migration to track the import. If after loading the RDB, the Valkey node is a primary, it will cancel the slot migration. Having this tracking state loaded on primaries will ensure that replicas partial syncing to a restarted primary still get their `SYNCSLOTS FINISH` message in the replication stream. ## AOF Since AOF cannot be used as the basis for a partial sync, we don't necessarily need to persist the `SYNCSLOTS ESTABLISH` and `FINISH` commands to the AOF. However, considering there is work to change this (#59 #1901) this design doesn't make any assumptions about this. We will propagate the `ESTABLISH` and `FINISH` commands to the AOF, and ensure that they can be properly replayed on AOF load to get to the right state. Similar to RDB, if there are any pending "ESTABLISH" commands that don't have a "FINISH" afterwards upon becoming primary, we will make sure to fail those in `verifyClusterConfigWithData`. Additionally, there was a bug in the existing slot migration where slot import clients were not having their commands persisted to AOF. This has been fixed by ensuring we still propagate to AOF even for slot import clients. ## Promotion & Demotion Since the primary is solely responsible for cleaning up unowned slots, primaries that are demoted will not clean up previously active slot imports. The promoted replica will be responsible for both cleaning up the slot (`verifyClusterConifgWithData`) and sending a `SYNCSLOTS FINISH`. # Other Options Considered I also considered tracking "dirty" slots rather than using the slot import state machine. In this setup, primaries and replicas would simply mark each slot's hashtable in the kvstore as dirty when something is written to it and we do not currently own that slot. This approach is simpler, but has a problem in that modules loaded on the replica would still not get slot migration start/end notifications. If the modules on the replica do not get such notifications, they will not be able to properly contain these dirty keys during slot migration events. --------- Signed-off-by: Jacob Murphy <jkmurphy@google.com>
…ration (valkey-io#2635) # Problem In the current slot migration design, replicas are completely unaware of the slot migration. Because of this, they do not know to hide importing keys, which results in exposure of these keys to commands like KEYS, SCAN, RANDOMKEY, and DBSIZE. # Design The main part of the design is that we will now listen for and process the `SYNCSLOTS ESTABLISH` command on the replica. When a `SYNCSLOTS ESTABLISH` command is received from the primary, we begin tracking a new slot import in a special `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. Replicas use this state to track the import, and await for a future `SYNCSLOTS FINISH` message that tells them the import is successful/failed. ## Success Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | | | | |~~~~~~~~~~~~~~ snapshot as AOF ~~~~~~~~~~~~~~~~>| | | |~~~~~~ forward snapshot ~~~~~~~~>| |----------- SYNCSLOTS SNAPSHOT-EOF ------------>| | | | | |<----------- SYNCSLOTS REQUEST-PAUSE -----------| | | | | |~~~~~~~~~~~~ incremental changes ~~~~~~~~~~~~~~>| | | |~~~~~~ forward changes ~~~~~~~~~>| |--------------- SYNCSLOTS PAUSED -------------->| | | | | |<---------- SYNCSLOTS REQUEST-FAILOVER ---------| | | | | |---------- SYNCSLOTS FAILOVER-GRANTED --------->| | | | | | (performs takeover & | | propagates topology) | | | | | |------- SYNCSLOTS FINISH ------->| (finds out about topology | | change & marks migration done) | | | | | ``` ## Failure Case ``` Source Target Target Replica | | | |------------ SYNCSLOTS ESTABLISH -------------->| | | |----- SYNCSLOTS ESTABLISH ------>| |<-------------------- +OK ----------------------| | ... ... ... | | | | <FAILURE> | | | | | (performs cleanup) | | | ~~~~~~ UNLINK <key> ... ~~~~~~~>| | | | | | ------ SYNCSLOTS FINISH ------->| | | | ``` ## Full Sync, Partial Sync, and RDB In order to ensure replicas that resync during the import are still aware of the import, the slot import is serialized to a new `cluster-slot-imports` aux field. The encoding includes the job name, the source node name, and the slot ranges being imported. Upon loading an RDB with the `cluster-slot-imports` aux field, replicas will add a new migration in the `SLOT_IMPORT_OCCURRING_ON_PRIMARY` state. It's important to note that a previously saved RDB file can be used as the basis for partial sync with a primary. Because of this, whenever we load an RDB file with the `cluster-slot-imports` aux field, even from disk, we will still add a new migration to track the import. If after loading the RDB, the Valkey node is a primary, it will cancel the slot migration. Having this tracking state loaded on primaries will ensure that replicas partial syncing to a restarted primary still get their `SYNCSLOTS FINISH` message in the replication stream. ## AOF Since AOF cannot be used as the basis for a partial sync, we don't necessarily need to persist the `SYNCSLOTS ESTABLISH` and `FINISH` commands to the AOF. However, considering there is work to change this (valkey-io#59 valkey-io#1901) this design doesn't make any assumptions about this. We will propagate the `ESTABLISH` and `FINISH` commands to the AOF, and ensure that they can be properly replayed on AOF load to get to the right state. Similar to RDB, if there are any pending "ESTABLISH" commands that don't have a "FINISH" afterwards upon becoming primary, we will make sure to fail those in `verifyClusterConfigWithData`. Additionally, there was a bug in the existing slot migration where slot import clients were not having their commands persisted to AOF. This has been fixed by ensuring we still propagate to AOF even for slot import clients. ## Promotion & Demotion Since the primary is solely responsible for cleaning up unowned slots, primaries that are demoted will not clean up previously active slot imports. The promoted replica will be responsible for both cleaning up the slot (`verifyClusterConifgWithData`) and sending a `SYNCSLOTS FINISH`. # Other Options Considered I also considered tracking "dirty" slots rather than using the slot import state machine. In this setup, primaries and replicas would simply mark each slot's hashtable in the kvstore as dirty when something is written to it and we do not currently own that slot. This approach is simpler, but has a problem in that modules loaded on the replica would still not get slot migration start/end notifications. If the modules on the replica do not get such notifications, they will not be able to properly contain these dirty keys during slot migration events. --------- Signed-off-by: Jacob Murphy <jkmurphy@google.com>
|
@soloestoy @RayaCoo Any update on the progress of this? |
18f1bf9 to
2ecea1a
Compare
Signed-off-by: Ray Cao <zisong.cw@alibaba-inc.com>
2ecea1a to
ae1941d
Compare
Good point. In current core logic, So the reuse path is safe for core behavior. If any external tool currently depends on
I'd prefer to keep it simple right now. What do you think? |
Sorry for the long delay! I've rebased the PR onto the latest unstable branch and reworked the implementation to adapt to the upstream refactoring. @soloestoy @arthurkiller could you please take another look? |
Signed-off-by: Ray Cao <zisong.cw@alibaba-inc.com>
7f4874e to
5370cde
Compare
There was a problem hiding this comment.
mostly LGTM, do you like to take a look? @zuiderkwast
Not really. :) I have looked briefly but I don't know the AOF very well. I want to merge this so it can be included in 9.1. |
- Fix release date to Mon Mar 18 2026 - Fix typos: duplicate 'load', 'keyes' -> 'keys', duplicate 'INFO' - Remove reverted contributor (arshidkv12, valkey-io#3137) - Add 7 new release-notes entries from upstream/unstable merge: CLUSTERSCAN (valkey-io#2934), MSETEX (valkey-io#3121), availability-zone (valkey-io#3156), stream range optimization (valkey-io#3002), RDB as AOF preamble (valkey-io#1901), unsigned 64-bit module config (valkey-io#1546), fast_float -> ffc (valkey-io#3329) Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
…1901) When a replica finishes a **disk-based** full sync and both `appendonly yes` and `aof-use-rdb-preamble yes` are enabled, this PR reuses the received `dump.rdb` as the new AOF base file, instead of triggering `BGREWRITEAOF`. This avoids generating an almost identical base snapshot and removes redundant CPU/IO work. Before: 1. Replica receives RDB via disk-based full sync 2. Replica loads RDB into memory 3. Replica may delete synced RDB (`rdb-del-sync-files`) 4. `BGREWRITEAOF` runs and generates a new base snapshot 5. New snapshot becomes AOF base After: 1. Replica receives RDB via disk-based full sync 2. Replica loads RDB into memory 3. Synced RDB is renamed to the new AOF base file 4. A new incremental AOF is created 5. No `BGREWRITEAOF` is needed Tests: - [x] Disk-based sync + preamble on: RDB reused as AOF base - [x] New writes after sync: incr AOF works correctly - [x] Replica restart: reused AOF loads correctly - [x] Disk-based sync + preamble off: fallback to `BGREWRITEAOF` - [x] Diskless sync + preamble on: fallback to `BGREWRITEAOF` - [x] Diskless sync with stale local `dump.rdb`: stale file is not reused --------- Signed-off-by: Ray Cao <zisong.cw@alibaba-inc.com>
Summary
When a replica finishes a disk-based full sync and both
appendonly yesandaof-use-rdb-preamble yesare enabled, this PR reuses the receiveddump.rdbas the new AOF base file, instead of triggering
BGREWRITEAOF.This avoids generating an almost identical base snapshot and removes redundant CPU/IO work.
Before
rdb-del-sync-files)BGREWRITEAOFruns and generates a new base snapshotAfter
BGREWRITEAOFis neededTests
BGREWRITEAOFBGREWRITEAOFdump.rdb: stale file is not reused