Import-mode: Avoid expiration and eviction during data syncing#1185
Conversation
52be593 to
b23b09b
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #1185 +/- ##
============================================
- Coverage 70.65% 70.57% -0.09%
============================================
Files 114 115 +1
Lines 61799 63171 +1372
============================================
+ Hits 43664 44582 +918
- Misses 18135 18589 +454
|
|
good work! @valkey-io/core-team please take a look |
zuiderkwast
left a comment
There was a problem hiding this comment.
I like it. I think it's a little bit strange, but with some better documentation it's not too strange.
We need to have good support for migration tools, especially for users who want to migrate from proprietary software to open source software. 😄
Redis already have a solution.
I remember this discussion from the Redis times. Do you know what Redis solution is (the same REPLCONF PSEUDO-MASTER?) or is it secret?
Any better idea?
Have you considered the idea to let RediShake act as a primary and let the target database replicate from RediShake? It can act as a replication proxy?
+-----------+ PSYNC +-----------+ PSYNC +--------+
| Source DB |<---------| RediShake |<---------| Valkey |
+-----------+ +-----------+ +--------+
interesting, and we have also considered such a method, but there are several issues. First, as a cloud provider, we do not allow an instance to become a replica for external instances. This is a very risky operation, especially since the primary connection uses a super user, which has excessive permissions. Additionally, the replica needs to establish an outbound connection, this is also not allowed. These restrictions, I believe, are not unique to cloud providers, many users' security control policies also prohibit such actions. Another point is that, in a cluster mode, the source and target instances for migration typically have different slot distributions, and redisShake can help with correctly routing the data. |
soloestoy
left a comment
There was a problem hiding this comment.
Moreover, I think we should only allow write commands for pseudo-master client when server is in pseudo-replica mode.
It's secret. I used to push this PR to Redis Community(#13077) and they say they will PR their implementation. But no news after that.
What @soloestoy said is a major limitation for us. BTW, sometimes the destination already has some data, |
|
i did not read it carefully, internally we simply pause the expiration on both side i guess. |
Interesting. I think this PR can solve that problem if the server has no other read. If the server has other read, one possible solution is to disable all expiration in |
|
@lyq2333 Can you commit the changes to When you run make locally and you have python3 installed, make updates commands.def if there are any new or changed commands. |
Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
…mand Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com>
0655bbc to
7883e9f
Compare
@zuiderkwast Thanks. Sorry forgot it before. I also forgot to sign off and have to force push. No code is modified. |
|
Weekly core meeting. No specific comments other than we should review this offline and make progress. Directionally seems like a good idea. |
|
The scenario we encountered is that some users want to migrate data from Server A to Server B. Both A and B work as primary and B may have some data before the migration. We find expiration and eviction may lead to data inconsistency. So we came up with a simple method, but it still has a few places to discuss. We plan to introduce a new config(import-mode in this PR) to mark that this server is importing data. As for expiration, in import mode, active expiration in cron is prohibited and passive expiration in commands is limited depend on the client state. Clients marked as import source work like
As for eviction, should server disable eviction automatically to ensure data consistency in import mode?Or let users choose the maxmeory-policy, I guess no one will choose an option other than @valkey-io/core-team WDYT? Any suggestions would be greatly appreciated. |
The core issue is that for the target node of data migration that is active, we need to provide normal read and write services. However, we don't want to affect the data being migrated and can't identify which data is being migrated. Method 3 seems to be a relatively suitable option, but of course, it requires users to ensure they do not perform write operations on the data being migrated. |
That's right. It would be good to mention this in the docs. It wouldn't be very secure if a client could get superuser permissions just by sending the command |
|
Looking forward to a doc PR for
In the migration guide, I hope we can have some guide about how to migrate data with import-mode, for example using redishake. |
@suxb201 hi, I think this pr is helpful for RedisShake. PTAL. |
hpatro
left a comment
There was a problem hiding this comment.
my 2 cents: easy to introduce bug in the future where a dev can forget to check the import mode flag. It would be nice to introduce a macro for serverInImportMode / serverInDataServingMode and use it everywhere instead of checking if it's primary and in/not-in import mode independently.
After valkey-io#1185, a client in import-source state can visit expired key both in read commands and write commands, this commit handle keyIsExpired function to handle import-source state as well, so KEYS can visit the expired key. This is not particularly important, but it ensures the definition, also doing some cleanup around the test, verified that the client can indeed visit the expired key. Signed-off-by: Binbin <binloveplay1314@qq.com>
|
@hpatro @enjoy-binbin Sorry for merging too fast. I didn't know you were still reviewing and testing this. |
no worry. i mute the notification at the very first beginning, and only review the code today (review and play with it while dealing the conflicts), the code LGTM overall. |
After #1185, a client in import-source state can visit expired key both in read commands and write commands, this commit handle keyIsExpired function to handle import-source state as well, so KEYS can visit the expired key. This is not particularly important, but it ensures the definition, also doing some cleanup around the test, verified that the client can indeed visit the expired key. Signed-off-by: Binbin <binloveplay1314@qq.com>
- Add new flag "I" in `CLIENT LIST` for import-source client - Add `DEBUG_CONFIG` for import-mode - Allow import-source status to be turned off when import-mode is off Fixes #1350 and #1185 (comment). --------- Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com> Signed-off-by: Yanqi Lv <lvyanqi.lyq@alibaba-inc.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Binbin <binloveplay1314@qq.com>
- Add new flag "I" in `CLIENT LIST` for import-source client - Add `DEBUG_CONFIG` for import-mode - Allow import-source status to be turned off when import-mode is off Fixes valkey-io#1350 and valkey-io#1185 (comment). --------- Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com> Signed-off-by: Yanqi Lv <lvyanqi.lyq@alibaba-inc.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Binbin <binloveplay1314@qq.com>
This PR is for valkey-io/valkey#1185 and valkey-io/valkey#1398. --------- Signed-off-by: lvyanqi.lyq <lvyanqi.lyq@alibaba-inc.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
When import-mode is yes, we might be able to set an expired TTL. At the
same time,
commands like EXPIREAT/EXPIRE do not restrict TTL from being negative.
After we
set import-mode to no, server will crash at:
```
int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) {
long long t = objectGetExpire(val);
serverAssert(t >= 0);
```
In this case, we restrict ttl from being negative in
expireGenericCommand, we simply
change the expiration time to 0 to mark the key as expired since in
import-mode, the
import-source client can always read the expired keys anyway.
import-mode was introduced in #1185
---------
Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
When import-mode is yes, we might be able to set an expired TTL. At the
same time,
commands like EXPIREAT/EXPIRE do not restrict TTL from being negative.
After we
set import-mode to no, server will crash at:
```
int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) {
long long t = objectGetExpire(val);
serverAssert(t >= 0);
```
In this case, we restrict ttl from being negative in
expireGenericCommand, we simply
change the expiration time to 0 to mark the key as expired since in
import-mode, the
import-source client can always read the expired keys anyway.
import-mode was introduced in #1185
---------
Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
When import-mode is yes, we might be able to set an expired TTL. At the
same time,
commands like EXPIREAT/EXPIRE do not restrict TTL from being negative.
After we
set import-mode to no, server will crash at:
```
int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) {
long long t = objectGetExpire(val);
serverAssert(t >= 0);
```
In this case, we restrict ttl from being negative in
expireGenericCommand, we simply
change the expiration time to 0 to mark the key as expired since in
import-mode, the
import-source client can always read the expired keys anyway.
import-mode was introduced in #1185
---------
Signed-off-by: cjx-zar <jxchenczar@foxmail.com>
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
…ey-io#2944) When import-mode is yes, we might be able to set an expired TTL. At the same time, commands like EXPIREAT/EXPIRE do not restrict TTL from being negative. After we set import-mode to no, server will crash at: ``` int activeExpireCycleTryExpire(serverDb *db, robj *val, long long now, int didx) { long long t = objectGetExpire(val); serverAssert(t >= 0); ``` In this case, we restrict ttl from being negative in expireGenericCommand, we simply change the expiration time to 0 to mark the key as expired since in import-mode, the import-source client can always read the expired keys anyway. import-mode was introduced in valkey-io#1185 --------- Signed-off-by: cjx-zar <jxchenczar@foxmail.com> Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
…y#1185) See valkey-io/valkey#1185 and related PRs for more details. Cherry-pick from Valkey. New config: `import-mode (yes|no)` New command: `CLIENT IMPORT-SOURCE (ON|OFF)` The config, when set to `yes`, disables eviction and deletion of expired keys, except for commands coming from a client which has marked itself as an import-source, the data source when importing data from another node, using the CLIENT IMPORT-SOURCE command. When we sync data from the source Valkey to the destination Valkey using some sync tools like [redis-shake](https://github.com/tair-opensource/RedisShake), the destination Valkey can perform expiration and eviction, which may cause data corruption. This problem has been discussed in redis#9760 (reply in thread) and Redis already have a solution. But in Valkey we haven't fixed it by now. E.g. we call `set key 1 ex 1` on the source server and transfer this command to the destination server. Then we call `incr key` on the source server before the key expired, we will have a key on the source server with a value of 2. But when the command arrived at the destination server, the key may be expired and has deleted. So we will have a key on the destination server with a value of 1, which is inconsistent with the source server. In standalone mode, we can use writable replica to simplify the sync process. However, in cluster mode, we still need a sync tool to help us transfer the source data to the destination. The sync tool usually work as a normal client and the destination works as a primary which keep expiration and eviction. In this PR, we add a new mode named 'import-mode'. In this mode, server stop expiration and eviction just like a replica. Notice that this mode exists only in sync state to avoid data inconsistency caused by expiration and eviction. Import mode only takes effect on the primary. Sync tools can mark their clients as an import source by `CLIENT IMPORT-SOURCE`, which work like a client from primary and can visit expired keys in `lookupkey`. **Notice: during the migration, other clients, apart from the import source, should not access the data imported by import source.**
New config:
import-mode (yes|no)New command:
CLIENT IMPORT-SOURCE (ON|OFF)The config, when set to
yes, disables eviction and deletion of expired keys, except for commands coming from a client which has marked itself as an import-source, the data source when importing data from another node, using the CLIENT IMPORT-SOURCE command.When we sync data from the source Valkey to the destination Valkey using some sync tools like redis-shake, the destination Valkey can perform expiration and eviction, which may cause data corruption. This problem has been discussed in redis/redis#9760 (reply in thread) and Redis already have a solution. But in Valkey we haven't fixed it by now.
E.g. we call
set key 1 ex 1on the source server and transfer this command to the destination server. Then we callincr keyon the source server before the key expired, we will have a key on the source server with a value of 2. But when the command arrived at the destination server, the key may be expired and has deleted. So we will have a key on the destination server with a value of 1, which is inconsistent with the source server.In standalone mode, we can use writable replica to simplify the sync process. However, in cluster mode, we still need a sync tool to help us transfer the source data to the destination. The sync tool usually work as a normal client and the destination works as a primary which keep expiration and eviction.
In this PR, we add a new mode named 'import-mode'. In this mode, server stop expiration and eviction just like a replica. Notice that this mode exists only in sync state to avoid data inconsistency caused by expiration and eviction. Import mode only takes effect on the primary. Sync tools can mark their clients as an import source by
CLIENT IMPORT-SOURCE, which work like a client from primary and can visit expired keys inlookupkey.Notice: during the migration, other clients, apart from the import source, should not access the data imported by import source.