Skip to content

Conversation

@kingpeterpaule
Copy link
Contributor

Consider following commands order:

  1. set key 10
  2. expire key 120

Usually the master and slave have different expire time point and the differ should be very small. But If the expire command is not replicated to slave and some network problem happens or slave down.
When slave reboot, it send psync2 to master, and psync is accepted. In this scenario, the differ may be large and slave have newer expire time point. More serious, if failover happens, the slave becomes new master, a user may find that a already expired key which is in old master that could visit in new master.

Data inconsistent happens.

@trevor211
Copy link
Contributor

trevor211 commented Jul 26, 2018

I added a test, it passed without your PR.

trevor211 added a commit to trevor211/redis that referenced this pull request Jul 27, 2018
@trevor211
Copy link
Contributor

trevor211 commented Jul 27, 2018

Forgive me, my previous tcl test was wrong.
I updated it, and it supports your PR. @kingpeterpaule
Please take a look at here. @antirez

@kingpeterpaule
Copy link
Contributor Author

Hi @trevor211 thanks~~

@0xtonyxia
Copy link
Contributor

Hi @kingpeterpaule , i think you should also consider setex command and set command with ex or px option.

@antirez
Copy link
Contributor

antirez commented Jul 31, 2018

Hello and thanks for submitting this. I don't think we should fix this issue. I ping @oranagra and @soloestoy to see other POVs, but IMHO it is more important than two instances, a master and a slave, with a very different absolute clock value, are still able from the external to have more or less the same TTL visibility of a key, compared to fixing this "key reappearing" problem, that is btw not a consistency problem per se (because master and slave will have identical data here, there is no technical desync).

To better focus how this patch impacts Redis is a much more serious way that it attempts to fix problems, see this:

Master A clock is X.
Slave B clock is X+10.

I send to the master SET foo bar EX 8.

I'm using Slave B to scale reads. However GET foo in slave B will return NILL immediately.

This is a far bigger problem than the fact that a key that expired in the master, in case of a replication link delay and a failover in the middle, may re-appear.

Note that without failover, there is no actual problem, because in the replication stream there will be immediately a DEL following, since the master expired the key.

Taking this open for further discussion but IMHO there is nothing to fix...

@oranagra
Copy link
Member

Hi,
On a first look, it feels to me that we should fix it, since it would be closer to what would happen on a full sync (the RDB carries an absolute TTL).
Personally, i would rather have the dataset and behavior of the slave on GET to be identical in PSYNC and full-sync. but this is a compromise between two things and each of us can have a different feeling.

Maybe a more serious problem here, or another view of the same / similar problem, is what happens when an AOF is loaded.
for instance, let's assume a key with TTL of 24 hours, and 23 hours after it was added the server dies and recovers from AOF... do we want the key to live for another 24 hours, or expire after an hour?

one way to look at it, may be to imagine that we respect the user's intent, so if he used EXPIRE and not EXPIREAT, the key should live for another 24 hours?
but i guess most users choose between these commands for convenience and not by special considerations.

p.s. i imagine that if we fix this, we may need to fix other commands (like RESTORE)

@soloestoy
Copy link
Contributor

I will review this PR carefully @antirez .

one way to look at it, may be to imagine that we respect the user's intent, so if he used EXPIRE and not EXPIREAT, the key should live for another 24 hours?

But I have to say that @oranagra, we rewrite EXPIRE as EXPIREAT when propagating to AOF.

@oranagra
Copy link
Member

@soloestoy i think i remember that change, but for some reason i can't find it in the code.. can you give me a hint?
anyway, if we do that for AOF, maybe that can be an indication that we should do that for slaves too?
anyway, i don't strongly feel about it either way.. it's a compromise.

@soloestoy
Copy link
Contributor

Hi @oranagra , it's function feedAppendOnlyFile() in aof.c.

    if (cmd->proc == expireCommand || cmd->proc == pexpireCommand ||
        cmd->proc == expireatCommand) {
        /* Translate EXPIRE/PEXPIRE/EXPIREAT into PEXPIREAT */
        buf = catAppendOnlyExpireAtCommand(buf,cmd,argv[1],argv[2]);

@trevor211
Copy link
Contributor

@oranagra I think it makes sense to rewrite EXPIRE as EXPIREAT when propagating to AOF but do not rewrite when propagating to slaves. When propagating to AOF, we are sure using the same clock value, but when propagating to slaves, we do not have that guarantee.

@antirez
Copy link
Contributor

antirez commented Aug 1, 2018

@oranagra AOF rewrites EXPIRE as EXPIREAT since the very start, actually EXPIREAT exists because of that.

However I think like @trevor211 that while it is sane to do that for AOF, because we want the time to logically pass when the server is restarted, on the contrary on the slave we want them to honor the same high level contract of expires that the master has, that is, we want the time to be counted logically starting from the moment the write was received. This usually provides the most coherent behavior from the point of view of the external users, with TTLs that are similar from the POV of the external observer.

Of course the issue that @kingpeterpaule points in this issue is true, but it is a corner case, not the normal operations. And such corner case is not even particularly unexpected or surprising IMHO, because in the extreme case of a failover immediately after the stream was received what happens is not an inconsistency but just a jump back in time that is proportional to the replication delay that there is between the master and slave. Now in that case of a big delay in the replication link, jump back in time is always possible after a failover.

So after thinking at the different arguments, I still think it's not the case to fix this issue. Thank you a lot for informing this discussion!

@oranagra oranagra added the state:to-be-closed requesting the core team to close the issue label Aug 31, 2020
@oranagra
Copy link
Member

oranagra commented Sep 1, 2020

Closing this due to Salvatore's post above (disadvantages outweigh the advantages)

@oranagra oranagra closed this Sep 1, 2020
oranagra added a commit that referenced this pull request Jan 19, 2021
…#8357)

This commit adds tests to make sure that relative and absolute expire commands
are propagated as is to replicas and stop any future attempt to change that without
a proper discussion. see #8327 and #5171

Additionally it slightly improve the AOF test that tests the opposite (always
propagating absolute times), by covering more commands, and shaving 2
seconds from the test time.
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Mar 2, 2021
…redis#8357)

This commit adds tests to make sure that relative and absolute expire commands
are propagated as is to replicas and stop any future attempt to change that without
a proper discussion. see redis#8327 and redis#5171

Additionally it slightly improve the AOF test that tests the opposite (always
propagating absolute times), by covering more commands, and shaving 2
seconds from the test time.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

critical bug state:to-be-closed requesting the core team to close the issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants