-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Fix potential infinite loop of RANDOMKEY during client pause #13863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: li-benson <1260437731@qq.com>
| key = dictGetKey(de); | ||
| keyobj = createStringObject(key,sdslen(key)); | ||
| if (allvolatile && server.masterhost && --maxtries == 0) { | ||
| if (allvolatile && (server.masterhost || isPausedActions(PAUSE_ACTION_EXPIRE)) && --maxtries == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this fix complete fixes this dead loop, PAUSE_ACTION_EXPIRE just exacerbates the situation, it is still possible for a loop to occur in the master instance in the edge case, not sure if we also need to do the maxtries limit on the master instance and return empty when maxtries is reached.
ping @oranagra
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a known issue, in normal master, it'll delete the expired keys until none are left and it finds the non-expired one.
i think behaving as a replica in this case is sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it expected that the randomkey obtains an expired key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wclmxxs yes, in this case we have to return a key that had expired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wclmxxs yes, in this case we have to return a key that had expired.
I get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wclmxxs its behavior is similar to replica, so I'm okay to return an expired key.
But it's not intuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have to consider the other extreme case, if there is just one unexpired key that is not hit, then returning NULL may cause the user to think that the db is empty, but there is still one left.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wclmxxs its behavior is similar to replica, so I'm okay to return an expired key.
But it's not intuitive.
The function of the randomkey command is to return a random key. We should try our best to return a key, even if it has expired. If a nil is returned, it will confuse the user: clearly there are keys in Redis, but why is a nil returned? In my opinion, returning an expired key is more appropriate than returning a nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wclmxxs its behavior is similar to replica, so I'm okay to return an expired key.
But it's not intuitive.
The function of the randomkey command is to return a random key. We should try our best to return a key, even if it has expired. If a nil is returned, it will confuse the user: clearly there are keys in Redis, but why is a nil returned? In my opinion, returning an expired key is more appropriate than returning a nil.
It's intuitive to expire when the time is up and delete when the time is up, but now I think it's an implementation trade-off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have to consider the other extreme case, if there is just one unexpired key that is not hit, then returning NULL may cause the user to think that the db is empty, but there is still one left.
ok i get it
Signed-off-by: youngmore1024 <youngmore1024@outlook.com>
Signed-off-by: youngmore1024 <youngmore1024@outlook.com>
|
@li-benson we still need a specific fix for 7.4 and 6.2, because |
OK, I will handle it tomorrow. Thanks for the reminder. |
@sundb I'm sorry for submitting the PR so late. First of all, I found that the PAUSE_ACTION_EXPIRE has been supported since version 7.2. Additionally, there is an incompatibility in the coding style between the versions after 7.2 and those before 7.2. Therefore, I fixed this issue based on version 7.0 (pr:#13891), and this fix can be backported to version 6.2 , while the previous fix could be backported to version 7.2. |
| wait_for_condition 50 100 { | ||
| [r randomkey] == "key" | ||
| } else { | ||
| fail "execute randomkey failed, caused by the infinite loop" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this wait_for_condition isn't necessary, right?
in the past, a single execution of RANDOMKEY would have hang, and now it'll return "key" (but we'd also be ok if it returned {}.
i.e. in theory, this can be changed to: ?
| wait_for_condition 50 100 { | |
| [r randomkey] == "key" | |
| } else { | |
| fail "execute randomkey failed, caused by the infinite loop" | |
| } | |
| set key [r randomkey] | |
| assert {$key == "key" || $key == ""} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the first randomkey will stuck Redis. and your solution is also reasonable.
…3863) The bug mentioned in this [redis#13862](redis#13862) has been fixed. --------- Signed-off-by: li-benson <1260437731@qq.com> Signed-off-by: youngmore1024 <youngmore1024@outlook.com> Co-authored-by: youngmore1024 <youngmore1024@outlook.com>
…3863) The bug mentioned in this [redis#13862](redis#13862) has been fixed. --------- Signed-off-by: li-benson <1260437731@qq.com> Signed-off-by: youngmore1024 <youngmore1024@outlook.com> Co-authored-by: youngmore1024 <youngmore1024@outlook.com>
|
@YaacovHazan mark this as |
…3863) The bug mentioned in this [redis#13862](redis#13862) has been fixed. --------- Signed-off-by: li-benson <1260437731@qq.com> Signed-off-by: youngmore1024 <youngmore1024@outlook.com> Co-authored-by: youngmore1024 <youngmore1024@outlook.com>
The bug mentioned in this #13862 has been fixed.