Skip to content

Conversation

@li-benson
Copy link
Contributor

The bug mentioned in this #13862 has been fixed.

Signed-off-by: li-benson <1260437731@qq.com>
@CLAassistant
Copy link

CLAassistant commented Mar 15, 2025

CLA assistant check
All committers have signed the CLA.

key = dictGetKey(de);
keyobj = createStringObject(key,sdslen(key));
if (allvolatile && server.masterhost && --maxtries == 0) {
if (allvolatile && (server.masterhost || isPausedActions(PAUSE_ACTION_EXPIRE)) && --maxtries == 0) {
Copy link
Collaborator

@sundb sundb Mar 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this fix complete fixes this dead loop, PAUSE_ACTION_EXPIRE just exacerbates the situation, it is still possible for a loop to occur in the master instance in the edge case, not sure if we also need to do the maxtries limit on the master instance and return empty when maxtries is reached.
ping @oranagra

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a known issue, in normal master, it'll delete the expired keys until none are left and it finds the non-expired one.
i think behaving as a replica in this case is sensible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it expected that the randomkey obtains an expired key?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wclmxxs yes, in this case we have to return a key that had expired.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wclmxxs yes, in this case we have to return a key that had expired.

I get it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wclmxxs its behavior is similar to replica, so I'm okay to return an expired key.

But it's not intuitive.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have to consider the other extreme case, if there is just one unexpired key that is not hit, then returning NULL may cause the user to think that the db is empty, but there is still one left.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wclmxxs its behavior is similar to replica, so I'm okay to return an expired key.

But it's not intuitive.

The function of the randomkey command is to return a random key. We should try our best to return a key, even if it has expired. If a nil is returned, it will confuse the user: clearly there are keys in Redis, but why is a nil returned? In my opinion, returning an expired key is more appropriate than returning a nil.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wclmxxs its behavior is similar to replica, so I'm okay to return an expired key.

But it's not intuitive.

The function of the randomkey command is to return a random key. We should try our best to return a key, even if it has expired. If a nil is returned, it will confuse the user: clearly there are keys in Redis, but why is a nil returned? In my opinion, returning an expired key is more appropriate than returning a nil.

It's intuitive to expire when the time is up and delete when the time is up, but now I think it's an implementation trade-off.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have to consider the other extreme case, if there is just one unexpired key that is not hit, then returning NULL may cause the user to think that the db is empty, but there is still one left.

ok i get it

@sundb sundb added this to Redis 8.2 Mar 17, 2025
@sundb sundb added the release-notes indication that this issue needs to be mentioned in the release notes label Mar 17, 2025
Signed-off-by: youngmore1024 <youngmore1024@outlook.com>
Signed-off-by: youngmore1024 <youngmore1024@outlook.com>
@sundb sundb merged commit 427c368 into redis:unstable Mar 20, 2025
18 checks passed
@github-project-automation github-project-automation bot moved this from Todo to Done in Redis 8.2 Mar 20, 2025
@sundb
Copy link
Collaborator

sundb commented Mar 24, 2025

@li-benson we still need a specific fix for 7.4 and 6.2, because PAUSE_ACTION_EXPIRE was introduced after 7.4, can you handle it? Prepare a PR that merges to 7.4 and then mark it as backport to 6.2.

@li-benson
Copy link
Contributor Author

@li-benson we still need a specific fix for 7.4 and 6.2, because PAUSE_ACTION_EXPIRE was introduced after 7.4, can you handle it? Prepare a PR that merges to 7.4 and then mark it as backport to 6.2.

OK, I will handle it tomorrow. Thanks for the reminder.

@youngmore1024
Copy link
Contributor

@li-benson we still need a specific fix for 7.4 and 6.2, because PAUSE_ACTION_EXPIRE was introduced after 7.4, can you handle it? Prepare a PR that merges to 7.4 and then mark it as backport to 6.2.

@sundb I'm sorry for submitting the PR so late. First of all, I found that the PAUSE_ACTION_EXPIRE has been supported since version 7.2. Additionally, there is an incompatibility in the coding style between the versions after 7.2 and those before 7.2. Therefore, I fixed this issue based on version 7.0 (pr:#13891), and this fix can be backported to version 6.2 , while the previous fix could be backported to version 7.2.

Comment on lines +372 to +376
wait_for_condition 50 100 {
[r randomkey] == "key"
} else {
fail "execute randomkey failed, caused by the infinite loop"
}
Copy link
Member

@oranagra oranagra Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this wait_for_condition isn't necessary, right?
in the past, a single execution of RANDOMKEY would have hang, and now it'll return "key" (but we'd also be ok if it returned {}.
i.e. in theory, this can be changed to: ?

Suggested change
wait_for_condition 50 100 {
[r randomkey] == "key"
} else {
fail "execute randomkey failed, caused by the infinite loop"
}
set key [r randomkey]
assert {$key == "key" || $key == ""}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the first randomkey will stuck Redis. and your solution is also reasonable.

YaacovHazan pushed a commit to YaacovHazan/redis that referenced this pull request Apr 22, 2025
…3863)

The bug mentioned in this
[redis#13862](redis#13862) has been fixed.

---------

Signed-off-by: li-benson <1260437731@qq.com>
Signed-off-by: youngmore1024 <youngmore1024@outlook.com>
Co-authored-by: youngmore1024 <youngmore1024@outlook.com>
YaacovHazan pushed a commit to YaacovHazan/redis that referenced this pull request Apr 22, 2025
…3863)

The bug mentioned in this
[redis#13862](redis#13862) has been fixed.

---------

Signed-off-by: li-benson <1260437731@qq.com>
Signed-off-by: youngmore1024 <youngmore1024@outlook.com>
Co-authored-by: youngmore1024 <youngmore1024@outlook.com>
@YaacovHazan YaacovHazan moved this from Todo to Done in Redis 7.4 Backport Apr 29, 2025
@YaacovHazan YaacovHazan moved this from Todo to Done in Redis 7.2 Backport Apr 29, 2025
@sundb
Copy link
Collaborator

sundb commented Jul 8, 2025

@YaacovHazan mark this as 6.2 backport, it will be conflict, please refer to #13893 to fix the conflict.

@sundb sundb added this to Redis 8.0 Aug 15, 2025
@sundb sundb removed this from Redis 8.2 Aug 15, 2025
@sundb sundb moved this from Todo to Done in Redis 8.0 Aug 15, 2025
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025
…3863)

The bug mentioned in this
[redis#13862](redis#13862) has been fixed.

---------

Signed-off-by: li-benson <1260437731@qq.com>
Signed-off-by: youngmore1024 <youngmore1024@outlook.com>
Co-authored-by: youngmore1024 <youngmore1024@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes indication that this issue needs to be mentioned in the release notes

Projects

Status: To Do
Status: Done

Development

Successfully merging this pull request may close these issues.

6 participants