-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Fix blocking commands timeout is reset due to re-processing command #13004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix blocking commands timeout is reset due to re-processing command #13004
Conversation
In redis#11012, we will reprocess command when client is unblocked on keys, In the XREADGROUP BLOCK scenario, because of the re-processing command, we will recalculate the block timeout, causing the blocking time to be reset. This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly let the command know that it is being re-processed, later in blockForKeys we will not reset the timeout. Fixes redis#12998.
|
I'm considering adding a test, such as checking the time-consuming in redis-cli.tcl |
why is it hard to measure it in TCL with |
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check if other blocking commands were affected by this bug.
i.e. WAITAOF, modules, BLPOP.
i suppose at least the list blocking commands can get unblocked and re-blocked similarly to XREAD, so we should mention them in the release notes.
|
BLPOP / BZPOPMIN have the same issue and i reproduce it in local. i think modules also have the issue, but i haven't found a test case to reproduce yet, maybe you have a idea? WAITAOF, i didn't think of a scenario where re-process would be triggered, do you have a idea? I will continue to think about it tomorrow and push the first |
|
i don't think we have to test all of them. |
… a value to c->bstate.timeout
|
There may be modules, but I haven't found any examples where we will re-process command (except blocking auth). Just in case, I also made a small change in the last commit, please take a look wait and waitaof don't seem to re-process command |
|
i think that in modules we never re-process the commands, we left it out of scope in #11012 @ranshid please ack. |
|
https://github.com/redis/redis/actions/runs/7709050607/job/21009405189#step:5:683 i see i missed something, that the ci is fails. if (keys) {
blockForKeys(c,BLOCKED_MODULE,keys,numkeys,timeout,flags&REDISMODULE_BLOCK_UNBLOCK_DELETED);
} else {
+ c->bstate.timeout = timeout;
blockClient(c,BLOCKED_MODULE);
}@oranagra |
|
@enjoy-binbin yes, please fix it in this way. |
yes, now it is flaky, i'll try to get it to stabilize and then i will issue the fix. |
This was introduced in redis#13004, missing this assignment. It causes timeout to be a random value (may be less than now), and then in `Unblock by timer` test, the client is unblocked and then it call timeout_callback, since the callback is NULL, the server will crash. The crash stack is: ``` beforesleep handleBlockedClientsTimeout checkBlockedClientTimeout unblockClientOnTimeout replyToBlockedClientTimedOut moduleBlockedClientTimedOut bc->timeout_callback(&ctx,(void**)c->argv,c->argc); ```
This was introduced in #13004, missing this assignment. It causes timeout to be a random value (may be less than now), and then in `Unblock by timer` test, the client is unblocked and then it call timeout_callback, since the callback is NULL, the server will crash. The crash stack is: ``` beforesleep handleBlockedClientsTimeout checkBlockedClientTimeout unblockClientOnTimeout replyToBlockedClientTimedOut moduleBlockedClientTimedOut -- the timeout_callback is NULL, invalidFunctionWasCalled bc->timeout_callback(&ctx,(void**)c->argv,c->argc); ```
…edis#13004) In redis#11012, we will reprocess command when client is unblocked on keys, in some blocking commands, for example, in the XREADGROUP BLOCK scenario, because of the re-processing command, we will recalculate the block timeout, causing the blocking time to be reset. This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly let the command know that it is being re-processed, later in blockForKeys we will not reset the timeout. Affected BLOCK cases: - list / zset / stream, added test cases for each. Unaffected cases: - module (never re-process the commands). - WAIT / WAITAOF (never re-process the commands). Fixes redis#12998.
This was introduced in redis#13004, missing this assignment. It causes timeout to be a random value (may be less than now), and then in `Unblock by timer` test, the client is unblocked and then it call timeout_callback, since the callback is NULL, the server will crash. The crash stack is: ``` beforesleep handleBlockedClientsTimeout checkBlockedClientTimeout unblockClientOnTimeout replyToBlockedClientTimedOut moduleBlockedClientTimedOut -- the timeout_callback is NULL, invalidFunctionWasCalled bc->timeout_callback(&ctx,(void**)c->argv,c->argc); ```
These tests have all failed in daily CI:
```
*** [err]: Blocking XREADGROUP for stream key that has clients blocked on stream - reprocessing command in tests/unit/type/stream-cgroups.tcl
Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)
*** [err]: BLPOP unblock but the key is expired and then block again - reprocessing command in tests/unit/type/list.tcl
Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)
*** [err]: BZPOPMIN unblock but the key is expired and then block again - reprocessing command in tests/unit/type/zset.tcl
Expected '1103' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)
```
Increase the range to avoid failures, tests was introduced in redis#13004.
…ues (#13053) These tests have all failed in daily CI: ``` *** [err]: Blocking XREADGROUP for stream key that has clients blocked on stream - reprocessing command in tests/unit/type/stream-cgroups.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BLPOP unblock but the key is expired and then block again - reprocessing command in tests/unit/type/list.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BZPOPMIN unblock but the key is expired and then block again - reprocessing command in tests/unit/type/zset.tcl Expected '1103' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) ``` Increase the range to avoid failures, and improve the comment to be clearer. tests was introduced in #13004.
…edis#13004) In redis#11012, we will reprocess command when client is unblocked on keys, in some blocking commands, for example, in the XREADGROUP BLOCK scenario, because of the re-processing command, we will recalculate the block timeout, causing the blocking time to be reset. This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly let the command know that it is being re-processed, later in blockForKeys we will not reset the timeout. Affected BLOCK cases: - list / zset / stream, added test cases for each. Unaffected cases: - module (never re-process the commands). - WAIT / WAITAOF (never re-process the commands). Fixes redis#12998. (cherry picked from commit 492021d)
This was introduced in redis#13004, missing this assignment. It causes timeout to be a random value (may be less than now), and then in `Unblock by timer` test, the client is unblocked and then it call timeout_callback, since the callback is NULL, the server will crash. The crash stack is: ``` beforesleep handleBlockedClientsTimeout checkBlockedClientTimeout unblockClientOnTimeout replyToBlockedClientTimedOut moduleBlockedClientTimedOut -- the timeout_callback is NULL, invalidFunctionWasCalled bc->timeout_callback(&ctx,(void**)c->argv,c->argc); ``` (cherry picked from commit 45a35a7)
…ues (redis#13053) These tests have all failed in daily CI: ``` *** [err]: Blocking XREADGROUP for stream key that has clients blocked on stream - reprocessing command in tests/unit/type/stream-cgroups.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BLPOP unblock but the key is expired and then block again - reprocessing command in tests/unit/type/list.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BZPOPMIN unblock but the key is expired and then block again - reprocessing command in tests/unit/type/zset.tcl Expected '1103' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) ``` Increase the range to avoid failures, and improve the comment to be clearer. tests was introduced in redis#13004. (cherry picked from commit 32f44da)
…13004) In #11012, we will reprocess command when client is unblocked on keys, in some blocking commands, for example, in the XREADGROUP BLOCK scenario, because of the re-processing command, we will recalculate the block timeout, causing the blocking time to be reset. This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly let the command know that it is being re-processed, later in blockForKeys we will not reset the timeout. Affected BLOCK cases: - list / zset / stream, added test cases for each. Unaffected cases: - module (never re-process the commands). - WAIT / WAITAOF (never re-process the commands). Fixes #12998. (cherry picked from commit 492021d)
This was introduced in #13004, missing this assignment. It causes timeout to be a random value (may be less than now), and then in `Unblock by timer` test, the client is unblocked and then it call timeout_callback, since the callback is NULL, the server will crash. The crash stack is: ``` beforesleep handleBlockedClientsTimeout checkBlockedClientTimeout unblockClientOnTimeout replyToBlockedClientTimedOut moduleBlockedClientTimedOut -- the timeout_callback is NULL, invalidFunctionWasCalled bc->timeout_callback(&ctx,(void**)c->argv,c->argc); ``` (cherry picked from commit 45a35a7)
…ues (#13053) These tests have all failed in daily CI: ``` *** [err]: Blocking XREADGROUP for stream key that has clients blocked on stream - reprocessing command in tests/unit/type/stream-cgroups.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BLPOP unblock but the key is expired and then block again - reprocessing command in tests/unit/type/list.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BZPOPMIN unblock but the key is expired and then block again - reprocessing command in tests/unit/type/zset.tcl Expected '1103' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) ``` Increase the range to avoid failures, and improve the comment to be clearer. tests was introduced in #13004. (cherry picked from commit 32f44da)
…edis#13004) In redis#11012, we will reprocess command when client is unblocked on keys, in some blocking commands, for example, in the XREADGROUP BLOCK scenario, because of the re-processing command, we will recalculate the block timeout, causing the blocking time to be reset. This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly let the command know that it is being re-processed, later in blockForKeys we will not reset the timeout. Affected BLOCK cases: - list / zset / stream, added test cases for each. Unaffected cases: - module (never re-process the commands). - WAIT / WAITAOF (never re-process the commands). Fixes redis#12998.
This was introduced in redis#13004, missing this assignment. It causes timeout to be a random value (may be less than now), and then in `Unblock by timer` test, the client is unblocked and then it call timeout_callback, since the callback is NULL, the server will crash. The crash stack is: ``` beforesleep handleBlockedClientsTimeout checkBlockedClientTimeout unblockClientOnTimeout replyToBlockedClientTimedOut moduleBlockedClientTimedOut -- the timeout_callback is NULL, invalidFunctionWasCalled bc->timeout_callback(&ctx,(void**)c->argv,c->argc); ```
…ues (redis#13053) These tests have all failed in daily CI: ``` *** [err]: Blocking XREADGROUP for stream key that has clients blocked on stream - reprocessing command in tests/unit/type/stream-cgroups.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BLPOP unblock but the key is expired and then block again - reprocessing command in tests/unit/type/list.tcl Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) *** [err]: BZPOPMIN unblock but the key is expired and then block again - reprocessing command in tests/unit/type/zset.tcl Expected '1103' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test) ``` Increase the range to avoid failures, and improve the comment to be clearer. tests was introduced in redis#13004.
In #11012, we will reprocess command when client is unblocked on keys,
in some blocking commands, for example, in the XREADGROUP BLOCK scenario,
because of the re-processing command, we will recalculate the block timeout,
causing the blocking time to be reset.
This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly
let the command know that it is being re-processed, later in blockForKeys
we will not reset the timeout.
Affected BLOCK cases:
Unaffected cases:
Fixes #12998.