-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Modules: Unblock from within a timer coverage #12337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@guybe7 can you add a description of the outcome of the bug this PR fixes (for plain users) to the top comment? |
| /* Handle blocked clients. | ||
| * must be done before flushAppendOnlyFile, in case of appendfsync=always, | ||
| * since the unblocked clients may write data. */ | ||
| blockedBeforeSleep(); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see in below, we have this (line 1670):
/* Call the Redis Cluster before sleep function. Note that this function
* may change the state of Redis Cluster (from ok to fail or vice versa),
* so it's a good idea to call it before serving the unblocked clients
* later in this function. */
if (server.cluster_enabled) clusterBeforeSleep();
will this (the order) be a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see there is a commit that doing this: 2ecb5ed
i haven't looked into it yet, just pointing it out first.
The issue says
Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the beforeSleep() call in redis.c. It should be at the top, before processing un blocking clients. This is exactly the reason in the specific instance of the bug as reported, why the state was not updated in time before clients served.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
i'm not certain i understand the problem, and it's a pity that there's no test, but let's move it anyway to be above the new blockedBeforeSleep().
i suppose that by doing that we're assuming that blockedBeforeSleep can't change cluster related things, otherwise it's a chicken-and-egg case.
@enjoy-binbin can you make a PR to fix this?
if you happen to be able to reproduce the original problem that's great, but if not, let's just re-order it ASAP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-order it in #12343. I'll see if there's a way to reproduce it to add a test
The new blockedBeforeSleep was added in redis#12337, it breaks the order in 2ecb5ed. This may be related to redis#2288, quoted from comment in redis#2288: ``` Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the beforeSleep() call in redis.c. It should be at the top, before processing un blocking clients. This is exactly the reason in the specific instance of the bug as reported, why the state was not updated in time before clients served. ```
The new blockedBeforeSleep was added in #12337, it breaks the order in 2ecb5ed. This may be related to #2288, quoted from comment in #2288: ``` Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the beforeSleep() call in redis.c. It should be at the top, before processing un blocking clients. This is exactly the reason in the specific instance of the bug as reported, why the state was not updated in time before clients served. ```
Apart from adding the missing coverage, this PR also adds
blockedBeforeSleepthat gathers all block-related functions frombeforeSleepThe order inside
blockedBeforeSleepis different: nowhandleClientsBlockedOnKeys(which may unblock clients) is called beforeprocessUnblockedClients(which handles unblocked clients).It makes sense to have this order.
There are no visible effects of the wrong ordering, except some cleanups of the now-unblocked client would have happen in the next
beforeSleep(will now happen in the current one)The reason we even got into it is because i triggers an assertion in logresreq.c (breaking the assumption that
unblockClientis called before actually flushing the reply to the socket):handleClientsBlockedOnKeysis called, then it callsmoduleUnblockClientOnKey, which callsmoduleUnblockClient, which adds the client tomoduleUnblockedClientsback to
beforeSleep, we callhandleClientsWithPendingWritesUsingThreads, it writes the data of buf to the client, soclient->bufposbecame 0On the next
beforeSleep, we callmoduleHandleBlockedClients, which callsunblockClient, which callsreqresAppendResponse, triggering the assert. (because thebufposis 0) - see #12301 (comment)