Skip to content

Conversation

@guybe7
Copy link
Collaborator

@guybe7 guybe7 commented Jun 21, 2023

Apart from adding the missing coverage, this PR also adds blockedBeforeSleep that gathers all block-related functions from beforeSleep

The order inside blockedBeforeSleep is different: now handleClientsBlockedOnKeys (which may unblock clients) is called before processUnblockedClients (which handles unblocked clients).
It makes sense to have this order.

There are no visible effects of the wrong ordering, except some cleanups of the now-unblocked client would have happen in the next beforeSleep (will now happen in the current one)

The reason we even got into it is because i triggers an assertion in logresreq.c (breaking the assumption that unblockClient is called before actually flushing the reply to the socket):
handleClientsBlockedOnKeys is called, then it calls moduleUnblockClientOnKey, which calls moduleUnblockClient, which adds the client to moduleUnblockedClients
back to beforeSleep, we call handleClientsWithPendingWritesUsingThreads, it writes the data of buf to the client, so client->bufpos became 0
On the next beforeSleep, we call moduleHandleBlockedClients, which calls unblockClient, which calls reqresAppendResponse, triggering the assert. (because the bufpos is 0) - see #12301 (comment)

@oranagra
Copy link
Member

@guybe7 can you add a description of the outcome of the bug this PR fixes (for plain users) to the top comment?
i.e. putting aside the reqres assertion, does it simply means the unblocked client will be handled in the next event loop (possibly waiting for cron trigger if there's no traffic)?
does it only affect modules with blocked clients being unblocked by timers, or also affects more common workloads?

@guybe7 guybe7 requested a review from oranagra June 22, 2023 13:39
@oranagra oranagra merged commit 3230199 into redis:unstable Jun 22, 2023
Comment on lines +1660 to +1664
/* Handle blocked clients.
* must be done before flushAppendOnlyFile, in case of appendfsync=always,
* since the unblocked clients may write data. */
blockedBeforeSleep();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see in below, we have this (line 1670):

    /* Call the Redis Cluster before sleep function. Note that this function
     * may change the state of Redis Cluster (from ok to fail or vice versa),
     * so it's a good idea to call it before serving the unblocked clients
     * later in this function. */
    if (server.cluster_enabled) clusterBeforeSleep();

will this (the order) be a problem?

Copy link
Contributor

@enjoy-binbin enjoy-binbin Jun 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see there is a commit that doing this: 2ecb5ed

i haven't looked into it yet, just pointing it out first.

The issue says
Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the beforeSleep() call in redis.c. It should be at the top, before processing un blocking clients. This is exactly the reason in the specific instance of the bug as reported, why the state was not updated in time before clients served.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!
i'm not certain i understand the problem, and it's a pity that there's no test, but let's move it anyway to be above the new blockedBeforeSleep().

i suppose that by doing that we're assuming that blockedBeforeSleep can't change cluster related things, otherwise it's a chicken-and-egg case.

@enjoy-binbin can you make a PR to fix this?
if you happen to be able to reproduce the original problem that's great, but if not, let's just re-order it ASAP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-order it in #12343. I'll see if there's a way to reproduce it to add a test

enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jun 25, 2023
The new blockedBeforeSleep was added in redis#12337, it breaks the order in 2ecb5ed.

This may be related to redis#2288, quoted from comment in redis#2288:
```
Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the
beforeSleep() call in redis.c. It should be at the top, before processing un blocking
clients. This is exactly the reason in the specific instance of the bug as reported,
why the state was not updated in time before clients served.
```
oranagra pushed a commit that referenced this pull request Jun 25, 2023
The new blockedBeforeSleep was added in #12337, it breaks the order in 2ecb5ed.

This may be related to #2288, quoted from comment in #2288:
```
Moreover the clusterBeforeSleep() call was misplaced at the end of the chain of the
beforeSleep() call in redis.c. It should be at the top, before processing un blocking
clients. This is exactly the reason in the specific instance of the bug as reported,
why the state was not updated in time before clients served.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants