Fix: Prevent routing reads to Redis slave nodes in loading state#3370
Fix: Prevent routing reads to Redis slave nodes in loading state#3370
Conversation
|
@ofekshenawa I am not sure (was not able to find information) if |
|
Update: |
ndyakov
left a comment
There was a problem hiding this comment.
This will work, but we should find a good way to test it. Let's add a note about it.
|
Won't this perform a PING in the critical path of execution of any command against a slave node? The LOADING error status should be caught lazily and cached in-memory, bounded by a penalty timeout, mirroring the implementation of We are observing more than a 2x latency regression on v9.9.0, which includes this commit, when slave routing is enabled. @ndyakov I think it would be a good idea to issue a patch release either with this commit reverted, or the behavior fixed forward as suggested above. |
|
@ndyakov Just like the comment above, in the project I'm working on we're pushing thousands of commands through a pipeline, and we've noticed that latency has more than doubled. This change really needs to be rolled back and handled differently, as sending a |
|
@LINKIWI , @ste93cry thank you for reporting this. The loaded state should / can be cached, you are both correct. @ste93cry for the time being, feel free to use a release before this patch, I will work on adding a cache and will release a new patch soon. Let's assume we add a |
|
Thanks for following up. Regarding when the cached loaded state should return to 0: In most cases/for most users, I suspect slaves should only be in the A separate follow up here could be to transparently fail over command execution to another available slave after observing |
|
@LINKIWI I think introducing a timeout will require a more testing to calculate a good default value of the timeout. Here is a naive approach without a timeout, let's continue the discussion there. I would prefer to have the naive approach merged and released and then, once we are the time we can find a better approach #3410 |
Co-authored-by: Nedyalko Dyakov <1547186+ndyakov@users.noreply.github.com>
Co-authored-by: Nedyalko Dyakov <1547186+ndyakov@users.noreply.github.com>

Problem
When a new slave node is added to a Redis cluster, it enters a loading state while it synchronizes data from its master. If the client has
ReadOnly=trueenabled, it may route read operations to this loading slave node, which can result in nil values or errors being returned to the application.closes #3222
Solution
This PR adds logic to detect when a slave node is in the loading state (by checking for LOADING errors) and temporarily excludes such nodes from the read distribution until they're ready to serve requests.