Extend CAPA REDIRECT to cover keyless commands on cluster replicas#3505
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3505 +/- ##
============================================
+ Coverage 76.62% 76.71% +0.08%
============================================
Files 162 162
Lines 80674 80689 +15
============================================
+ Hits 61813 61897 +84
+ Misses 18861 18792 -69
🚀 New features to boost your workflow:
|
a6a9955 to
d7600b3
Compare
ranshid
left a comment
There was a problem hiding this comment.
Looks good.
The PR body still references the old name redirect-keyless and says "MOVED response". It should be updated to reflect the current implementation (primary-read, -REDIRECT).
ac63eee to
608dbaf
Compare
608dbaf to
9c9c3fc
Compare
9c9c3fc to
4ae9198
Compare
📝 WalkthroughWalkthroughReplicas now detect keyless commands from clients advertising CAPA REDIRECT and, when not READONLY, resolve the primary and reply with ChangesKeyless read redirection on cluster replicas
Sequence DiagramsequenceDiagram
participant Client
participant Replica
participant Primary
Client->>Replica: keyless command (or EXEC with no keys) with CAPA REDIRECT
Replica->>Replica: processCommand -> detect is_keyless / is_keyless_exec
Replica->>Replica: check is_replica && client_has_redirect_capa && !client_readonly
Replica->>Replica: resolve primary endpoint
Replica->>Client: -REDIRECT <primary-endpoint> (discard/flag EXEC as appropriate)
Replica->>ModuleSystem: fire command rejected event "-REDIRECT"
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/server.c`:
- Around line 4464-4466: The condition uses is_read_command (true if any queued
subcommand is readonly), causing mixed all-keyless EXEC blocks with write
commands to be treated as readonly and redirected; update the logic so we only
take the redirect path when every queued subcommand is readonly. Concretely, add
or compute a flag (e.g., is_all_readonly) that checks all queued commands for
CMD_READONLY and replace is_read_command with that flag in the if that
references is_keyless_exec, server.cluster_enabled, obey_client, (c->capa &
CLIENT_CAPA_REDIRECT) and !c->flag.readonly so only truly all-readonly keyless
EXECs hit this branch.
In `@tests/unit/cluster/replica-redirect.tcl`:
- Around line 60-63: The tests in the test block that call
valkey_deferring_client 0 assume node roles from earlier tests; make each test
self-contained by explicitly setting and/or asserting node roles at the start
(for example, call the existing helper that promotes/demotes nodes or add a
small helper proc like ensure_node_is_replica(nodeId) and
ensure_node_is_primary(nodeId)) before using valkey_deferring_client, and update
all affected test cases (the blocks around valkey_deferring_client 0 and similar
calls) to invoke that helper so each test establishes the replica/primary state
it needs rather than relying on prior test order.
- Around line 65-67: The test currently assumes SCAN returns cursor "0" by
asserting assert_match "0 *" $reply; change this to verify response shape rather
than a fixed cursor by reading the reply from $rd (the result of the SCAN
command) and asserting it matches a generic cursor+data pattern or is not an
error (e.g., assert that $reply begins with a numeric cursor followed by a space
or that it doesn't contain "ERR"), updating the assertion call (replace the
assert_match on "0 *" $reply) so the check is resilient to non-zero cursors.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: af539798-df3e-4e38-bb9c-96582fc42132
📒 Files selected for processing (4)
src/networking.csrc/server.ctests/unit/cluster/replica-redirect.tcltests/unit/introspection.tcl
4ae9198 to
1ce5723
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/server.c`:
- Around line 4465-4466: The redirect condition currently includes
is_write_command and thus redirects keyless write commands; remove the write arm
so only read commands are redirected and ensure EXEC stays constrained by the
"all queued commands are readonly" rule. Concretely, change the if that
references server.cluster_enabled, obey_client, is_keyless, is_keyless_exec,
is_read_command, is_write_command, (c->capa & CLIENT_CAPA_REDIRECT) and
!c->flag.readonly to drop the is_write_command check (i.e., only allow
is_read_command to trigger redirect) and keep any EXEC handling gated by the
existing "all queued commands are readonly" check for is_keyless_exec. Ensure no
other code paths still redirect keyless write commands.
- Around line 4441-4442: The branch that computes is_keyless using static
command metadata (c->cmd->flags & CMD_MOVABLE_KEYS, c->cmd->key_specs_num,
c->cmd->proc) is incorrect; change it to determine keylessness from the prepared
request flags/results instead (use the prepare-time READ_FLAGS_NO_KEYS and, if
needed, check c->slot == -1) so runtime-resolved zero-key executions are treated
as keyless; update the condition where is_keyless is used (the cluster_enabled
&& !obey_client && !is_keyless branch) to rely on the new prepare-time check and
remove/replace references to CMD_MOVABLE_KEYS, key_specs_num and execCommand in
this logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: cdfd86e8-73f9-4bab-9411-fc02a3aed23b
📒 Files selected for processing (4)
src/networking.csrc/server.ctests/unit/cluster/replica-redirect.tcltests/unit/introspection.tcl
✅ Files skipped from review due to trivial changes (2)
- tests/unit/introspection.tcl
- src/networking.c
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/cluster/replica-redirect.tcl
…eadonly replicas Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
Signed-off-by: Yana Molodetsky <yamolodu@amazon.com>
1ce5723 to
fe0f56c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/server.c`:
- Around line 4466-4482: The redirect branch accidentally applies to keyless
subcommands inside MULTI before they are queued; change the guard so it does not
trigger for clients currently in a MULTI transaction (i.e. check client MULTI
state) so queued MULTI subcommands return QUEUED as intended; specifically,
update the big if that uses server.cluster_enabled / obey_client / is_keyless /
is_keyless_exec / is_read_command / is_write_command / (c->capa &
CLIENT_CAPA_REDIRECT) / !c->flag.readonly to also require the client is not in
MULTI (e.g. add a condition like !(c->flags & CLIENT_MULTI) or use the existing
clientInMulti(c) helper) so that queueMultiCommand() can handle MULTI semantics,
leaving the existing calls to discardTransaction(), flagTransaction(),
clusterNodeClientPort(), addReplyErrorSds(), moduleFireCommandRejectedEvent()
intact for non-MULTI cases.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 44e19ab9-f6cd-4f00-aa11-4bdc4a7e760f
📒 Files selected for processing (4)
src/networking.csrc/server.ctests/unit/cluster/replica-redirect.tcltests/unit/introspection.tcl
✅ Files skipped from review due to trivial changes (2)
- tests/unit/introspection.tcl
- src/networking.c
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/cluster/replica-redirect.tcl
Co-authored-by: Ran Shidlansik <ranshid@amazon.com> Signed-off-by: Yana Molodetsky <59420437+yanamolo@users.noreply.github.com>
When a replica receives a keyless command from a client that has negotiated
CAPA REDIRECTand has not sentREADONLY, redirect it to the primary with a-REDIRECTresponse.Previously, keyless commands like
SCAN, DBSIZE, RANDOMKEYalways executed locally on replicas, and keyless writes like FLUSHDB, FLUSHALL returned -READONLY error with no way for clients to discover the primary.This reuses the existing
CAPA REDIRECTcapability. The capa signals that the client can handle -REDIRECT responses — it does not prescribe when redirects are sent. Clients that send READONLY opt into local replica execution and are not redirected.Only commands with CMD_READONLY or CMD_WRITE are redirected, so connection/admin commands (PING, READONLY, CLIENT) remain unaffected. EXEC with all-keyless queued commands (c->slot == -1) is treated as
keyless and the transaction is discarded.
Closes #2991.