Fix heap-use-after-free in ACL LOAD when client free is deferred#3800
Conversation
When ACL LOAD deletes a user, it calls freeClientOrCloseLater() on clients authenticated as that user. If freeClient() defers the free (because the client is protected or has pending IO), the client remains in server.clients with c->user still pointing to the old user object. After raxFreeWithCallback(old_users, ...) frees all old users, c->user becomes a dangling pointer. Any subsequent code that dereferences c->user on the deferred client (e.g. CLIENT LIST via catClientInfoString, or clientsCron sub-functions) triggers a heap-use-after-free. Fix by setting c->user = NULL before calling freeClientOrCloseLater(), so the pointer is scrubbed before the old user objects are freed. The existing NULL guard in catClientInfoString (client->user ? ... : "(superuser)") handles this safely. Added DEBUG PROTECT-CLIENT <id> subcommand to enable deterministic reproduction by forcing freeClient to defer via protectClient(). Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughClears a client's ChangesACL Load Client User Pointer Fix
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
dvkashapov
left a comment
There was a problem hiding this comment.
LGTM, I should have thought about that in #3182 xD
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/debug.c`:
- Around line 1072-1080: The DEBUG subcommand list doesn't include the new
"protect-client" subcommand; update the DEBUG HELP output to advertise it by
adding a help line for "protect-client" to the same help table/string used to
generate DEBUG HELP (look for the HELP handling in debug.c—the function/array
that emits the DEBUG subcommands list, e.g., the DEBUG help text or
debugCommand/help branch). Add a concise entry matching the style of existing
entries, e.g. "protect-client <id> -- protect a client from eviction/closing",
and ensure formatting and ordering are consistent with the other DEBUG help
lines.
- Around line 1072-1074: The current parse of the client id using
strtoull(objectGetVal(c->argv[2]), NULL, 10) can accept partial numeric strings
like "12x"; change it to use a char *endptr and errno checks: call strtoull with
&endptr, set errno=0 beforehand, then ensure endptr points to the string
terminator (no extra characters), the start is not equal to endptr (must parse
at least one digit), and errno is not ERANGE (overflow). If any check fails,
return an error response and do not call lookupClientByID; update the
"protect-client" branch that reads c->argv[2] accordingly.
In `@tests/unit/acl.tcl`:
- Around line 1140-1141: The test currently only checks that the CLIENT LIST
output (variable cl from r CLIENT LIST) is non-empty, which can pass even if the
protected client is missing; update the assertion to explicitly verify the
protected client id is present in cl (e.g., assert {[string match -nocase
"*$protected_client_id*" $cl]} or use [string first $protected_client_id $cl] >=
0) and include a clear error message like "expected protected client
$protected_client_id in CLIENT LIST"; replace the existing assert {[string
length $cl] > 0} with this explicit presence check referencing cl and the
protected client id variable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: e8c294db-3232-41a6-89d3-d9ee30aef147
📒 Files selected for processing (3)
src/acl.csrc/debug.ctests/unit/acl.tcl
enjoy-binbin
left a comment
There was a problem hiding this comment.
Makes sense to me, did not review the test code.
ba81baf to
2ca3e90
Compare
…rtion - Add PROTECT-CLIENT to DEBUG HELP output - Validate client ID with endptr/errno checks to reject partial numeric strings like "12x" - Strengthen test assertion to verify the protected client ID is present in CLIENT LIST output, not just that output is non-empty Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
2ca3e90 to
f8b51c7
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/acl.c`:
- Around line 2673-2674: Currently the code sets c->user = NULL before calling
freeClientOrCloseLater(c, 0), which momentarily makes the connection pass ACL
checks (NULL == allow-all); instead, rebind the client to the
unauthenticated/default ACL user (the same mechanism used in
ACLFreeUserAndKillClients) before calling freeClientOrCloseLater so the client
is not treated as privileged—i.e., replace c->user = NULL with code that assigns
the unauthenticated/default user (and handle refcounts if applicable) and then
call freeClientOrCloseLater(c, 0).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: aba235f7-73af-4cdf-840d-f6a610f11aec
📒 Files selected for processing (3)
src/acl.csrc/debug.ctests/unit/acl.tcl
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/unit/acl.tcl`:
- Line 1141: The test currently uses assert_match "*id=$cid*" which can match
IDs like "id=${cid}7"; update the pattern to require a trailing space so the
client id is matched as a full token (e.g. change the glob to "*id=$cid *" or an
equivalent regex with a word boundary) in the assert_match call that references
$cid to avoid false positives.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: d015ac0c-4f98-48aa-ae5d-62ff3a31719c
📒 Files selected for processing (2)
src/debug.ctests/unit/acl.tcl
As noted in review, setting c->user = NULL momentarily makes the client pass ACL checks as unrestricted. Use clientSetUser(c, DefaultUser, 0) instead, consistent with ACLFreeUserAndKillClients. Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Add trailing space to glob pattern to avoid false positive on client IDs that are prefixes of other IDs (e.g. id=5 matching id=57). Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #3800 +/- ##
============================================
- Coverage 76.68% 76.63% -0.06%
============================================
Files 162 162
Lines 80697 80710 +13
============================================
- Hits 61885 61854 -31
- Misses 18812 18856 +44
🚀 New features to boost your workflow:
|
### Bug When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. ### Crash sequence 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** ### Fix Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. ### Test Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
### Bug When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. ### Crash sequence 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** ### Fix Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. ### Test Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
### Bug When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. ### Crash sequence 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** ### Fix Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. ### Test Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
# Backport sweep for 9.1 Automated cherry-picks from PRs marked "To be backported". ## Applied | Source PR | Title | Detail | |---|---|---| | #3743 | Fix buffered_reply assert in HFE commands with module keyspace notifications | cherry-picked in a prior sweep | | #3766 | Fix flaky block_keyspace_notification test for HGETDEL notify race | cherry-picked in a prior sweep | | #3800 | Fix heap-use-after-free in ACL LOAD when client free is deferred | cherry-picked in a prior sweep | | #3723 | Fix double-finish and RESP reply violation in cluster slot migration | cherry-picked in a prior sweep | | #3872 | Redacting customer information when hide_user_data_from_log is true in rdb.c, networking.c, debug.c and t_hash | cherry-picked in a prior sweep | | #3846 | Fix use-after-free in VM_RegisterClusterMessageReceiver | cherry-picked in a prior sweep | | #3806 | Add ALL_DBS flag to CLUSTER FLUSHSLOT for database-level ACL | cherry-picked in a prior sweep | | #3847 | Harden SENTINEL commands and config rewrite against control-character injection | | | #3801 | Validate every DB clause in COPY against ACL db= permissions | | | #3851 | Replace AUTOMATION_PAT with Valkeyrie Bot GitHub App token | | | #3848 | Fix cluster AUX-field control-character and delimiter injection | | | #3544 | Revert "IO-Threads redesign cleanup work (#3544)" | cherry-picked in a prior sweep | | #3888 | Report exact dbid for COPY in ACL LOG when db= access is denied | conflicts resolved by Claude Code | | #3942 | Fix shard_id format specifier in UPDATE message log | | | #3941 | Avoid random() % 0 undefined behaviour when cluster-node-timeout < 30 | | --- *Generated by valkey-ci-agent using Claude Code.* --------- Signed-off-by: Binbin <binloveplay1314@qq.com> Signed-off-by: Ran Shidlansik <ranshid@amazon.com> Signed-off-by: chx9 <lovelypiska@outlook.com> Signed-off-by: zackcam <zackcam@amazon.com> Signed-off-by: Eran Ifrah <eifrah@amazon.com> Signed-off-by: Jules Lasarte <lasartej@amazon.com> Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com> Signed-off-by: akash kumar <akumdev@amazon.com> Co-authored-by: Binbin <binloveplay1314@qq.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com> Co-authored-by: lovelypiska <lovelypiska@outlook.com> Co-authored-by: zackcam <zackcam@amazon.com> Co-authored-by: eifrah-aws <eifrah@amazon.com> Co-authored-by: jjuleslasarte <jules.lasarte@gmail.com> Co-authored-by: Jules Lasarte <lasartej@amazon.com> Co-authored-by: Akash Kumar <45854686+akashkgit@users.noreply.github.com>
### Bug When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. ### Crash sequence 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** ### Fix Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. ### Test Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
When `ACL LOAD` deletes a user, it calls `freeClientOrCloseLater()` on clients authenticated as that user. If `freeClient()` defers the actual free (because the client has `flag.protected` set or has pending IO), the client remains in `server.clients` with `c->user` still pointing to the old user object. After `raxFreeWithCallback(old_users, ACLFreeUserVoid)` frees all old user objects, `c->user` becomes a dangling pointer. Any subsequent code that dereferences `c->user` on the deferred client triggers a heap-use-after-free. 1. Client A runs `ACL LOAD` 2. Client B is authenticated as a user that was removed from the ACL file 3. Client B has `flag.protected` set (e.g. has pending IO (IO threads) 4. `ACLLoadFromFile` → `freeClientOrCloseLater(B, 0)` → `freeClient(B)` → defers to `freeClientAsync` because B is protected 5. `raxFreeWithCallback(old_users, ...)` frees all old user objects including B's user 6. B is still in `server.clients` with `c->user` pointing to freed memory 7. `CLIENT LIST` or any path that calls `catClientInfoString` dereferences `c->user->name` → **UAF** Set `c->user` to default user before calling `freeClientOrCloseLater()` in `ACLLoadFromFile`. The existing NULL guard in `catClientInfoString` (`client->user ? client->user->name : "(superuser)"`) handles this safely. Added `DEBUG PROTECT-CLIENT <client-id>` subcommand that calls `protectClient()` on the target, forcing `freeClient` to defer to async. This enables deterministic reproduction without timing races. The test: 1. Creates a user, connects a client as that user 2. Protects the client via `DEBUG PROTECT-CLIENT` 3. Removes the user from the ACL file and runs `ACL LOAD` 4. Runs `CLIENT LIST` which dereferences `c->user->name` Without the fix, this crashes under ASAN with: ``` ERROR: AddressSanitizer: heap-use-after-free in catClientInfoString READ of size 8 freed by: raxRecursiveFree → raxFreeWithCallback → ACLLoadFromFile ``` --------- Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Bug
When
ACL LOADdeletes a user, it callsfreeClientOrCloseLater()on clients authenticated as that user. IffreeClient()defers the actual free (because the client hasflag.protectedset or has pending IO), the client remains inserver.clientswithc->userstill pointing to the old user object.After
raxFreeWithCallback(old_users, ACLFreeUserVoid)frees all old user objects,c->userbecomes a dangling pointer. Any subsequent code that dereferencesc->useron the deferred client triggers a heap-use-after-free.Crash sequence
ACL LOADflag.protectedset (e.g. has pending IO (IO threads)ACLLoadFromFile→freeClientOrCloseLater(B, 0)→freeClient(B)→ defers tofreeClientAsyncbecause B is protectedraxFreeWithCallback(old_users, ...)frees all old user objects including B's userserver.clientswithc->userpointing to freed memoryCLIENT LISTor any path that callscatClientInfoStringdereferencesc->user->name→ UAFFix
Set
c->userto default user before callingfreeClientOrCloseLater()inACLLoadFromFile. The existing NULL guard incatClientInfoString(client->user ? client->user->name : "(superuser)") handles this safely.Test
Added
DEBUG PROTECT-CLIENT <client-id>subcommand that callsprotectClient()on the target, forcingfreeClientto defer to async. This enables deterministic reproduction without timing races.The test:
DEBUG PROTECT-CLIENTACL LOADCLIENT LISTwhich dereferencesc->user->nameWithout the fix, this crashes under ASAN with: