Force listpack validation on load to prevent deferred assertions by jjuleslasarte · Pull Request #3721 · valkey-io/valkey

jjuleslasarte · 2026-05-14T18:31:55Z

Until now, deep santitization of listpack on load (RDB load, RESTORE) was configurable. That meant that we had to do validation on access and traversal of the listpack. This change:

Force sanitization of listpacks on load and import.
Deprecates the config sanitize-dump-payload and makes it a no-op.
Deprecates the ACL flags skip-sanitize-payload and sanitize-payload, ignoring them when loading old ACL files.
Removes validation asserts on access.

Benefits:

Prevents deferred assertions, which would happen if unsanitized listpacks were loaded and later accessed, leading to server shutdown.
Improves read performance.

Replaces: #399

Internal changes

Remove lpAssertValidEntry calls from lpNext, lpPrev, lpFirst, lpFind, and lpDeleteRangeWithEntry to eliminate redundant per-operation listpack validation. Safety is maintained by making deep_integrity_validation unconditional in rdbLoadObject, ensuring all data structures (listpack, intset, stream) are fully validated on every load path (RDB and RESTORE). This effectively makes sanitize-dump-payload a no-op.

Lightweight EOF position assertions from PR #2027 remain intact.

Benchmarks

Benchmark results (Graviton3, 200 clients, P=32, 8 threads):

  256-field hash HGET: +4.7% throughput (907K -> 950K rps)
  256-field hash HSET: +1.2% throughput (688K -> 696K rps)
  100-field hash HSET: +2.4% throughput (688K -> 704K rps)
  p99 latency (256-field HSET): -16% (11.15ms -> 9.38ms)
  RDB load (1M keys): +1% overhead

Benchmark details

CPU: ARM Graviton3
Cores: 64
Memory: 247 GB
Each benchmark: 5M requests, 200 clients, pipeline 32, 8 threads, 3 runs per configuration
hash-max-listpack-entries set to 512 to force listpack encoding for large hashes
IO threads: 1 (default, single-threaded mode)

Results

256-Field Hash — HSET (updating existing field in 10K pre-populated hashes)

Server	Run 1 (rps)	Run 2 (rps)	Run 3 (rps)	Average (rps)
BEFORE (with lpAssertValidEntry)	688,137	687,947	687,663	687,916
AFTER (lpAssertValidEntry removed)	712,555	687,758	687,663	695,992

Delta: +1.2% (AFTER faster)

256-Field Hash — HGET (reading field from 10K pre-populated hashes)

Server	Run 1 (rps)	Run 2 (rps)	Run 3 (rps)	Average (rps)
BEFORE (with lpAssertValidEntry)	907,276	906,947	906,783	907,002
AFTER (lpAssertValidEntry removed)	949,848	950,390	950,029	950,089

Delta: +4.7% (AFTER faster)

100-Field Hash — HSET

Server	Run 1 (rps)	Run 2 (rps)	Run 3 (rps)	Average (rps)
BEFORE (with lpAssertValidEntry)	687,663	688,042	688,137	687,947
AFTER (lpAssertValidEntry removed)	687,474	712,454	712,860	704,263

Delta: +2.4% (AFTER faster)

100-Field Hash — HGET

Server	Run 1 (rps)	Run 2 (rps)	Run 3 (rps)	Average (rps)
BEFORE (with lpAssertValidEntry)	950,570	950,209	950,209	950,329
AFTER (lpAssertValidEntry removed)	950,390	950,209	949,668	950,089

Delta: ~0% (no significant difference)

Small Hash — HSET (default 3-field hashes, baseline)

Server	Run 1 (rps)	Run 2 (rps)	Run 3 (rps)	Average (rps)
BEFORE (with lpAssertValidEntry)	643,583	643,666	664,894	650,714
AFTER (lpAssertValidEntry removed)	643,583	664,982	665,336	657,967

Delta: +1.1% (AFTER faster)

RDB Load (1M keys: 500K hashes, 300K sets, 200K strings)

Server	Run 1 (sec)	Run 2 (sec)	Run 3 (sec)	Average (sec)
BEFORE (no deep validation)	0.483	0.486	0.492	0.487
AFTER (with deep_integrity_validation)	0.516	0.476	0.484	0.492

Delta: +1.0% slower (AFTER slightly slower due to integrity validation during load)

Note: The AFTER binary enables deep_integrity_validation during RDB load, which performs the equivalent of lpAssertValidEntry checks only at load time.

flamegraphs

Perf Recording (10M HSET operations on 256-field hashes)

Metric	BEFORE	AFTER
Throughput during recording	687,711 rps	712,606 rps
p50 latency	9.167 ms	8.879 ms
p99 latency	11.151 ms	9.375 ms

BEFORE	AFTER

The AFTER flamegraph shows the elimination of lpAssertValidEntry from the hot path, which was previously called on every lpNext/lpPrev/lpSeek operation during hash field lookups and updates.

Notes

Read-heavy workloads (HGET) on large hashes see the biggest improvement: +4.7% on 256-field hashes. This is expected since HGET traverses the listpack to find the field, and lpAssertValidEntry was called at each step.
Write workloads (HSET) show modest improvement: +1-2.4% across configurations. The write path has more overhead beyond traversal (memory allocation, reallocation), so the relative impact of removing the assertion is smaller.
Small hashes show minimal difference: ~1% improvement. With only 3 fields, the listpack traversal is trivial and the assertion cost is amortized across other work
RDB load is essentially unchanged: The deep_integrity_validation flag adds ~1% overhead during load, which should be acceptable since it only runs once at startup and provides the same safety guarantees that lpAssertValidEntry previously provided on every operation.
Latency improvements are notable: The p99 latency dropped from 11.15ms to 9.38ms during the flamegraph recording (256-field HSET), a 16% reduction in tail latency.
The change is a net positive: Runtime performance improves across all hash sizes, with the largest gains on the workloads that matter most (large listpack-encoded hashes). Data integrity is preserved through validation at RDB load time.

coderabbitai · 2026-05-14T18:32:30Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Remove per-entry listpack assertions; make listpack integrity validation unconditional and deep at RDB/RESTORE/load time; remove sanitize-dump-payload config/ACL flags (deprecated no-ops); and update tests and fuzzers to expect RESTORE rejection for corrupt payloads and verify server liveness.

Changes

Validation Pipeline Refactor

Layer / File(s)	Summary
Listpack traversal & public API changes `src/listpack.c`, `src/listpack.h`, `src/unit/test_listpack.cpp`	Removed internal `lpAssertValidEntry()` and per-entry assertions from traversal/mutation paths; `lpFind` advances using `lpSkip()` and uses bounds `assert()`; `lpValidateIntegrity` API now always scans entries; unit tests and header updated.
Unconditional deep validation during RDB load `src/rdb.c`, `src/t_stream.c`, `src/stream.h`, `src/valkey-check-rdb.c`	`rdbLoadObject()` and `lpValidateIntegrityAndDups()` drop the `deep` parameter and run deep listpack integrity validation unconditionally; quicklist/set/zset/hash/stream/intset/PEL checks and sanitization counters are now unconditional and call sites updated to new signatures.
Remove sanitize-dump-payload config and ACL flags `src/server.h`, `src/config.c`, `src/acl.c`, `src/fuzzer_command_generator.c`, `valkey.conf`	Removed `SANITIZE_DUMP_*` macros and `server.sanitize_dump_payload` field; dropped `sanitize-payload`/`skip-sanitize-payload` from emitted ACL flags and defaults while accepting them as deprecated no-ops; moved config to deprecated list and updated enum handling and docs.
Consolidate fuzzers and update corrupt-dump tests `tests/integration/`, `tests/integration/corrupt-dump-fuzzer.tcl`, `tests/unit/`, `tests/integration/corrupt-dump.tcl`	Consolidated fuzzer test run (removed per-run sanitize param), removed per-test `sanitize-dump-payload` toggles across many corrupt-dump cases, standardized RESTORE failure + liveness checks and integrity-log assertions, and added a dedicated empty-keys corrupted-RDB test; updated unit/module ACL expectations.

Sequence Diagram (RDB load validation):

sequenceDiagram
  participant rdbLoad as rdbLoadObject()
  participant LPdup as lpValidateIntegrityAndDups()
  participant Loader as Quicklist/Set/Zset/Hash/Stream loaders
  rdbLoad->>LPdup: always invoke deep validation (pairs)
  Loader->>LPdup: call with updated signature
  rdbLoad->>Loader: proceed with load after validation

Estimated code review effort:
🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers:

zuiderkwast

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and specifically describes the main change: making listpack validation unconditional on load to eliminate runtime assertions.
Description check	✅ Passed	The description thoroughly explains the change, its rationale, internal mechanics, performance benefits, benchmarks, and security implications related to the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

zuiderkwast

Nice!

The main point for this change IMO is security.

Scenario:

A user has access to RESTORE for legitimate reasons, to migrate data from some other node using DUMP/RESTORE.
The client is compromized and the attacker sends a RESTORE with a corrupted listpack. It passes.
Later, on access, the server runs lpAssertValidEntry and hits the assert.
Result: segfault. Server is down. Denial of service.

Performance is a secondary thing. Thanks for the benchmarks!

We should also note that full sync and RESTORE get slower for users who had validation turned off before.

Please also:

Update config.c to move the config to the list of deprecated configs (so that we still accept it but it's a no-op).
Update valkey.conf the section about this config, possibly remove the section entirely or perhaps just keep one line saying that it's deprecated in Valkey 10 and has no effect.
Remove that ACL flag related to this, mentioned in the chat.

dvkashapov

As @zuiderkwast mentioned: its also important to deprecate config, mention that in valkey.conf and delete ACL user flag. Other than that looks good! Also we would need to update documentation to mention those changes

We should also note that full sync and RESTORE get slower for users who had validation turned off before.

I think we can skip this because of benchmark results: RDB load (1M keys): +1% overhead

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/corrupt-dump-fuzzer.tcl`:
- Around line 86-90: The FreeBSD branch still allows the fuzzer to run once by
setting min_cycles and min_duration to 1; change the FreeBSD handling (the
conditional that checks [exec uname] == "FreeBSD") so it skips execution
entirely instead of reducing to one run—e.g., exit or set a skip flag before the
fuzzer loop—or add an early return/abort when running on FreeBSD so the fuzzer
loop that uses min_cycles/min_duration is never invoked.
- Around line 183-191: When valgrind or ASan issues are detected by
find_valgrind_errors or sanitizer_errors_from_file the test only logs and
restarts; change the handling so the fuzzer run fails immediately for those
payloads by invoking the test failure path (e.g., call fail with a descriptive
message) in addition to setting report_and_restart and print_commands. Update
the block that checks valgrind_errors and asan_errors so that when either is
non-empty it calls fail (including printable_dump and the error details) to
ensure the test run is marked as failed rather than silently continuing.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: c84d6888-0ca4-47b0-960d-4f1024481c7f

📥 Commits

Reviewing files that changed from the base of the PR and between 5faaabb and 2e1cb98.

📒 Files selected for processing (11)

src/acl.c
src/config.c
src/fuzzer_command_generator.c
src/server.h
src/valkey-check-rdb.c
tests/integration/corrupt-dump-fuzzer.tcl
tests/integration/corrupt-dump.tcl
tests/integration/rdb.tcl
tests/unit/acl.tcl
tests/unit/moduleapi/usercall.tcl
valkey.conf

💤 Files with no reviewable changes (4)

src/valkey-check-rdb.c
src/fuzzer_command_generator.c
tests/integration/rdb.tcl
src/acl.c

✅ Files skipped from review due to trivial changes (1)

tests/unit/moduleapi/usercall.tcl

jjuleslasarte · 2026-05-15T16:09:02Z

@dvkashapov @zuiderkwast once this is approved, should i raise the corresponding docs PR?

zuiderkwast · 2026-05-15T20:02:46Z

The PR title is not very understandable for users, e.g. in release notes. Can you change it to describe behaviour instead of internal code? I mean that we sanitize on load, not on access and traversal.

I don't like to use the "semanic commit" title style. We don't use it normally so the few PRs that do this stick out.

Also can you mention anything about the security aspect in the title or description? That forcing sanitization at load time prevents invalid data from being loaded into memory and later hitting an assert.

jjuleslasarte · 2026-05-16T17:48:43Z

The PR title is not very understandable for users, e.g. in release notes. Can you change it to describe behaviour instead of internal code? I mean that we sanitize on load, not on access and traversal.

I don't like to use the "semanic commit" title style. We don't use it normally so the few PRs that do this stick out.

Also can you mention anything about the security aspect in the title or description? That forcing sanitization at load time prevents invalid data from being loaded into memory and later hitting an assert.

ACK, will do. Yeah, the semantic commit style is my bad, i had an LLM do it for me and they are really prone to it.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/rdb.c (1)
1900-1905: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the helper comment to match the new API.

Lines 1900-1904 still describe a removed deep parameter, so this now documents behavior the function no longer supports.

As per coding guidelines, "Document why code exists, not just what it does; document all functions in C code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rdb.c` around lines 1900 - 1905, Update the function comment for
lpValidateIntegrityAndDups to reflect the current API (remove any mention of the
removed `deep` parameter) and explain why the validator exists; describe what
the function validates now (header integrity and full listpack scanning
behavior), the meaning of the `pairs` parameter (0 => unique elements as set, 1
=> odd elements unique as key/value map), the expected inputs (`unsigned char
*lp`, `size_t size`) and what the return value indicates, and add a short note
about when callers should use this check (purpose/context of the validation).

🧹 Nitpick comments (1)

src/rdb.c (1)
1934-1938: Request @core-team review for this validation-policy change.

This moves corruption handling behavior in rdbLoadObject(), which is exactly the kind of rdb.c change that should get explicit architectural sign-off before merge.

As per coding guidelines, "Request @core-team architectural review for changes to cluster*.c, replication.c, rdb.c, or aof.c".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rdb.c` around lines 1934 - 1938, You changed corruption-handling behavior
by forcing deep_integrity_validation=1 in rdb.c (affects rdbLoadObject()); this
requires an explicit architectural review—update the PR and code with an
explicit request for `@core-team` review: add a short comment next to the
deep_integrity_validation assignment noting the behavioral change and the need
for core-team sign-off, and update the PR description to request `@core-team`
review and summarize the rationale and potential impact on replication/load
semantics so reviewers can evaluate the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/rdb.c`:
- Around line 1900-1905: Update the function comment for
lpValidateIntegrityAndDups to reflect the current API (remove any mention of the
removed `deep` parameter) and explain why the validator exists; describe what
the function validates now (header integrity and full listpack scanning
behavior), the meaning of the `pairs` parameter (0 => unique elements as set, 1
=> odd elements unique as key/value map), the expected inputs (`unsigned char
*lp`, `size_t size`) and what the return value indicates, and add a short note
about when callers should use this check (purpose/context of the validation).

---

Nitpick comments:
In `@src/rdb.c`:
- Around line 1934-1938: You changed corruption-handling behavior by forcing
deep_integrity_validation=1 in rdb.c (affects rdbLoadObject()); this requires an
explicit architectural review—update the PR and code with an explicit request
for `@core-team` review: add a short comment next to the deep_integrity_validation
assignment noting the behavioral change and the need for core-team sign-off, and
update the PR description to request `@core-team` review and summarize the
rationale and potential impact on replication/load semantics so reviewers can
evaluate the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 269010ad-3664-4aaa-a7b3-396868bacf25

📥 Commits

Reviewing files that changed from the base of the PR and between 2e1cb98 and 7345a31.

📒 Files selected for processing (9)

src/acl.c
src/listpack.c
src/listpack.h
src/rdb.c
src/stream.h
src/t_stream.c
src/unit/test_listpack.cpp
tests/integration/corrupt-dump-fuzzer.tcl
tests/unit/acl.tcl

🚧 Files skipped from review as they are similar to previous changes (2)

src/acl.c
tests/integration/corrupt-dump-fuzzer.tcl

Since validation is now always unconditional, the deep_integrity_validation variable (hardcoded to 1) and all its conditionals are dead code. Remove them and make the guarded operations unconditional. Addresses review feedback from zuiderkwast on PR valkey-io#3721. Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

codecov · 2026-05-26T18:26:49Z

Codecov Report

❌ Patch coverage is 96.77419% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.71%. Comparing base (93e84bd) to head (9fdb4e5).
⚠️ Report is 2 commits behind head on unstable.

Files with missing lines	Patch %	Lines
src/unit/test_listpack.cpp	66.66%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #3721      +/-   ##
============================================
+ Coverage     76.68%   76.71%   +0.02%     
============================================
  Files           162      162              
  Lines         80706    80679      -27     
============================================
- Hits          61892    61889       -3     
+ Misses        18814    18790      -24

Files with missing lines	Coverage Δ
src/acl.c	`92.67% <100.00%> (+0.08%)`	⬆️
src/config.c	`78.46% <ø> (+0.36%)`	⬆️
src/fuzzer_command_generator.c	`77.15% <ø> (+0.23%)`	⬆️
src/listpack.c	`92.86% <ø> (-0.27%)`	⬇️
src/rdb.c	`76.94% <100.00%> (-0.22%)`	⬇️
src/server.h	`100.00% <ø> (ø)`
src/t_stream.c	`94.45% <100.00%> (-0.01%)`	⬇️
src/valkey-check-rdb.c	`70.82% <ø> (-0.06%)`	⬇️
src/unit/test_listpack.cpp	`79.43% <66.66%> (ø)`

... and 18 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/unit/test_listpack.cpp (1)
943-950: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add a corrupt-payload regression for the new always-deep validator.

This still exercises only the happy path, so it won't catch a regression where lpValidateIntegrity() accepts a listpack with a valid header but a broken entry/backlen. A small negative test here would lock in the new contract.

As per coding guidelines src/unit/**/*.{c,cpp}: Data-structure and low-level logic changes should be covered by unit tests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/unit/test_listpack.cpp` around lines 943 - 950, Add a negative regression
case to TEST_F(ListpackTest, listpackLpValidateIntegrity) that mutates the
created listpack (created via createList()) to simulate a broken entry or
corrupt backlen (for example flip a byte in the payload or make the backlen
inconsistent) and then call lpValidateIntegrity(lp, lpBytes(lp), lpValidation,
&count) expecting a failure (return 0) before freeing with lpFree(lp); reference
the existing test and functions lpValidateIntegrity, lpBytes, lpValidation,
createList, and lpFree so the test asserts the validator rejects a listpack with
a valid header but corrupted entry/backlen.
src/config.c (1)
454-462: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep CONFIG SET sanitize-dump-payload as a deprecated no-op.

This table is only used while parsing startup config. With sanitize-dump-payload no longer registered in static_configs, runtime CONFIG SET sanitize-dump-payload ... now falls through lookupConfig() and returns "Unknown option", which is a compatibility break for existing automation. Please preserve a hidden/runtime no-op path as well.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/config.c` around lines 454 - 462, The startup parsing table
deprecated_configs no longer preserves a runtime no-op for
"sanitize-dump-payload", causing lookupConfig()/CONFIG SET to return "Unknown
option"; restore the entry so CONFIG SET sanitize-dump-payload remains accepted
as a deprecated no-op at runtime. Edit the deprecated_configs array to include
the "sanitize-dump-payload" entry (matching the format used for other deprecated
options), ensure static_configs can omit it but lookupConfig()/configSet path
still recognizes it as deprecated and performs no-op behavior, and keep any
related parsing/validation logic in config.c (referencing deprecated_configs,
lookupConfig, static_configs, and the CONFIG SET handling) consistent with other
deprecated entries.

🧹 Nitpick comments (2)

src/rdb.c (2)
1900-1913: ⚡ Quick win

Update the lpValidateIntegrityAndDups() contract comment.

The doc block still describes the removed deep argument, so it now documents behavior this function no longer supports.

As per coding guidelines src/**/*.{c,h}: Use comments for non-obvious behavior and rationale, not for restating code and Document why code exists, not just what it does; document all functions in C code.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rdb.c` around lines 1900 - 1913, Update the function docblock for
lpValidateIntegrityAndDups to remove the stale mention of the removed `deep`
parameter and instead accurately document the current contract: that it
validates the listpack header and entries via lpValidateIntegrity with
_lpEntryValidation, that `pairs` controls duplicate-checking semantics (0 = all
elements must be unique, 1 = odd elements must be unique as map keys), and that
it uses an internal hashtable to track seen fields and returns validation
status; include brief rationale for why duplicate checking is separated from
basic integrity validation. Reference lpValidateIntegrityAndDups, the `pairs`
parameter, lpValidateIntegrity, and _lpEntryValidation when editing the comment.
1905-1913: Please request @core-team review for these src/rdb.c load-path changes.

This PR changes RDB/RESTORE validation behavior in a file the repository explicitly flags for architectural review.

As per coding guidelines src/{cluster*.c,replication.c,rdb.c,aof.c}: Request @core-team architectural review for changes to cluster*.c, replication.c, rdb.c, or aof.c.

Also applies to: 2321-2323, 2453-2558, 2616-2617, 2858-2870
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rdb.c` around lines 1905 - 1913, This change touches core RDB load-path
validation in src/rdb.c (notably the lpValidateIntegrityAndDups function and
surrounding lpValidateIntegrity/_lpEntryValidation usage) and per project policy
requires an explicit `@core-team` architectural review; update the PR by adding a
clear request for `@core-team` review in the PR description and/or add a code
comment at the top of the modified region referencing the policy and tagging
`@core-team` so reviewers see this change to rdb.c, and ensure the other mentioned
ranges (around lines for lpValidateIntegrityAndDups and the 2321-2323,
2453-2558, 2616-2617, 2858-2870 edits) are similarly annotated/requested for
core-team review.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/config.c`:
- Around line 454-462: The startup parsing table deprecated_configs no longer
preserves a runtime no-op for "sanitize-dump-payload", causing
lookupConfig()/CONFIG SET to return "Unknown option"; restore the entry so
CONFIG SET sanitize-dump-payload remains accepted as a deprecated no-op at
runtime. Edit the deprecated_configs array to include the
"sanitize-dump-payload" entry (matching the format used for other deprecated
options), ensure static_configs can omit it but lookupConfig()/configSet path
still recognizes it as deprecated and performs no-op behavior, and keep any
related parsing/validation logic in config.c (referencing deprecated_configs,
lookupConfig, static_configs, and the CONFIG SET handling) consistent with other
deprecated entries.

In `@src/unit/test_listpack.cpp`:
- Around line 943-950: Add a negative regression case to TEST_F(ListpackTest,
listpackLpValidateIntegrity) that mutates the created listpack (created via
createList()) to simulate a broken entry or corrupt backlen (for example flip a
byte in the payload or make the backlen inconsistent) and then call
lpValidateIntegrity(lp, lpBytes(lp), lpValidation, &count) expecting a failure
(return 0) before freeing with lpFree(lp); reference the existing test and
functions lpValidateIntegrity, lpBytes, lpValidation, createList, and lpFree so
the test asserts the validator rejects a listpack with a valid header but
corrupted entry/backlen.

---

Nitpick comments:
In `@src/rdb.c`:
- Around line 1900-1913: Update the function docblock for
lpValidateIntegrityAndDups to remove the stale mention of the removed `deep`
parameter and instead accurately document the current contract: that it
validates the listpack header and entries via lpValidateIntegrity with
_lpEntryValidation, that `pairs` controls duplicate-checking semantics (0 = all
elements must be unique, 1 = odd elements must be unique as map keys), and that
it uses an internal hashtable to track seen fields and returns validation
status; include brief rationale for why duplicate checking is separated from
basic integrity validation. Reference lpValidateIntegrityAndDups, the `pairs`
parameter, lpValidateIntegrity, and _lpEntryValidation when editing the comment.
- Around line 1905-1913: This change touches core RDB load-path validation in
src/rdb.c (notably the lpValidateIntegrityAndDups function and surrounding
lpValidateIntegrity/_lpEntryValidation usage) and per project policy requires an
explicit `@core-team` architectural review; update the PR by adding a clear
request for `@core-team` review in the PR description and/or add a code comment at
the top of the modified region referencing the policy and tagging `@core-team` so
reviewers see this change to rdb.c, and ensure the other mentioned ranges
(around lines for lpValidateIntegrityAndDups and the 2321-2323, 2453-2558,
2616-2617, 2858-2870 edits) are similarly annotated/requested for core-team
review.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 712e1328-c295-49ba-bacc-d1c6cc48ab05

📥 Commits

Reviewing files that changed from the base of the PR and between 7345a31 and 2d342fe.

📒 Files selected for processing (18)

src/acl.c
src/config.c
src/fuzzer_command_generator.c
src/listpack.c
src/listpack.h
src/rdb.c
src/server.h
src/stream.h
src/t_stream.c
src/unit/test_listpack.cpp
src/valkey-check-rdb.c
tests/integration/corrupt-dump-fuzzer.tcl
tests/integration/corrupt-dump.tcl
tests/integration/rdb.tcl
tests/unit/acl.tcl
tests/unit/cluster/resharding.tcl
tests/unit/moduleapi/usercall.tcl
valkey.conf

💤 Files with no reviewable changes (3)

tests/unit/cluster/resharding.tcl
tests/integration/rdb.tcl
src/valkey-check-rdb.c

✅ Files skipped from review due to trivial changes (2)

tests/unit/moduleapi/usercall.tcl
valkey.conf

zuiderkwast

Looks good, thank you! Only a few non-blocking nits.

Remove lpAssertValidEntry calls from lpNext, lpPrev, lpFirst, lpFind, and lpDeleteRangeWithEntry to eliminate redundant per-operation listpack validation. Safety is maintained by making deep_integrity_validation unconditional in rdbLoadObject, ensuring all data structures (listpack, intset, stream) are fully validated on every load path (RDB and RESTORE). This effectively makes sanitize-dump-payload a no-op. Benchmark results (Graviton3, 200 clients, P=32, 8 threads): 256-field hash HGET: +4.7% throughput (907K -> 950K rps) 256-field hash HSET: +1.2% throughput (688K -> 696K rps) 100-field hash HSET: +2.4% throughput (688K -> 704K rps) p99 latency (256-field HSET): -16% (11.15ms -> 9.38ms) RDB load (1M keys): +1% overhead (acceptable) Lightweight EOF position assertions from PR valkey-io#2027 remain intact. Takes over: valkey-io#399 Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

Since deep_integrity_validation is now unconditional, the sanitize-dump-payload config and skip-sanitize-payload ACL flag have no effect. This commit: - Moves sanitize-dump-payload to deprecated_configs[] in config.c - Removes SANITIZE_DUMP_* macros and server.sanitize_dump_payload field - Removes skip-sanitize-payload/sanitize-payload ACL flags entirely - Updates valkey.conf with deprecation notice - Deletes duplicate unsanitized test variants that are now redundant - Cleans up corrupt-dump-fuzzer.tcl and rdb.tcl test references Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

…cated ACL flags - Remove the 'deep' parameter from lpValidateIntegrity() since validation is now always full (deep). Update all callers in rdb.c, t_stream.c, and unit tests. - Remove 'deep' parameter from lpValidateIntegrityAndDups() and streamValidateListpackIntegrity() for the same reason. - Accept 'skip-sanitize-payload' and 'sanitize-payload' as no-op ACL flags for backwards compatibility with Valkey 9 ACL files. These flags are parsed but not emitted, so they get silently dropped on next ACL save. Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

Address PR review feedback on the corrupt-dump-fuzzer integration test: - Skip the test entirely on FreeBSD instead of running a single short cycle. The previous workaround masked any signal the fuzzer might produce on that platform; honestly skipping it is a clearer signal than 'ran but found nothing'. - Track sanitizer (valgrind/asan) findings in a new stat_sanitizer_findings counter and assert it is zero at the end of the run. Previously, sanitizer hits were only logged and the server was restarted, so a real bug surfaced via ASan or Valgrind would not fail the test. Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

Since validation is now always unconditional, the deep_integrity_validation variable (hardcoded to 1) and all its conditionals are dead code. Remove them and make the guarded operations unconditional. Addresses review feedback from zuiderkwast on PR valkey-io#3721. Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

Remove the unused outer 'size_t bytes' declarations in lpNext() and lpFirst() that became dead code after lpAssertValidEntry was removed. Fixes build failure with -Werror (unused-variable warning) in CI. Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

zuiderkwast · 2026-05-27T17:13:36Z

@dvkashapov @zuiderkwast once this is approved, should i raise the corresponding docs PR?

Yes, but I don't know which docs are affected. ACL docs?

dvkashapov · 2026-05-28T06:51:31Z

Yes, we need to update ACL docs to mention that from next release those user flags will be no-op and same in config file

…key-io#3721) Until now, deep santitization of listpack on load (RDB load, RESTORE) was configurable. That meant that we had to do validation on access and traversal of the listpack. This change: - Force sanitization of listpacks on load and import. - Deprecates the config `sanitize-dump-payload` and makes it a no-op. - Deprecates the ACL flags `skip-sanitize-payload` and `sanitize-payload`, ignoring them when loading old ACL files. - Removes validation asserts on access. Benefits: - Prevents deferred assertions, which would happen if unsanitized listpacks were loaded and later accessed, leading to server shutdown. - Improves read performance. Replaces: valkey-io#399 Signed-off-by: jjuleslasarte <jules.lasarte@gmail.com>

As per valkey-io/valkey#3721 and valkey-io/valkey#3964, omit `alldbs` rules and drop mentions of `sanitize-payload` and `skip-sanitize-payload` user flags. Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>

github-actions Bot assigned jjuleslasarte May 14, 2026

zuiderkwast reviewed May 14, 2026

View reviewed changes

Comment thread tests/integration/corrupt-dump.tcl

dvkashapov reviewed May 15, 2026

View reviewed changes

Comment thread src/rdb.c Outdated

dvkashapov added this to Valkey 10 May 15, 2026

dvkashapov added the needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. label May 15, 2026

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

Comment thread tests/integration/corrupt-dump-fuzzer.tcl Outdated

Comment thread tests/integration/corrupt-dump-fuzzer.tcl

jjuleslasarte requested review from dvkashapov and zuiderkwast May 15, 2026 16:09

zuiderkwast reviewed May 15, 2026

View reviewed changes

Comment thread src/acl.c

madolson moved this to Todo in Valkey 10 May 18, 2026

coderabbitai Bot reviewed May 19, 2026

View reviewed changes

zuiderkwast reviewed May 22, 2026

View reviewed changes

Comment thread src/rdb.c Outdated

zuiderkwast changed the title ~~perf(listpack): Remove lpAssertValidEntry from traversal functions~~ Validate listpacks on load instead of on access to prevent deferred assertions May 22, 2026

zuiderkwast changed the title ~~Validate listpacks on load instead of on access to prevent deferred assertions~~ Force listpack validation on load to prevent deferred assertions May 22, 2026

zuiderkwast mentioned this pull request May 22, 2026

Document database-level ACL rules in valkey.conf and acl.c #3808

Merged

jjuleslasarte force-pushed the remove-lpAssertValidEntry branch from 7345a31 to ce2f9ff Compare May 26, 2026 17:27

jjuleslasarte force-pushed the remove-lpAssertValidEntry branch from ce2f9ff to 08a953f Compare May 26, 2026 18:04

jjuleslasarte requested a review from zuiderkwast May 26, 2026 19:55

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

zuiderkwast approved these changes May 27, 2026

View reviewed changes

Comment thread src/listpack.c Outdated

Comment thread src/listpack.c Outdated

Comment thread src/listpack.c Outdated

zuiderkwast added the release-notes This issue should get a line item in the release notes label May 27, 2026

zuiderkwast moved this from Todo to Needs Review in Valkey 10 May 27, 2026

jjuleslasarte force-pushed the remove-lpAssertValidEntry branch from 81ce181 to d7879ae Compare May 27, 2026 15:45

jjuleslasarte force-pushed the remove-lpAssertValidEntry branch from d7879ae to 9fdb4e5 Compare May 27, 2026 15:46

jjuleslasarte added 6 commits May 27, 2026 16:16

jjuleslasarte force-pushed the remove-lpAssertValidEntry branch from 9fdb4e5 to f42aa10 Compare May 27, 2026 16:17

zuiderkwast merged commit e4fdae4 into valkey-io:unstable May 27, 2026
60 of 61 checks passed

zuiderkwast mentioned this pull request May 28, 2026

Remove lpAssertValidEntry from listpack functions to achieve up to 10% improvement on HSET command. #399

Closed

coderabbitai Bot mentioned this pull request Jun 5, 2026

Harden stream validation on RDB load against crafted metadata #3922

Open

dvkashapov mentioned this pull request Jun 10, 2026

Omit alldbs rule and drop mentions of sanitize-payload user flags valkey-io/valkey-doc#448

Merged

coderabbitai Bot mentioned this pull request Jun 16, 2026

Add ACL role support #3967

Open

Uh oh!

Conversation

jjuleslasarte commented May 14, 2026 • edited by zuiderkwast Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Internal changes

Benchmarks

Benchmark details

Results

256-Field Hash — HSET (updating existing field in 10K pre-populated hashes)

256-Field Hash — HGET (reading field from 10K pre-populated hashes)

100-Field Hash — HSET

100-Field Hash — HGET

Small Hash — HSET (default 3-field hashes, baseline)

RDB Load (1M keys: 500K hashes, 300K sets, 200K strings)

flamegraphs

Perf Recording (10M HSET operations on 256-field hashes)

Notes

Uh oh!

coderabbitai Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Uh oh!

zuiderkwast left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dvkashapov left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jjuleslasarte commented May 15, 2026

Uh oh!

Uh oh!

zuiderkwast commented May 15, 2026

Uh oh!

jjuleslasarte commented May 16, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

zuiderkwast left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

zuiderkwast commented May 27, 2026

Uh oh!

dvkashapov commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jjuleslasarte commented May 14, 2026 •

edited by zuiderkwast

Loading

coderabbitai Bot commented May 14, 2026 •

edited

Loading

dvkashapov left a comment •

edited

Loading

codecov Bot commented May 26, 2026 •

edited

Loading