Inline compression design (initial draft) by ikolomi · Pull Request #1 · ikolomi/valkey

ikolomi · 2026-04-27T14:07:08Z

No description provided.

Signed-off-by: ikolomi <ikolomin@amazon.com>

Signed-off-by: Gilboab <97948000+GilboaAWS@users.noreply.github.com>

Update architecture.md - Fix Mermaid Parsing error

ikolomi · 2026-05-11T10:59:11Z

Design walkthrough complete — 31/31 review threads addressed

All 31 inline review threads on this PR have been resolved via a sequential design walkthrough completed 2026-05-10. Per-thread replies summarizing each decision have been posted on the original threads; GitHub resolution state is now 31/31 resolved.

Full audit trail: DESIGN_TODO.md — one entry per thread with status, decision, resolved_by, permalink.

Substantive design changes merged during the walkthrough

Defaults revised: compression-max-value-size default lowered 1 MiB → 128 KiB (Thread #1)
Replication: full-sync RDB is always uncompressed in v1; disk RDB still compressed; wire-level compression negotiation (REPLCONF compression) deferred to v2 (Threads #2, #31); new §2.6 R2.6.8
Config surface: §2.12 split into 5 Primary + 11 Advanced knobs (Thread #3)
Snapshot safety: SDS immutability enforced via existing dbUnshareStringValue COW + merge-blocker audit + new compression-cow-invariant.tcl test; R2.4.4–R2.4.6 + §7.2 (Thread #4)
Benchmarking: new §7.5 compression-aware benchmark suite extending valkey-benchmark (Thread #5)
Eligibility:
- EMBSTR dropped from eligibility (6 locations) (Threads #17, #21)
- Decoupled from maxmemory-policy: universal write_age + idle_seconds checks; LFU-freq guard only when LFU active; knob renamed compression-lru-idle-seconds → compression-min-idle-seconds (Threads #18, #19)
Retry guard: new incompressibleKeys{} side hashtable, dict-ID scoped with time fallback (Thread #20)
Scope clarifications: io-threads explicitly rejected for decompression in v1, new Appendix §C.7 (Thread #25); "block the client" clarified as BLPOP-style (Thread #26)
Training flow rewritten (Threads #29, #30):
- Main-thread iteration + copy into contiguous buffer (spliced across serverCron ticks)
- Bio runs ZDICT_trainFromBuffer on immutable main-thread-owned bytes
- Bio never touches robj, kvstore, or refcounts
- Dropped incorrect "zero copies" claim; transient ~10–16 MiB training buffer only during infrequent training runs
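
A minimal sketch of the sample-collection step in the rewritten training flow. The type `trainingBuffer` and its helpers are hypothetical names, not taken from the design docs. The sketch copies each sampled value into one contiguous byte buffer and records a size per sample, which is the input layout that `ZDICT_trainFromBuffer` takes (`samplesBuffer`, `samplesSizes`, `nbSamples`). The training call itself appears only in a comment so the block compiles without libzstd.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: all samples back to back, plus one size per sample. */
typedef struct trainingBuffer {
    unsigned char *bytes;  /* contiguous sample bytes */
    size_t used;           /* bytes written so far */
    size_t capacity;       /* hard cap on the transient buffer */
    size_t *sizes;         /* sizes[i] = length of sample i */
    unsigned count;        /* number of samples stored */
    unsigned max_samples;  /* capacity of sizes[] */
} trainingBuffer;

/* Returns 0 on success, -1 on allocation failure. */
int trainingBufferInit(trainingBuffer *tb, size_t capacity, unsigned max_samples) {
    tb->bytes = malloc(capacity);
    tb->sizes = malloc(sizeof(size_t) * max_samples);
    tb->used = 0;
    tb->capacity = capacity;
    tb->count = 0;
    tb->max_samples = max_samples;
    if (tb->bytes == NULL || tb->sizes == NULL) {
        free(tb->bytes);
        free(tb->sizes);
        tb->bytes = NULL;
        tb->sizes = NULL;
        return -1;
    }
    return 0;
}

/* Main-thread side: copy one sample. Returns 1 if stored, 0 if it does not fit. */
int trainingBufferAppend(trainingBuffer *tb, const void *sample, size_t len) {
    assert(tb->bytes != NULL && tb->sizes != NULL);
    if (tb->count == tb->max_samples) return 0;
    if (len > tb->capacity - tb->used) return 0;
    memcpy(tb->bytes + tb->used, sample, len);
    tb->sizes[tb->count++] = len;
    tb->used += len;
    return 1;
}

/* After the last append the buffer is treated as immutable and handed to bio,
 * which would call (not done here, requires libzstd):
 *   ZDICT_trainFromBuffer(dict, dictCapacity, tb->bytes, tb->sizes, tb->count);
 */
void trainingBufferFree(trainingBuffer *tb) {
    free(tb->bytes);
    free(tb->sizes);
    tb->bytes = NULL;
    tb->sizes = NULL;
}
```

Because the background thread only ever sees `bytes` and `sizes`, it has no reason to touch `robj`, kvstore, or refcounts, which is the separation the bullets above describe.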

Thread-by-thread rationale is in DESIGN_TODO.md.

Tooling: the walkthrough used tools/fetch-pr-comments.sh + tools/normalize-pr-comments.py to curate threads; tools/post-pr-replies.py to round-trip replies and resolve threads via the GitHub API.

@ikolomi

Walk through 31 review threads sequentially; integrate all decisions into design artifacts. 22 threads were @ikolomi self-review, 9 from @GilboaAWS.

Substantive design changes:
- Default compression-max-value-size lowered 1 MiB -> 128 KiB
- Full-sync replication RDB always uncompressed in v1 (new R2.6.8); disk RDB still compressed; wire negotiation deferred to v2
- Config surface split into 5 primary + 11 advanced knobs
- SDS immutability invariant formalized via existing dbUnshareStringValue COW discipline; added R2.4.4-R2.4.6 audit + compression-cow-invariant.tcl as a merge blocker
- New compression-aware benchmark suite (sect. 7.5) extending valkey-benchmark with --key-distribution, --value-size-distribution, --value-data and six canonical scenarios
- EMBSTR dropped from eligibility in 6 locations
- Hotness checks decoupled from maxmemory-policy: universal write-age and idle-seconds gates; LFU-freq only when LFU active; rename compression-lru-idle-seconds -> compression-min-idle-seconds
- Retry-guard scoped by dict ID via new incompressibleKeys{} hashtable
- Training flow rewritten: main-thread iteration copies samples into a contiguous buffer; bio runs ZDICT_trainFromBuffer on immutable main-thread-owned bytes; bio never touches robj/kvstore/refcounts
- Appendix C.7 added: io-threads for decompression rejected in v1
- 'Block the client' clarified as BLPOP-style (main thread free)

New artifacts:
- implementation/plan.md: phased parallel-ownership implementation plan (7 subsystems, ~11 weeks with @ikolomi and @GilboaAWS in parallel)
- summary.md: feature summary for reviewers
- DESIGN_TODO.md: per-thread audit trail with status/decision/resolved_by
- pr-feedback.json: machine-readable sidecar
- tools/fetch-pr-comments.sh + normalize-pr-comments.py + post-pr-replies.py: walkthrough tooling (GitHub REST + GraphQL, paginated, cached)

GitHub round-trip complete: per-thread replies posted and all 31 review threads resolved; top-level PR comment issuecomment-4420021714 links to DESIGN_TODO.md.

Addresses 5 of 6 review comments on the QSBR design. Comment #6 (`compressionJob.key` extra-lookup concern) is explicitly deferred to a follow-up PR per reviewer guidance.

Comment #1 (line 428) and #5 (line 544): drop language-comparison framing

Removed all references to Rust / `Arc<T>` / "memory-safe languages" / `shared_ptr` from §4.4 intro, the "Why QSBR" bullet list, and the §4.6 "Why the worker loads the active dict itself" paragraph. The rationale now stands on its own technical merit (decoupling the registry from worker hot paths; minimal worker contract; safe-directional failure modes) rather than via comparison to another language's type system. C with explicit protocols is the right tool for this problem; the comparison added rhetorical weight without adding signal.

Comment #2 (line 326): duplication with R2.11.4

§3.3 Separation invariants restated the worker contract that R2.11.4 already specifies authoritatively. Slimmed the §3.3 bullet to a one-liner that points at §2.11 R2.11.4 and §4.4. Eliminates drift risk between the two places.

Comment #3 (line 439): bound the retiring list, block on cap

Added new step 7 to the QSBR section explaining the cap interaction with R2.3.3. The retiring list is a subset of `dicts[]`, capped at `compression-dict-max-versions`. When grace-barrier draining cannot keep up (worker starvation, persistent `frame_refs > 0`), the cap is reached and BOTH training AND promotion are refused per R2.3.3: `LL_WARNING` log entry, `compression_dict_cap_reached` set in INFO, operator intervention required (raise cap or run COMPRESSION SWEEP).

Comment #4 (line 449): grace-barrier wake-up via cond_broadcast

The original step 6 proposed enqueueing barrier jobs into the SPMC inbox to force idle workers to advance generations. This doesn't actually work: under work-stealing semantics a single fast worker can drain all barrier jobs while siblings stay asleep on the cond var. Rewrote step 6 to use a wake-all primitive built on `pthread_cond_broadcast`, and added a "Wake-all primitive" paragraph to §4.6 that describes extending `mutexqueue.h` with two new APIs: a broadcast wake-all (for QSBR grace barriers, config changes, etc.) and a shutdown-signal variant (for pool teardown). Step 6 cross-references §4.6 for the mechanism.

Comment #6 (line 513): DEFERRED

Reviewer flagged that `compressionJob.key` (a `robj *` carried in the job) implies the main thread does an additional lookup at install time, doubling the per-write lookup cost. The reviewer explicitly tagged this as "follow up PR"; addressing it would require a redesign of the install-side data flow and is out of scope for the QSBR design change. Tracked as an open item; will be addressed before code lands for the install path (S2.7 in the implementation plan).
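
A minimal sketch of a wake-all primitive in the spirit of Comment #4, assuming a hypothetical `wakeGate` type; the names are illustrative and are not the actual `mutexqueue.h` API. Each worker compares a shared epoch against the last value it observed, so a single fast worker cannot consume the wake-up on behalf of its siblings the way it could drain queued barrier jobs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical wake-all gate shared by the main thread and all workers. */
typedef struct wakeGate {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    uint64_t epoch; /* bumped on every broadcast wake-all */
    int shutdown;   /* set once, for pool teardown */
} wakeGate;

void wakeGateInit(wakeGate *g) {
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->epoch = 0;
    g->shutdown = 0;
}

/* Broadcast variant: wake every sleeping worker (grace barrier, config change).
 * Returns the new epoch. */
uint64_t wakeGateBroadcast(wakeGate *g) {
    pthread_mutex_lock(&g->lock);
    uint64_t e = ++g->epoch;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
    return e;
}

/* Shutdown variant: wake every worker and tell it to exit. */
void wakeGateShutdown(wakeGate *g) {
    pthread_mutex_lock(&g->lock);
    g->shutdown = 1;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

/* Worker side: sleep until the epoch differs from `seen` or shutdown is set.
 * Returns the current epoch and stores the shutdown flag in *shutdown. */
uint64_t wakeGateWait(wakeGate *g, uint64_t seen, int *shutdown) {
    pthread_mutex_lock(&g->lock);
    while (g->epoch == seen && !g->shutdown) {
        pthread_cond_wait(&g->cond, &g->lock);
    }
    uint64_t e = g->epoch;
    *shutdown = g->shutdown;
    pthread_mutex_unlock(&g->lock);
    return e;
}
```

The predicate loop around `pthread_cond_wait` also covers spurious wake-ups: a worker only proceeds once it has personally seen the epoch move or the shutdown flag set.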

Removes one of the two time-based hot-key skip knobs. The eligibility predicate's LRU/noeviction branch now compares against a single threshold (`compression-min-idle-seconds`) instead of two.

Why this matters
----------------

The dual-knob surface, `compression-settle-seconds` ("recent-write protection") and `compression-min-idle-seconds` ("recent-access protection"), was introduced via the Thread #18 walkthrough resolution on the theory that operators benefit from being able to express two different intents.

In v1 reality this is documentation theater. Both knobs compare against the same metric (`lru_idle_secs(o)`) because Valkey's `robj->lru` field is touched on every read AND every write, gated only by `LOOKUP_NOTOUCH` and fork. v1 cannot distinguish read-recency from write-recency from this single signal. The math always reduces to:

    eligible iff lru_idle_secs(o) >= max(settle, min_idle)

Setting `settle=10, min_idle=120` is identical to the single-knob `min_idle=120`. Tested every scenario I could construct (different time scales, sweep cron sequences, heterogeneous workloads, forward-compat with v2); none give the dual surface operationally distinct behavior in v1.

The only non-trivial argument for keeping both was forward-compat: if v2 adds a per-object write-time field, the dual knob becomes meaningful. But adding a config in v2 is non-breaking; existing operators on v1 see no change. Removing later is harder than adding later.

Plus the dual surface is an active footgun: operators tuning the two knobs differently expecting different effects get a confusing no-difference outcome. PR #1 Thread #3 specifically pushed back on "too many knobs"; that pressure applies here.

Per YAGNI, ship v1 with the single knob. v2 reintroduces a write-time-specific knob non-breakingly when per-object write-time tracking lands.

Code changes
------------

- src/server.h: remove `compression_settle_seconds` field.
- src/config.c: remove the `createIntConfig` registration.
- src/compression.c: drop the second `idle_secs >= settle` check in `compressionIsEligible`'s LRU branch. Updated the comment block to reflect single-signal reality.
- src/unit/test_compression_eligibility.cpp:
  - Drop `LruRejectsBetweenSettleAndMinIdle` (test of dual-knob max-wins behavior, no longer applicable).
  - Replace `LruRejectsRecentTouch` / `LruAcceptsBeyondBothThresholds` / `LruZeroThresholdsAcceptImmediately` with single-knob equivalents (`LruRejectsRecentTouch`, `LruAcceptsBeyondThreshold`, `LruAtThresholdAcceptsBoundary`, `LruZeroThresholdAcceptsImmediately`).
  - Drop `compression-settle-seconds` from `LfuTimeKnobsAreInactive` and rename to `LfuTimeKnobIsInactive`.
- tests/unit/type/compression.tcl: drop the `compression-settle-seconds` config-default assertion; update comment from "Advanced (11)" to "Advanced (10)".

Doc changes
-----------

- detailed-design.md §2.2 R2.2 predicate: hot_key_check helper now has one comparison in the LRU/noeviction branch instead of two. The rationale paragraph below the predicate explains the v1 single-signal reality and the YAGNI motivation for dropping the second knob; future v2 reintroduction noted.
- detailed-design.md §2.12 advanced config table: 11 → 10 knobs; `compression-settle-seconds` row removed; `compression-min-idle-seconds` description simplified.
- detailed-design.md §7.1 transparency-mode harness config: drop the `--compression-settle-seconds 0` line so the harness doesn't pass an unknown option.
- idea-honing.md Q6 baseline filter bullet: collapse the two-bullet LRU branch into a single bullet; add an _italicized rationale paragraph_ explaining why the second knob was dropped (preserves the historical thinking for future readers).
- idea-honing.md Q6 consolidated predicate: matches detailed-design.md.
- idea-honing.md Q6 config table: drop the `compression-settle-seconds` row.
- idea-honing.md Q6a answer: rewrite to reflect single-knob reality with reference to "S2.2 implementation review" so future readers can trace this refinement chain (Thread #18 → Thread #19 → S2.2 refinement).
- idea-honing.md §7.1 harness config: drop the `--compression-settle-seconds 0` line.
- implementation/plan.md S2.2 description: simplify to "policy-aware hot-key skip" + the actual operator-facing knobs.
- summary.md: update the eligibility table row + walkthrough-highlights bullet to reflect the policy-aware single-knob outcome.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally unchanged; they capture decisions at a point in time. The walkthrough Thread #18/#19 resolutions stand as written; only the implementation interpretation in the live design docs is refined.

Verified locally
----------------

- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes (the Tcl fixture's config-default assertion was updated in lockstep with the C-side removal, so the integration test catches any drift between src/config.c and tests/unit/type/compression.tcl).

Not verified locally (CI will validate):
- gtest unit tests (no libgtest-dev locally).

Test count delta
----------------

S2.2 gtest: 16 tests → 14 tests (dropped 2, simplified 2 to remove the dual-knob exercise paths).
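
The `max(settle, min_idle)` reduction claimed in this commit message can be checked mechanically. The sketch below uses hypothetical function names and plain integers in place of `robj` and server config; it is not the code in src/compression.c. It compares the old two-comparison branch against a single comparison with the larger threshold over a small grid of values.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t max_u64(uint64_t a, uint64_t b) {
    return a > b ? a : b;
}

/* Old LRU/noeviction branch: two comparisons against the same idle metric. */
int hotKeyCheckDual(uint64_t idle_secs, uint64_t settle, uint64_t min_idle) {
    return idle_secs >= settle && idle_secs >= min_idle;
}

/* New branch: one threshold. */
int hotKeyCheckSingle(uint64_t idle_secs, uint64_t min_idle) {
    return idle_secs >= min_idle;
}

/* Returns 1 if the dual form equals the single form with max(settle, min_idle)
 * for every combination of values in [0, limit], 0 on the first mismatch. */
int dualEqualsSingleWithMax(uint64_t limit) {
    for (uint64_t idle = 0; idle <= limit; idle++) {
        for (uint64_t settle = 0; settle <= limit; settle++) {
            for (uint64_t min_idle = 0; min_idle <= limit; min_idle++) {
                int dual = hotKeyCheckDual(idle, settle, min_idle);
                int single = hotKeyCheckSingle(idle, max_u64(settle, min_idle));
                if (dual != single) return 0;
            }
        }
    }
    return 1;
}
```

With `settle=10, min_idle=120`, a key idle for 60 seconds is rejected by both forms and a key idle for 120 seconds is accepted by both, matching the example in the message.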

* docs(design): align eligibility predicate; document policy-aware hot-key skip

Two corrections discovered while preparing S2.2 implementation:

1. The two design docs had drifted on minor wording (idea-honing.md said `obj->encoding == RAW` and `obj->refcount != SHARED`; detailed-design.md said `OBJ_ENCODING_RAW` / `OBJ_SHARED_REFCOUNT`). Predicates now match exactly across both docs, using the actual C constants from src/server.h. Per the agreed convention: keep the exact predicate inline in both docs (different audiences both need it readable in place) rather than a cross-reference.

2. The Thread #18/#19 walkthrough resolutions made a claim that doesn't match the existing Valkey source: "the existing LRU field already provides the signal for every eviction policy" / time-based checks "work uniformly across LRU, LFU, and noeviction" / `estimateObjectIdleTime()` is "a reliable read-hotness signal universally."

   Reading src/lrulfu.h:
   - LRU and noeviction policies: robj->lru is seconds-based. lru_idle_secs(o) returns real seconds. Time-based thresholds work as designed.
   - LFU policy: robj->lru encodes 16 bits "last eval time in minutes" + 8 bits approximate freq counter. There is no per-second access timestamp. lru_idle_secs(o) would misinterpret the bits.

   The function `estimateObjectIdleTime()` referenced by the design does not exist in current Valkey; the closest available helper is `lrulfu_getIdleness()` which returns `UINT8_MAX - freq` in LFU mode (a 0..255 freq-derived heuristic, NOT seconds).

The fix is policy-aware checks, not policy-uniform:
- LRU and noeviction: apply settle-seconds AND min-idle-seconds against lru_idle_secs(o).
- LFU: skip the time-based thresholds entirely (the metric is wrong unit). Apply compression-lfu-threshold against the freq counter, already in the predicate as the LFU branch.

The dual-knob operator surface (`compression-settle-seconds` and `compression-min-idle-seconds`) is preserved across modes; in LRU/noeviction both knobs apply to the same metric (since robj->lru is touched on every read AND write, v1 cannot distinguish source) so the effective threshold is `max(settle, min_idle)`. Operators get to express two intents.

The Thread #18/#19 resolutions stand as written in DESIGN_TODO.md (audit trail; the decision to use the existing LRU field rather than add a new write-time field is unchanged); only the implementation interpretation in the live design docs is refined.

Updated:
- detailed-design.md §2.2 R2.2 predicate + new explanatory paragraph below it.
- detailed-design.md §2.12 config table: settle-seconds and min-idle-seconds descriptions now correctly note "Inactive in LFU mode."
- idea-honing.md Q6 baseline filter bullet (rewritten as "policy-aware" with sub-bullets per policy).
- idea-honing.md Q6 consolidated predicate (now identical to detailed-design.md §2.2).
- idea-honing.md Q6a answer: rewritten with the policy-aware framing; cross-references "S2.2 implementation review" so future readers can trace this refinement.

* Inline compression: S2.2 - eligibility predicate

Implements R2.2 / Q6 `compressionIsEligible(robj *o, const sds key)`, replacing the Phase 0 stub that returned 0 for every value.

The predicate has six gates, evaluated in cheapest-first order so the master switch short-circuits early when the feature is disabled:

1. Master switch (server.compression_enabled).
2. Type + encoding gate. STRING values only (R2.2, Q6c). Of the four string encodings, only OBJ_ENCODING_RAW is a candidate:
   - INT: already memory-optimal.
   - EMBSTR: ≤44 B, header overhead erases any savings (Threads #17 / #21 explicitly excluded).
   - COMPRESSED: defense-in-depth no-op for double-compress.
3. Refcount gate. Shared RESP constants are never installed in a db (lookupKey asserts this); we mirror the assertion as a safety check.
4. Size bounds. compression-min-value-size (lower) prevents wasting CPU on values too small to recoup the per-value header (~16 B). compression-max-value-size (upper, 0 = disabled) caps worst-case sync-decompression latency on the main thread (~1 µs/KB).
5. Hot-key skip, POLICY-AWARE per the corrected R2.2:
   - LRU and noeviction: robj->lru is seconds-based. Apply both compression-settle-seconds (recent-write proxy) and compression-min-idle-seconds (recent-read proxy) against lru_getIdleSecs(o->lru). v1 cannot distinguish source (robj->lru is touched on read AND write); the dual surface lets operators express two intents that share an underlying signal. Effective threshold is max(settle, min_idle).
   - LFU: robj->lru encodes a freq counter (no per-second timestamp). Time-based knobs are inactive in this mode. Apply compression-lfu-threshold against the freq counter via lfu_getFrequency(), which mirrors the standard Valkey decay-on-read pattern (objectGetLFUFrequency in src/object.c). The signature is `robj *o` (not `const robj *`) because of this in-place decay.
6. Incompressible-keys retry guard. Stubbed as "always retry-eligible" in S2.2; S2.3 lands the side hashtable and wires compressionRetryEligible(key) here.

Test coverage in src/unit/test_compression_eligibility.cpp (16 tests, auto-discovered by src/unit/Makefile's `wildcard *.cpp`):
- Master switch off / on.
- Each rejection branch:
  * non-STRING type
  * INT, EMBSTR, COMPRESSED encodings
  * shared refcount
  * size below min, above max
- Size-bound boundary cases (at exact min, at exact max, max=0 disables upper bound).
- LRU branch:
  * recent touch (idle < settle): rejected
  * idle between settle and min_idle (max wins): rejected
  * cold key (idle >= max(settle, min_idle)): accepted
  * zero thresholds: accept immediately
- noeviction policy: same code path as LRU per R2.2.
- LFU branch:
  * freq at threshold: rejected (>= comparison)
  * freq above threshold: rejected
  * freq below threshold: accepted
  * time-based knobs are inactive even at INT_MAX values.

The fixture saves and restores `server.compression_*` and `server.maxmemory_policy`, and re-syncs lrulfu's cached `is_using_lfu_policy` boolean via `lrulfu_updateClockAndPolicy()` on both setup and teardown so tests don't leak policy state into each other.

Verified locally:
- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes (the Phase 0 integration fixture exercises feature-off semantics; with compression-enabled still 0 by default, eligibility is never consulted in the integration server).

Not verified locally (CI will validate):
- gtest unit tests on Linux/macOS/32-bit (no libgtest-dev locally).

S2.3 (incompressible-keys hashtable) wires the last branch and ships its own gtest coverage. After that, S2.4–S2.10 wire the rest of the hot path.

* docs(design): rewrite eligibility predicate's hot-key check in branched form

Per review feedback during S2.2 review: the previous form encoded the policy split using short-circuit booleans:

    && (lfu_mode || lru_idle_secs(obj) >= compression-settle-seconds)
    && (lfu_mode || lru_idle_secs(obj) >= compression-min-idle-seconds)
    && (!lfu_mode || lfu_freq(obj) < compression-lfu-threshold)

which is logically correct but reads awkwardly. Three lines mention `lfu_mode` (twice unprimed, once primed); the reader has to mentally short-circuit twice to see that line 1+2 fire only in LRU/noeviction and line 3 fires only in LFU. It also looks at first glance like the predicate might be using `compression-lfu-threshold` as an LRU-mode threshold.

Replaced with a branched helper that mirrors how the C implementation's if/else branches:

    && hot_key_check(obj)   // policy-aware

where hot_key_check(obj) is:

    if lfu_mode:
        lfu_freq(obj) < compression-lfu-threshold
    else:
        lru_idle_secs(obj) >= compression-settle-seconds
        AND lru_idle_secs(obj) >= compression-min-idle-seconds

Same behavior; the implementation in src/compression.c (`compressionIsEligible()`) already uses this exact branching shape, and the docs now match it visually.

Updated:
- detailed-design.md §2.2 R2.2 predicate.
- idea-honing.md Q6 consolidated predicate.

Both docs were already aligned (per the previous predicate-alignment commit); they remain identical with this rewrite. The explanatory paragraph below the §2.2 predicate (LRU vs LFU lru-field encoding) already covers the rationale and is unchanged.

* Inline compression: drop compression-settle-seconds knob (YAGNI)

* Cleanup: untrack proposal-issue.md; mark S2.2 complete in plan.md

Two small fixes to the previous commit's collateral:

1. proposal-issue.md was inadvertently committed via `git add -A` in the previous commit. The file is a working draft of the upstream issue (already tracked in the valkey-io issue tracker) and doesn't belong in the planning directory. Removing.
2. plan.md still showed S2.2 as `[ ]`. Implementation-complete state matches the S2.1 marking convention (`[x]` once the task ships); on merge to unstable the marking becomes definitive.
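
A simplified model of the six-gate ordering from the S2.2 commit, using the single-knob hot-key branch that the later commit settles on. Everything here is a stand-in: `valueModel`, `configModel`, the enum values, and the inclusive size bounds are assumptions for illustration, not the real `robj`, `server` fields, or the code in src/compression.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

enum { TYPE_STRING, TYPE_LIST };                        /* stand-ins for OBJ_* types */
enum { ENC_RAW, ENC_INT, ENC_EMBSTR, ENC_COMPRESSED };  /* stand-ins for encodings */
#define SHARED_REFCOUNT INT_MAX                         /* stand-in for OBJ_SHARED_REFCOUNT */

typedef struct valueModel {
    int type;
    int encoding;
    int refcount;
    size_t len;
    uint64_t idle_secs; /* meaningful only in LRU/noeviction mode */
    unsigned lfu_freq;  /* meaningful only in LFU mode */
} valueModel;

typedef struct configModel {
    int enabled;            /* master switch */
    size_t min_size;        /* lower size bound */
    size_t max_size;        /* upper size bound, 0 = disabled */
    int lfu_mode;           /* 1 when an LFU policy is active */
    uint64_t min_idle_secs; /* LRU/noeviction threshold */
    unsigned lfu_threshold; /* LFU threshold */
} configModel;

/* Gate 5: branch on policy instead of chaining short-circuit booleans. */
static int hotKeyCheck(const configModel *c, const valueModel *v) {
    if (c->lfu_mode) return v->lfu_freq < c->lfu_threshold;
    return v->idle_secs >= c->min_idle_secs;
}

int eligibleModel(const configModel *c, const valueModel *v) {
    if (!c->enabled) return 0;                                    /* 1. master switch */
    if (v->type != TYPE_STRING || v->encoding != ENC_RAW) return 0; /* 2. type + encoding */
    if (v->refcount == SHARED_REFCOUNT) return 0;                 /* 3. refcount */
    if (v->len < c->min_size) return 0;                           /* 4. size bounds */
    if (c->max_size != 0 && v->len > c->max_size) return 0;       /*    (assumed inclusive) */
    if (!hotKeyCheck(c, v)) return 0;                             /* 5. hot-key skip */
    return 1;                                                     /* 6. retry guard stubbed */
}
```

Putting the master switch first means a disabled feature costs one comparison per call, which is the cheapest-first ordering the commit describes.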

Two reviewer threads addressed:

Thread #1 (T-3369017721): production code carrying test concerns

The drain handler had an `if (job->value == NULL)` branch that only existed to handle test-only jobs from testOnlyCompressionWorkersEnqueueRaw. Reviewer correctly pointed out that production code shouldn't carry test-only branches.

Fix: replaced with serverAssert(job->value != NULL) at the top of the per-job loop. Production drain assumes every job has a real pinned robj; tests must extract their value=NULL jobs via testOnlyCompressionWorkersDrainOutbox before this drain runs.

Side effect: removed the conditional `if (job->value != NULL)` guards around decrRefCount and the install branch. The top-of-loop assert means every code path can assume value is non-NULL.

Thread #2 (T-3356207626): design doc out of sync with implementation

Design §4.6 still described the original version-counter approach for staleness detection (`uint64_t version` field on compressionJob, "if version counter moved, discard"). The implementation has used pointer equality + the incrRefCount-pin since S2.4 PR #13.

Fix: updated §4.6 to:
- compressionJob struct: drop `version`, drop `robj *key`, add `robj *value` (pinned via incrRefCount), and `sds src` and `int dbid` separately, matching the actual struct.
- Concurrency notes: replaced the "version counter moved" bullet with the pointer-equality + ABA-safety reasoning, naming the incrRefCount-reserves-the-address invariant as the protection mechanism (same property explained in PR #18 review).

Verified locally:
- make -j2 -C src → clean
- ./runtest --single unit/type/compression → 10/10 pass
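
A toy model of the pointer-equality stale check and why the pin makes it ABA-safe. The `objModel` and `jobModel` types and their helpers are hypothetical stand-ins for `robj`, `compressionJob`, `incrRefCount`, and `decrRefCount`. While the job holds a reference, the old object cannot be freed, so no newer object can be allocated at the same address and be mistaken for it.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct objModel {
    int refcount;
} objModel; /* stand-in for robj */

objModel *objCreate(void) {
    objModel *o = malloc(sizeof(*o));
    assert(o != NULL);
    o->refcount = 1;
    return o;
}

void objIncr(objModel *o) {
    o->refcount++;
}

/* Drops one reference; returns 1 if the object was freed. */
int objDecr(objModel *o) {
    if (--o->refcount == 0) {
        free(o);
        return 1;
    }
    return 0;
}

typedef struct jobModel {
    objModel *value; /* pinned at enqueue time */
} jobModel;

/* Stale when the key is gone (slot == NULL) or now holds a different object. */
int jobIsStale(objModel **slot, const jobModel *job) {
    assert(job->value != NULL); /* mirrors the top-of-loop serverAssert */
    return slot == NULL || *slot != job->value;
}
```

In the usage below, an overwrite installs a new object while the job is in flight. The old object survives on the job's reference alone, the check reports the job as stale, and the final release of the pin is what actually frees it.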

5 gtest cases failed on build-32bit (and would on every test cell) with the new production-drain serverAssert(job->value != NULL):

    ASSERTION FAILED: compression_workers.c:591 'job->value != NULL'

in: SingleJobRoundTrip, BurstOf256JobsOneWorker, BurstOf1024JobsFourWorkers, ResizeAcrossEnqueuedJobs, NetSavingsGuardRejectsIncompressible

Root cause: the previous commit's reviewer-driven hardening (PR #19 review thread #1) made the production drain assert that every job has a non-NULL pinned robj. The premise was "tests use the testOnly drain to extract jobs before the production drain runs". That premise was wrong: many tests ALSO call compressionWorkersDrainOutbox directly to consume-and-dispose test-mode jobs (the drainUntil helper is the most-used path).

Fix: add testOnlyCompressionWorkersDrainAndDispose(budget), which pulls jobs via the existing testOnlyCompressionWorkersDrainOutbox, frees them via testOnlyCompressionWorkersFreeJob, and returns the count. Migrate the test fixture's drainUntil helper and all 8 direct compressionWorkersDrainOutbox call sites in the test file to the new helper.

Production drain stays clean, with no test concerns. Reviewer thread #1 intent preserved.

Verified locally:
- make -j2 -C src SERVER_CFLAGS=-Werror → clean
- ./runtest --single unit/type/compression → 10/10 pass
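
A sketch of the drain-and-dispose shape described in the fix, assuming a hypothetical singly linked `outboxModel` in place of the real worker outbox. It is not the body of testOnlyCompressionWorkersDrainAndDispose; it only shows the pull, free, and count loop bounded by a budget.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct jobNode {
    struct jobNode *next;
} jobNode; /* stand-in for a test-mode job */

typedef struct outboxModel {
    jobNode *head;
} outboxModel;

/* Adds one job; returns 1 on success, 0 on allocation failure. */
int outboxPush(outboxModel *q) {
    jobNode *n = malloc(sizeof(*n));
    if (n == NULL) return 0;
    n->next = q->head;
    q->head = n;
    return 1;
}

/* Removes one job, or returns NULL when the outbox is empty. */
jobNode *outboxPop(outboxModel *q) {
    jobNode *n = q->head;
    if (n != NULL) q->head = n->next;
    return n;
}

/* Pull up to `budget` jobs, free each one, and return how many were disposed. */
size_t drainAndDispose(outboxModel *q, size_t budget) {
    size_t n = 0;
    while (n < budget) {
        jobNode *j = outboxPop(q);
        if (j == NULL) break;
        free(j);
        n++;
    }
    return n;
}
```

Keeping this helper on the test side is what lets the production drain keep its unconditional assert.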

* [S2.7] Compression write-path hook Wires compressionEnqueueCandidate into dbAddInternal and dbSetValue, and replaces the TODO(S2.7) placeholder in the drain handler with a real install path. With this change, writes to eligible STRING values get queued for background compression and the result is installed back into the kvstore as an OBJ_ENCODING_COMPRESSED robj. The decoder (S2.6) is shipped but not yet wired into read paths (S2.8), so as long as compression-enabled stays no (default), behavior is unchanged. Once an operator turns the switch on, written values get compressed, but reads return the compressed bytes until S2.8 lands. Existing transparency tests verify no regression in the default-off configuration. Producer side (compression.c, db.c) Two seams in db.c — end of dbAddInternal and end of dbSetValue — call compressionEnqueueCandidate(key, value, db->id). The candidate function applies four guards: 1. Master switch (compression_enabled, via compressionIsEligible). 2. R2.2 eligibility (type/encoding/size/hot-key — also via predicate). 3. R2.1.5 active-dict check — saves an allocator round-trip when compression-enabled=yes but training hasn't completed. 4. incrRefCount(value) — pins the bytes for the worker AND reserves the robj address for the drain handler's pointer- equality stale check (ABA-safe per R2.4.4 + the lifetime discussion in PR #18). If the worker pool refuses (not started; future S2.11 inbox full), the pin is released immediately. RDB-load enqueue is deliberately skipped — TODO(S2.10): the sweep tick will rediscover RDB-loaded values without hammering the inbox during load. API change: compressionWorkersEnqueue Old: compressionWorkersEnqueue(sds key, int dbid, uint64_t version, sds src) New: compressionWorkersEnqueue(robj *value, int dbid) The new form requires a pinned robj; the worker reads objectGetVal(value) once at enqueue (captured into job->src) and never touches the robj afterwards (R2.11.4 intact). 
The drain handler uses job->value for the kvstore lookup and the pointer-equality stale check. The version field is gone: pointer equality, made ABA-safe by the pin, is sufficient. R2.4.4 explains why: holding incrRefCount(value) prevents the allocator from reusing the address while the job is in flight.

Drain install (compression_workers.c)

New compressionInstall() helper:

1. void **slot = kvstoreHashtableFindRef(db->keys, didx, key_sds);
2. If slot == NULL OR *slot != job->value: stale (overwrite, expire, or COW). Discard.
3. Else: createCompressedObject(OBJ_STRING, job->dst, job->dst_len); dbReplaceValue installs.
4. compressionRegistryIncRef(job->dict_id) on success.

dbReplaceValue routes through dbSetValue(..., overwrite=0, ...), which does NOT call signalModifiedKey, moduleNotifyKeyUnlink, or signalDeletedKeyAsReady. Background compression is a storage-only change per R2.9.2: no WATCH dirty_cas, no client-side-caching invalidations, no keyspace notifications.

Pin released on every drain completion path (success, stale-discard, net-savings reject, ZSTD error, no-active-dict). Test-mode jobs (job->value == NULL) skip both install and decRef.

Test migration

The 15 existing test-fixture call sites passed raw sds + dummy version. Migrated to a new testOnlyCompressionWorkersEnqueueRaw(src, dbid) that sets job->value = NULL. Tests extract jobs via testOnlyCompressionWorkersDrainOutbox before the production drain runs, so production-only paths (install, decRef) are never reached by the value=NULL sentinel.

No new gtest cases for the install path itself, since that requires a fully-initialized server.db / kvstore that the unit-test environment doesn't construct. End-to-end coverage will come from the Tcl transparency harness once S2.8 wires the read path.

TODO(S4.1) markers added at:

- compressionInstall: compression_compressions_per_sec, EMA fold, compression_compressed_objects.
- compressionEnqueueCandidate: compression_candidates_dropped_total when S2.11 lands (today the pool-not-started rejection is a config state, not back-pressure).

Verified locally:

- make -j2 -C src → clean (BUILD_ZSTD=yes default).
- make -j2 -C src BUILD_ZSTD=no → clean.
- ./runtest --single unit/type/compression → 10/10 pass.

gtest unit tests not runnable locally; CI validates.

Diff stat:

.../implementation/plan.md            |   4 +-
src/compression.c                     |  35 +++-
src/compression.h                     |  27 ++-
src/compression_workers.c             | 185 +++++++++++++++------
src/compression_workers.h             |  56 +++----
src/db.c                              |  14 ++
src/unit/test_compression_workers.cpp |  31 ++--
7 files changed, 244 insertions(+), 108 deletions(-)

* [S2.7] PR #19 review: assert + design-doc alignment

Two reviewer threads addressed:

Thread #1 (T-3369017721): production code carrying test concerns

The drain handler had a `if (job->value == NULL)` branch that only existed to handle test-only jobs from testOnlyCompressionWorkersEnqueueRaw. Reviewer correctly pointed out that production code shouldn't carry test-only branches.

Fix: replaced with serverAssert(job->value != NULL) at the top of the per-job loop. Production drain assumes every job has a real pinned robj; tests must extract their value=NULL jobs via testOnlyCompressionWorkersDrainOutbox before this drain runs.

Side effect: removed the conditional `if (job->value != NULL)` guards around decrRefCount and the install branch. The top-of-loop assert means every code path can assume value is non-NULL.

Thread #2 (T-3356207626): design doc out of sync with implementation

Design §4.6 still described the original version-counter approach for staleness detection (`uint64_t version` field on compressionJob, "if version counter moved, discard"). The implementation has used pointer equality + the incrRefCount-pin since S2.4 PR #13.
Fix: updated §4.6 to:

- compressionJob struct: drop `version`, drop `robj *key`, add `robj *value` (pinned via incrRefCount), and `sds src` and `int dbid` separately, matching the actual struct.
- Concurrency notes: replaced the "version counter moved" bullet with the pointer-equality + ABA-safety reasoning, naming the incrRefCount-reserves-the-address invariant as the protection mechanism (same property explained in PR #18 review).

Verified locally:

- make -j2 -C src → clean
- ./runtest --single unit/type/compression → 10/10 pass

* [S2.7] Fix CI: remove erroneous & on server.db indexing

build-32bit (and the 30+ downstream cells, all CI cells use -Werror):

compression_workers.c:531:20: error: initialization of 'serverDb *' from incompatible pointer type 'serverDb **' [-Werror=incompatible-pointer-types]

`server.db` is `serverDb **` (array of pointers, one per DB). So `server.db[i]` is already `serverDb *`; the address-of operator was redundant and produced `serverDb **`.

Fix: drop the `&`. Matches the pattern used everywhere else in the codebase (db.c, server.c, etc.).

Local make didn't catch this because the default SERVER_CFLAGS doesn't include -Werror. CI does. Built locally with `make SERVER_CFLAGS=-Werror` to confirm clean.

* [S2.7] Fix CI: tests must use testOnly drain for value=NULL jobs

5 gtest cases failed on build-32bit (and would on every test cell) with the new production-drain serverAssert(job->value != NULL):

ASSERTION FAILED: compression_workers.c:591 'job->value != NULL'

in: SingleJobRoundTrip, BurstOf256JobsOneWorker, BurstOf1024JobsFourWorkers, ResizeAcrossEnqueuedJobs, NetSavingsGuardRejectsIncompressible

Root cause: the previous commit's reviewer-driven hardening (PR #19 review thread #1) made the production drain assert that every job has a non-NULL pinned robj. The premise was "tests use the testOnly drain to extract jobs before the production drain runs".
That premise was wrong: many tests ALSO call compressionWorkersDrainOutbox directly to consume-and-dispose test-mode jobs (the drainUntil helper is the most-used path).

Fix: add testOnlyCompressionWorkersDrainAndDispose(budget), which pulls jobs via the existing testOnlyCompressionWorkersDrainOutbox, frees them via testOnlyCompressionWorkersFreeJob, and returns the count. Migrate the test fixture's drainUntil helper and all 8 direct compressionWorkersDrainOutbox call sites in the test file to the new helper.

Production drain stays clean: no test concerns. Reviewer thread #1 intent preserved.

Verified locally:

- make -j2 -C src SERVER_CFLAGS=-Werror → clean
- ./runtest --single unit/type/compression → 10/10 pass
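The pointer-equality stale check from the "Drain install" notes, together with the rule that the pin is released on every completion path, can be modelled with plain pointers. The sketch below is illustrative only: `mockObj`, `mockJob`, and `mockInstall` are hypothetical names, the keyspace slot is a bare `mockObj **`, and refcounts are decremented without freeing anything so the effect stays observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for robj. */
typedef struct mockObj {
    int refcount;
    bool compressed;
} mockObj;

/* Hypothetical job: remembers the value that was pinned at enqueue time. */
typedef struct mockJob {
    mockObj *value;
} mockJob;

typedef enum { INSTALL_OK, INSTALL_STALE } installResult;

/* `slot` is the address of the keyspace entry's value pointer, or NULL if
 * the key no longer exists. Because the job holds a reference, the old
 * object's address cannot be recycled for a different object while the job
 * is in flight, so pointer equality identifies "still the same value". */
installResult mockInstall(mockObj **slot, mockJob *job, mockObj *compressed) {
    installResult result;
    if (slot == NULL || *slot != job->value) {
        result = INSTALL_STALE;   /* overwritten, expired, or copied: discard */
    } else {
        *slot = compressed;       /* storage-only swap, no notifications */
        job->value->refcount--;   /* the keyspace drops its reference */
        result = INSTALL_OK;
    }
    job->value->refcount--;       /* release the pin on every path */
    job->value = NULL;
    return result;
}
```

In this model a version counter would add nothing: while the pin is held, a matching pointer in the slot can only be the object that was enqueued.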

Inline compression design (initial draft)

290e0e1

Signed-off-by: ikolomi <ikolomin@amazon.com>

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

ikolomi commented Apr 30, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

GilboaAWS and others added 2 commits May 3, 2026 12:41

Update architecture.md - Fix Mermaid Parsing error

d1953b5

Signed-off-by: Gilboab <97948000+GilboaAWS@users.noreply.github.com>

Merge pull request #2 from GilboaAWS/patch-1

a02c063

Update architecture.md - Fix Mermaid Parsing error

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

GilboaAWS reviewed May 5, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

GilboaAWS reviewed May 6, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented May 6, 2026

Comment thread .agents/planning/realtime-data-compression/design/detailed-design.md Outdated

ikolomi commented May 6, 2026

Comment thread .agents/planning/realtime-data-compression/design/detailed-design.md

ikolomi commented May 6, 2026

Comment thread .agents/planning/realtime-data-compression/design/detailed-design.md

ikolomi commented May 6, 2026

Comment thread .agents/planning/realtime-data-compression/design/detailed-design.md

ikolomi commented May 6, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md

ikolomi commented May 6, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi commented May 6, 2026

Comment thread .agents/planning/realtime-data-compression/idea-honing.md Outdated

ikolomi merged commit aaf42fd into unstable May 12, 2026

ikolomi mentioned this pull request Jun 2, 2026

docs(planning): sweep stale references after PR #14 / PR #15 #16

Merged

Conversation

ikolomi commented Apr 27, 2026

ikolomi commented May 11, 2026

Design walkthrough complete — 31/31 review threads addressed

Substantive design changes merged during the walkthrough

2 participants