[S2.9 + S2.10] Sweep state machine + COMPRESSION SWEEP command#26
Closed
ikolomi wants to merge 11 commits into
Closed
[S2.9 + S2.10] Sweep state machine + COMPRESSION SWEEP command#26ikolomi wants to merge 11 commits into
ikolomi wants to merge 11 commits into
Conversation
Wires the sweep tick into compressionCron (S2.10 — the cron wiring itself was already done by PR #20) and adds the operator-driven COMPRESSION SWEEP command (S2.9). The sweep walks the keyspace shard-at-a-time across cron ticks, applying either the compress eligibility predicate (default direction=compress) or the in-place permanent-decompress (direction=decompress, used for the explicit drain step after disabling the feature per R2.1.4). Architecture mirrors @GilboaAWS's compressionTrainCron pattern from PR #20: monotime-budgeted state machine spliced across cron ticks with a persistent (DB index, kvstore cursor) cursor across ticks. All new logic lives in a self-contained pair of files (src/compression_sweep.{c,h}) — same separation as compression_train.{c,h}. Triggers (any of): - master-switch transition compression-enabled no → yes (R2.1.3) auto-initiates a compress-direction sweep via a new apply hook on the bool config (applyCompressionEnabled in src/config.c). - explicit `COMPRESSION SWEEP [direction=compress|decompress]` operator command (R2.1.4 covers the disable-side drain via direction=decompress). Per-tick budget: derived from compression-sweep-max-cpu-pct (already registered in PR #4 / phase 0; default 25) and server.hz (default 10) — yields ~25 ms per tick at defaults. Floor at 10 µs to guarantee progress on a misconfigured pacing knob. Single-flight model: only one sweep at a time. Overlapping requests during an in-flight sweep are dropped (the operator can re-issue). This matches R2.1.4's 'operator owns the timing' property and avoids a more complex queue + priority model that v1 doesn't need. The sweep callback either calls compressionEnqueueCandidate (which internally applies the R2.2 eligibility predicate, the active-dict guard from R2.1.5, and the pin from R2.4.4) for compress-direction, or calls compressionPermanentlyDecompress for decompress-direction. Neither path calls signalModifiedKey (R2.9.2 — storage change, not a logical change). Includes the same `if (server.db[j] == NULL) continue;` guard that PR #25 added to compression_train.c — server.db is sparse, slots 1..dbnum-1 are NULL until materialized via createDatabaseIfNeeded. Also extends the COMPRESSION command dispatcher (src/compression.c) with the SWEEP subcommand, replaces the Phase 0 stub `compressionSweep(client *c, int direction)` with a real implementation, updates the HELP text, and adds the new src/commands/compression-sweep.json metadata (commands.def is auto-regenerated by utils/generate-command-code.py). Out of scope: - S2.11 — bounded inbox + back-pressure counters. Production safety net; lands in a follow-up PR once the sweep produces sustained load. - Drift evaluation hook for the training-side retrain trigger (S1.4) — will be wired by @GilboaAWS when S1.4 lands. Verified locally: - make -j2 -C src SERVER_CFLAGS=-Werror — clean (BUILD_ZSTD=yes, default). - make -C src distclean && make -j2 -C src SERVER_CFLAGS=-Werror BUILD_ZSTD=no — clean. - ./runtest --single unit/type/compression — all tests pass (including PR #25's regression test). - Manual smoke test: enable feature, observe auto-triggered sweep in log; explicit COMPRESSION SWEEP and SWEEP direction=decompress both observed in log; HELP output shows the new subcommand; server stays up across all sweeps. Diff stat: cmake/Modules/SourceFiles.cmake | 1 + src/Makefile | 1 + src/commands.def | 23 +++ src/commands/compression-sweep.json | 30 ++++ src/compression.c | 68 ++++++++- src/compression_sweep.c | 294 ++++++++++++++++++++++++++++ src/compression_sweep.h | 80 ++++++++++ src/config.c | 33 +++- 8 files changed, 524 insertions(+), 6 deletions(-)
PR #26 ships the sweep state machine and the COMPRESSION SWEEP command (S2.9), plus the sweep tick wired into compressionCron (S2.10 — the cron wiring itself was already done by PR #20). Updates the @ikolomi-track checklist in implementation/plan.md to mark both sub-tasks done and capture the as-implemented scope. After PR #26 merges, the @ikolomi-track items remaining are: - S2.11 — bounded inbox + back-pressure counters - S5.1 — valkey-benchmark extensions - S5.2 — canonical scenarios harness - S5.3 — perf dashboard S5.x is operationally blocked on @GilboaAWS's S1.4 (drift-retrain trigger) — there's nothing to benchmark until training closes the loop and live compression is real.
Adds src/unit/test_compression_sweep.cpp — gtest-level integration test for the keyspace sweep state machine. Drives the full 'enable feature → sweep → values get compressed → decompress sweep → values back to RAW' round-trip in a single test, inspecting robj->encoding directly rather than going through the (not-yet- implemented) OBJECT ENCODING / INFO compression observability surfaces. Test fixture mirrors test_compression_train.cpp from PR #20: fake server.db with a kvstore, registry + sweep + workers initialized, bio threads spawned (via bioInit). Bypasses real training via the installSyntheticDict() helper (replicated from test_compression_workers.cpp's pattern; both anonymous-namespace copies have internal linkage so they don't collide at link time). Tests: - CompressDecompressRoundTrip (headline) — half eligible, half too-small, verify only eligible keys flip to COMPRESSED on the compress sweep, then ALL keys back to RAW on the decompress sweep. Also asserts the live atomic counters (compressionGetTotalUncompressedBytes / compressionGetTotalCompressedBytes) climb during compression and drain to 0 after decompression. - EmptyKeyspaceCompletes — empty kvstore, sweep finishes in <5 iterations; no enqueues. - NoActiveDictGracefulNoOp — registry empty, eligibility holds but compressionEnqueueCandidate's active-dict guard short-circuits; no values get compressed; no crash. - MasterSwitchOffMakesCronNoOp — request queued, but cron tick early-returns on !compression_enabled; re-enabling picks up the pending request. - SparseDbIteratesPastNullSlots — server.dbnum=16 with only db[0] and db[5] populated, rest NULL. Same NULL-deref class PR #25 fixed in totalDbKeys; the test verifies my matching NULL guard in advanceScan handles the sparse-DB iteration. - SingleFlightRejectsConcurrentRequest — pacing throttled to 1% so the first sweep takes >1 tick; second request returns 0 while first is in flight; restoring pacing lets the first finish. All tests #ifdef USE_ZSTD-gated since they require a working ZSTD encoder + the synthetic-dict helper. Test runtime: <1 second total (no real training, no real cron, no bio job dependency on the wall clock). Diff stat: src/unit/test_compression_sweep.cpp | 501 +++++++++++++++++++++ 1 file changed, 501 insertions(+)
clang-format-check failures: trailing-comment column alignment in
the compressionSweepState struct; spacing in the
COMPRESSION_SWEEP_DIR_COMPRESS macro; one-line robj factory
signature; switch-case alignment; extra space on lfu_threshold
config-init line.
build-32bit test failures: 4 of 6 tests in test_compression_sweep
flaked on the slower 32-bit CI runner. Two distinct issues:
1. driveAndDrain's post-completion drain budget was 5 polls × 500
µs = 2.5 ms — enough on amd64 (workers fast) but not on
32-bit. Replaced with 'drain until 10 consecutive empty
polls' (~5 ms quiescence window) bounded by the existing
deadline. Affects CompressDecompressRoundTrip,
SparseDbIteratesPastNullSlots, MasterSwitchOffMakesCronNoOp.
2. SingleFlightRejectsConcurrentRequest assumed 500 keys with
pct=1% / hz=10 (1 ms/tick budget) was tight enough to leave
the sweep in SCANNING after 1 tick. On 32-bit the first
tick still finished the entire scan. Tighten the budget by
also setting hz=500 (20 µs/tick), which forces the sweep
to take many ticks regardless of CPU class. Restore
hz=10 + pct=100 before final drain.
Rewrites R2.1.3 / R2.1.4 to drop the asymmetric auto-trigger on `compression-enabled` toggles. Existing in-memory state is now preserved across both directions of the toggle; conversion is operator-driven via explicit `COMPRESSION SWEEP direction=…` commands. Adds R2.1.6 with the rationale and the precedent table — every existing 'stop doing X' master switch in Valkey (appendonly, rdbcompression, notify-keyspace-events, replica-read-only, lazyfree-lazy-*, maxmemory-policy, protected-mode) preserves state on toggle and only governs future behavior. Compression is the only one that previously violated this; the asymmetric design also created an irreversible-without- extra-work fat-finger trap (CONFIG SET compression-enabled yes auto-runs a sweep that toggling back doesn't undo). Extends R2.5.7 with the two-mode drain in `compressionBeforeSleep`: default 'restore' mode preserves existing compressed state across reads (consistent with R2.1.6); 'permanent-decompress' mode fires when an operator-initiated decompress sweep is in flight, letting the iteration drain cooperate with the sweep instead of fighting it. The mode is keyed on `compressionSweepIsScanning() && compressionSweepCurrentDirection() == DECOMPRESS` — the operator's expressed intent — not on the master-switch state (which is ambiguous between 'temporary disable' and 'permanent disable'). Adds new §3.4 with the authoritative sweep state-machine table covering all (state × trigger × master-switch) cells. R2.1.3, R2.1.4, R2.1.6, R2.5.7, R2.9.2, and §6.6 all read against this table. Adds an Appendix D entry for v2 opt-in `compression-auto-sweep-on-enable` / `compression-auto-sweep-on-disable` — preserves the affordance for operators who genuinely want eager retroactive conversion, without baking it into the v1 contract. Refines via S2.9 review (PR #26).
…2.5.7
Implements the P1 semantics from the design doc revision in the
preceding commit. Five surgical changes plus a side bug fix:
1. Drop `applyCompressionEnabled` apply hook (config.c). The bool
master switch is now a plain config flag with no side effects on
toggle. The auto-sweep affordance is moved to an opt-in v2 config
per Appendix D. Matches the `rdbcompression` config pattern.
2. Direction-conditional master-switch guard in `compressionSweep`
command handler (compression.c). `direction=COMPRESS` requires
the master switch on (otherwise rejected with a clear message
pointing the operator at the only allowed direction).
`direction=DECOMPRESS` is always allowed — it's the canonical
drain path R2.1.4 calls out and operators MUST be able to invoke
it after disabling.
3. Lift the unconditional master-switch early-return from
`compressionSweepCron` (compression_sweep.c). Replace with
state-machine-table semantics (§3.4):
- IDLE.requested(COMPRESS) + master_off → defensive clear (the
command handler should already have rejected this; reaching
cron means a future caller bypassed it)
- SCANNING(COMPRESS) + master_off observed at cron entry →
abort, log NOTICE, return to IDLE. Honors R2.1.4's 'new writes
stop being compressed' for background work too.
- SCANNING(DECOMPRESS) + master_off → continue scanning. The
drain path is independent of the master switch.
Worker-pool jobs already enqueued by sweepScanCallback before the
abort still complete and install — workers don't observe the
master switch (R2.4 / R2.11.4); the sweep state machine is the
authoritative lever.
4. Add `compressionSweepCurrentDirection()` helper
(compression_sweep.h + .c). Returns the in-flight sweep direction
when SCANNING, 0 otherwise. Single int read, no atomics.
Consumed by the two-mode drain in compressionBeforeSleep (#5).
5. Two-mode drain in `compressionBeforeSleep` (compression.c). Mode
is selected once per invocation:
- 'restore' (default): pointer-swap each side-map entry back
to compressed (current behavior; R2.1.6 'master switch
governs future behavior, existing state preserved').
- 'permanent-decompress': fires when
`compressionSweepIsScanning() && direction == DECOMPRESS`.
Each side-map entry is finalized as RAW (compressed buffer
released via the new releaseCompressedBuffer helper; temp
sds stays installed). Cooperates with the operator-initiated
drain.
6. Side improvement: extracted `releaseCompressedBuffer` helper
that decodes the per-value header, decRef's the dict frame-ref,
reverses the compression accounting, and frees the buffer. Used
by:
- compressionPermanentlyDecompress (refactored to use it)
- compressionBeforeSleep permanent-decompress mode (new)
- discardTransientEntry (BUG FIX — was previously `zfree`ing
the buffer without decRef'ing the dict or reversing
accounting; the dict-frame-ref leaked and
compression_total_*_bytes drifted whenever the side-map
discarded an orphaned entry)
Tests:
- replaced `MasterSwitchOffMakesCronNoOp` (which asserted the
pre-fix asymmetric behavior) with five P1-aligned tests:
• NoAutoTriggerOnEnable
• CompressRequestClearedAtCronWhenDisabled
• DecompressSweepRunsWhenDisabled
• InFlightCompressSweepAbortsOnDisable
• InFlightDecompressSweepContinuesOnDisable
- added BeforeSleepCooperatesWithDecompressSweep covering the
two-mode drain end-to-end (compress, materialize transient view,
tight-pace decompress sweep, beforeSleep, verify val_ptr is the
temp sds and NOT swapped back to the compressed buffer)
- added SweepCurrentDirectionReportsState exercising the new helper
Verified locally: server build clean (BUILD_ZSTD=yes default and
BUILD_ZSTD=no); Tcl unit/type/compression 11/11 pass; runtime smoke
of all five expected behaviors against a live server (toggle+sweep
matrix). Gtest unit tests not runnable locally (libgtest-dev not
installed); CI validates.
ASan-detected leak: compressionWorkersStop's leftover-drain path (inbox + outbox) was zfree'ing job structs and dst buffers without calling decrRefCount(job->value) on the caller's pin. Existing comment marked this as 'accepted at shutdown — process is exiting', which holds for production but not for the gtest binary that runs many tests in one process. The newly-added InFlightCompressSweepAbortsOnDisable test exposed the leak by triggering the abort path mid-sweep, leaving in-flight jobs unconsumed at TearDown. Fix: call decrRefCount(job->value) in both leftover-drain loops (inbox + outbox) when value != NULL. The NULL case covers the test-only raw-enqueue path (testOnlyCompressionWorkersEnqueueRaw). Also: harden InFlightCompressSweepAbortsOnDisable to drain the worker pool to quiescence after the abort instead of relying on Stop-time cleanup. Uses driveAndDrain() — sweep is already IDLE post-abort so phase-1 (cron-driven) exits immediately; phase-2 (wait-for-quiescence) catches in-flight jobs as workers finish. This makes the test's state assertion reflect a fully drained state, not the partial state at the moment of the abort.
ikolomi
commented
Jun 10, 2026
| * R2.1.4) for every visited string value. | ||
| * | ||
| * Triggers (any of): | ||
| * - master-switch transition `compression-enabled no → yes` |
Owner
Author
There was a problem hiding this comment.
isnt it outdated? does master-swich transition indeed triggers sweeper by design?
Three header/inline comments still described the pre-P1 (asymmetric) sweep behavior: - compression_sweep.h file docstring: claimed master-switch no→yes transition auto-triggers a compress sweep. Removed per R2.1.6. Rewrote the docstring to describe the v1 behavior: triggered exclusively by the COMPRESSION SWEEP command, with master-switch toggles preserving in-memory state. Added a pointer to the §3.4 state-machine table for the authoritative per-cell behavior. Clarified the layered single-flight model (low-level compressionSweepRequest silently drops; command handler renders the friendly error message and the direction-conditional master-switch guard). Documented why v1 has no abort affordance for SCANNING(DECOMPRESS) (idempotent). - compression_sweep.h compressionSweepCron description: said 'No-op when the master switch is off OR no sweep is requested OR a previous sweep is already scanning'. The first clause is the bug we just fixed; replaced with the §3.4 case-by-case summary. - compression_sweep.h compressionSweepIsScanning description: was marked 'Test-only' but production code (the transient-view drain in compressionBeforeSleep) now calls it as the mode selector for restore vs. permanent-decompress. Re-described as a public introspection helper used by both production and tests. - compression_sweep.h compressionSweepRequest description: added an explicit note that this API doesn't enforce master-switch / direction constraints — those live one layer up in the command handler. Internal callers must guard before reaching here. - compression.c compressionCron inline comment: said sweep is 'operator-triggered via COMPRESSION SWEEP, or auto-triggered on the master-switch no→yes transition'. Removed the auto-trigger half. - compression_workers.h compressionWorkersStart docstring: referenced the 'enable without auto-sweep' pattern that R2.1.3 used to suggest (set compression-threads 0 before flipping the master switch). With P1 there's no auto-sweep on enable, so the workaround is moot. Reworded to describe the surviving use case (pause new compression without disabling the feature). No code change; comment-only fixes. Build clean.
Two real behavior bugs in the sweep state machine, surfaced by reading the header docstrings I rewrote in 0c74747 against the compression_sweep.c implementation: 1. Double-log on abort. compressionSweepCron's abort path emits serverLog(NOTICE, 'Compression sweep aborted: master switch ...') and THEN calls enterIdle, which itself unconditionally logs serverLog(NOTICE, 'Compression sweep: completed ...'). Operators would see two NOTICE-level lines for one event, the second one incorrectly claiming completion of a sweep that was actually aborted. Fix: enterIdle takes an 'aborted' flag; suppress the completion log on the abort path. The abort caller's own log is the authoritative line. 2. Same-direction in-flight request rejected. The §3.4 state-machine table specifies SCANNING(D) + request(D) (same direction) → SCANNING(D), +OK reply (idempotent — the in-flight sweep already covers the requested intent). The previous implementation rejected ALL second requests via single-flight at the compressionSweepRequest level, and the command handler emitted the friendly error for both the different-direction and the same-direction case. Fix: the command handler now detects the same-direction case BEFORE calling compressionSweepRequest, via compressionSweepIsScanning() && compressionSweepCurrentDirection() == direction, and replies +OK without disturbing the in-flight sweep. compressionSweepRequest stays simple (rejects all second requests at the low level); the layered model is documented in both the header and the inline doc. Plus three doc updates to keep the inline comments aligned with the new header (compression_sweep.h was rewritten in 0c74747): 3. compression_sweep.c file-level docstring rewritten to mention the layered single-flight model (compressionSweepRequest is the low-level state-machine entry; the COMPRESSION SWEEP command handler adds master-switch + idempotency + error rendering on top), plus the abort path (cron observes master switch directly to abort an in-flight COMPRESS sweep on yes→no). 4. compressionSweepRequest inline doc mirrors the header's note that this API does NOT enforce master-switch / direction constraints — those live one layer up in the command handler. Test: SingleFlightRejectsConcurrentRequest extended with the same- direction case — verifies that compressionSweepRequest still returns 0 at the low level (single-flight regardless of direction) AND that the helper-based predicate (compressionSweepIsScanning() && compressionSweepCurrentDirection() == direction) the handler uses resolves correctly. The handler itself is exercised by smoke tests against a running server. Verified: BUILD_ZSTD=yes default build clean; runtest unit/type/compression all green.
R2.1.3 incorrectly described the no-active-dict path as 'training
is triggered (R2.3.5) and writes are buffered until a dict is
promoted'. Two errors:
1. There is no buffer. compressionEnqueueCandidate's active-dict
guard returns early when compressionRegistryActive() is NULL;
nothing is queued anywhere. The value just stays RAW in the
kvstore. There is no in-memory or in-pool buffer that holds
writes pending dict promotion.
2. Training is NOT triggered by the master-switch transition.
Training has its own R2.3.5 triggers (DB total-keys count
threshold, drift detection, manual COMPRESSION TRAIN). The
master-switch transition is independent — it just stops
short-circuiting the write-path candidate enqueue. If the
eligibility-key threshold has already been met before the
toggle, training was already triggered; if not, the master
switch turning on doesn't change that.
Rewrote R2.1.3 to describe the actual behavior:
- Eligibility predicate starts returning true.
- signalModifiedKey routes qualifying values to the candidate
path on every write.
- Candidate path no-ops at the active-dict guard when no dict
exists — value stays RAW.
- When a dict is eventually promoted, FUTURE writes get
compressed; values written during the no-dict window stay
RAW unless they're written again.
- Existing values are never retroactively compressed; operator
runs explicit sweep when they want the conversion.
Also expanded the operator-facing 'every eligible existing value'
clause to enumerate the three categories (pre-existing RAW values,
values written during the no-dict window, and values that became
RAW via in-place permanent-decompress paths) so it's clear what
the explicit compress sweep covers.
This was referenced Jun 11, 2026
ikolomi
added a commit
that referenced
this pull request
Jun 14, 2026
…model
Replace the bool master switch (compression-enabled yes|no) with the
3-state enum (compression-master-switch: compression / decompression
/ off, default off) plus a 2-state sweeper enum
(compression-sweeper: enabled / disabled, default disabled) plus a
sweeper-interval knob (compression-sweeper-interval, default 0) plus
a manual force command (COMPRESSION SWEEP FORCE).
The operator declares the desired DB state; the sweeper drives the
keyspace toward it. This matches Valkey's prevailing declarative
config style (maxmemory + maxmemory-policy, save ..., appendonly)
and makes the three mechanics orthogonal:
1. Master switch: governs future-write eligibility AND the
transient view's drain mode.
2. Sweeper: drives the keyspace toward the declared state.
3. Transient view (R2.5.7, updated separately): drain mode is
gated on master-switch state.
R2.1.1 \u2014 master-switch enum semantics. The 3 values + what each
means for writes / reads / dict lifecycle.
R2.1.2 \u2014 sweeper enum semantics. enabled drives toward target
state; disabled means no automatic work; with master=off the
sweeper has no direction and idles. Decompression sweeps run
on the main thread (no worker pool).
R2.1.3 \u2014 sweeper-interval semantics. 0 = single pass after
master-switch direction change, then idle. Positive N = sleep
N seconds between passes. Default 0 minimizes CPU on idle
keyspaces; operators with high churn opt in.
R2.1.4 \u2014 COMPRESSION SWEEP FORCE command. One-shot pass
regardless of sweeper config and interval. Uses master switch's
current direction; rejected if master=off. Allowed even with
sweeper=disabled (the operator's catch-up affordance).
R2.1.5 \u2014 master-switch transitions including auto-retire of
the active dict on transitions to decompression or off. The dict
cannot accept new frame references in the new state, so retirement
is safe (workers are protected by QSBR). With
master=decompression + sweeper=enabled running to completion,
the dict's memory is auto-reclaimed without operator
intervention.
R2.1.6 \u2014 configuration validation. All combinations allowed
(operator owns controls + Valkey reports). Two non-functional
combinations get one-time LL_WARNING logs:
- master=off + sweeper=enabled (sweeper idles, documented)
- master=compression + threads=0 (write enqueues + sweep
enqueues both drop; sweeper skips passes with rate-limited
warning)
R2.1.7 \u2014 third state preserved (master=compression but no
active dictionary). Same content as the prior R2.1.5; renumbered.
Replaces the 5 R2.1.x requirements that were on unstable. Commit
touches detailed-design.md only; code changes in PR-C.
Plan reference: PR-B in the orthogonal-mechanics design rewrite.
PR #26's R2.1.6 (master-switch precedent table) and §3.4 (sweep
state machine table) are not part of unstable today and are not
touched here.
ikolomi
added a commit
that referenced
this pull request
Jun 14, 2026
…itch + sweeper model (#28) * design(§2.1): rewrite master-switch + sweeper surface as declarative model Replace the bool master switch (compression-enabled yes|no) with the 3-state enum (compression-master-switch: compression / decompression / off, default off) plus a 2-state sweeper enum (compression-sweeper: enabled / disabled, default disabled) plus a sweeper-interval knob (compression-sweeper-interval, default 0) plus a manual force command (COMPRESSION SWEEP FORCE). The operator declares the desired DB state; the sweeper drives the keyspace toward it. This matches Valkey's prevailing declarative config style (maxmemory + maxmemory-policy, save ..., appendonly) and makes the three mechanics orthogonal: 1. Master switch: governs future-write eligibility AND the transient view's drain mode. 2. Sweeper: drives the keyspace toward the declared state. 3. Transient view (R2.5.7, updated separately): drain mode is gated on master-switch state. R2.1.1 \u2014 master-switch enum semantics. The 3 values + what each means for writes / reads / dict lifecycle. R2.1.2 \u2014 sweeper enum semantics. enabled drives toward target state; disabled means no automatic work; with master=off the sweeper has no direction and idles. Decompression sweeps run on the main thread (no worker pool). R2.1.3 \u2014 sweeper-interval semantics. 0 = single pass after master-switch direction change, then idle. Positive N = sleep N seconds between passes. Default 0 minimizes CPU on idle keyspaces; operators with high churn opt in. R2.1.4 \u2014 COMPRESSION SWEEP FORCE command. One-shot pass regardless of sweeper config and interval. Uses master switch's current direction; rejected if master=off. Allowed even with sweeper=disabled (the operator's catch-up affordance). R2.1.5 \u2014 master-switch transitions including auto-retire of the active dict on transitions to decompression or off. The dict cannot accept new frame references in the new state, so retirement is safe (workers are protected by QSBR). With master=decompression + sweeper=enabled running to completion, the dict's memory is auto-reclaimed without operator intervention. R2.1.6 \u2014 configuration validation. All combinations allowed (operator owns controls + Valkey reports). Two non-functional combinations get one-time LL_WARNING logs: - master=off + sweeper=enabled (sweeper idles, documented) - master=compression + threads=0 (write enqueues + sweep enqueues both drop; sweeper skips passes with rate-limited warning) R2.1.7 \u2014 third state preserved (master=compression but no active dictionary). Same content as the prior R2.1.5; renumbered. Replaces the 5 R2.1.x requirements that were on unstable. Commit touches detailed-design.md only; code changes in PR-C. Plan reference: PR-B in the orthogonal-mechanics design rewrite. PR #26's R2.1.6 (master-switch precedent table) and §3.4 (sweep state machine table) are not part of unstable today and are not touched here. * design(§2.5): transient view drain mode gated on master switch state R2.5.7 previously described a single-mode drain: beforeSleep always restores the compressed form. With the new declarative model (compression-master-switch: compression / decompression / off), the drain mode must follow the master switch: - master = compression or off: restore mode (current behavior; compressed form preserved across the iteration boundary). - master = decompression: permanent-decompress mode (compressed buffer released; temp sds installed as the value's val_ptr; encoding stays RAW). The permanent-decompress branch exists because in decompression mode the operator's intent is "drain the DB". A transient view that restores the compressed form on every read-touched key would oscillate the same key between compressed and RAW every iteration — defeating the drain. This is the ONLY cross-mechanism dependency between master switch and transient view (per R2.1's "three orthogonal mechanics" framing — mechanics 1 and 3 share this one coupling, by design). The section calls this out explicitly so future readers know it's deliberate, not accidental. Both modes share the underlying release-and-decRef cleanup (releaseCompressedBuffer for the compressed buffer; same discardTransientEntry helper); only the restore-vs-leave-as-RAW step differs. Also updated R2.5.6 to reference the new master-switch + sweeper combination instead of "COMPRESSION SWEEP direction=decompress" as the operator's drain affordance. * design(§2.10, §2.12): INFO fields + config table for declarative model §2.10.1 INFO fields: - Renamed compression_enabled → compression_master_switch (enum, reports compression / decompression / off per R2.1.1). - Added compression_sweeper (enum, reports enabled / disabled per R2.1.2). - Added compression_sweeper_interval (seconds, reports the R2.1.3 setting). - Added compression_sweeper_state (enum, reports the runtime sweeper engine state: idle / scanning / sleeping / disabled). Distinct from compression_sweeper because the configured switch and the runtime engine state aren't always the same — e.g. with compression-sweeper=enabled but master=off, the engine is idle (no direction); with sweeper=enabled + master=compression actively running, the engine is scanning. - Existing fields (compression_state, dict counters, sweep back-pressure counters, ratio counters) all preserved unchanged. §2.12 config table: - Primary section grew from 5 → 6 knobs: • compression-master-switch (enum, default off) — replaces the former compression-enabled bool entry • compression-sweeper (enum, default disabled) — new • compression-threads kept; description annotated to flag "only relevant in master=compression" per R2.1.6 • size bounds and dict-size unchanged - Advanced section grew from 11 → 12: • compression-sweeper-interval (seconds, default 0) — new • all other advanced knobs unchanged Together these surfaces give the operator the declarative-model view: declare master state, optionally run the sweeper to converge, observe progress via the runtime sweeper_state and the back-pressure counters. * design(§4.5 + §1, §2, §7 sweep): COMPRESSION command surface + cleanup §4.5 COMPRESSION subcommand container: - Removed COMPRESSION ENABLE / COMPRESSION DISABLE rows. The new 3-state master switch (compression / decompression / off) doesn't map cleanly to enable/disable verbs; operators use CONFIG SET directly. Documented this rationale in the section text plus a reference to the precedent (maxmemory-policy, appendfsync are other tri-state Valkey configs without verb aliases). - Replaced "COMPRESSION SWEEP [direction=compress|decompress]" with "COMPRESSION SWEEP FORCE" (R2.1.4): direction is implicit from the master switch's current state; the FORCE keyword indicates it bypasses both the sweeper-enabled config and the periodic timer to do an immediate one-shot pass. - Kept all other rows unchanged (TRAIN, STATUS, DICT LIST/EXPORT/IMPORT/DROP, HELP). §1.3, §2.6, §7 cleanup of stale compression-enabled / SWEEP refs: - §1.3 Goals: "single master switch (compression-master-switch) and zero fixed cost when set to off". - R2.6.2 / R2.6.3 / R2.6.4 / R2.6.8: master-switch enum vocabulary. R2.6.2 covers "master ∈ {compression, off}" (load-keep-compressed path). R2.6.3 covers "master == decompression" (load-and-drain path). R2.6.4 references the master-switch config as the reject-on-corruption gate. R2.6.8 (full-sync RDB always uncompressed) phrased against the new enum. - R2.3.3 (dict-cap unblock paths): now references master-switch + sweeper for drain, and DICT DROP for explicit retirement. - §4.4 step 7 (cap interaction): same updates. - §2.10 R2.10.4 back-pressure remediation table: "Manual COMPRESSION SWEEP" → "Manual COMPRESSION SWEEP FORCE". - §7.1 transparency-mode test config: --compression-master-switch compression + --compression-sweeper enabled (replaces the --compression-enabled yes line). - §7.2 persistence test description: "load with compression-master-switch decompression". - §7.3 perf scenario: compares "off" vs "compression". - §7.5 benchmark driver: same. After these changes the doc has zero remaining references to compression-enabled or to bare COMPRESSION SWEEP (without FORCE). The legacy command names appear only in the §4.5 sentence that explicitly documents their absence in v1. * design(§2.1): R2.1.5 retire policy + rename compression-sweeper → compression-active-sweeper Two corrections from PR-B review: 1. R2.1.5 retire policy. Original PR-B text auto-retired the active dict on both 'compression → decompression' AND 'compression → off'. The latter contradicts the 'off freezes the DB compression state' semantic — retiring the dict starts it draining toward eventual free, which is not 'frozen'. Corrected to retire ONLY on transitions INTO 'decompression' (from 'compression' or from 'off'). New rule: retire the active dict on any transition INTO 'decompression'. All other transitions leave the registry untouched. Specifically: - compression → decompression: retire active dict (drain mode) - off → decompression: retire active dict if there happens to be one (drain mode) - compression → off: NO retire. State frozen. Operator can manually reclaim via COMPRESSION DICT DROP <id> if desired. - decompression → off: no action; active was already retired on the prior transition. - off → compression: no retire; reuse existing active dict if any (no retraining). - decompression → compression: no retire; system enters R2.1.7 third state until a new dict is trained. Effect: master=off truly freezes the registry; the dict stays ACTIVE and ready to resume compression if the operator flips back. Memory cost is bounded (~200 KB raw + digested) and operator-recoverable. 2. Rename compression-sweeper → compression-active-sweeper. Picks up Valkey's existing 'active-X' prefix convention for background-engine configs (matches `activedefrag`, `active-expire-effort`). Operator reading `compression-active-sweeper enabled` immediately recognizes this as the same family of automated background workers. Affected names: - compression-sweeper → compression-active-sweeper - compression-sweeper-interval → compression-active-sweeper-interval - compression_sweeper → compression_active_sweeper (INFO) - compression_sweeper_interval → compression_active_sweeper_interval - compression_sweeper_state → compression_active_sweeper_state Updated across §2.1, §2.3, §2.5, §2.10, §2.12, §3, §4.4, §4.5, §7.1 — every config-name and INFO-field reference. The English word 'sweeper' is preserved in prose where it refers to the engine concept (not the config-name identifier). * docs: update other planning docs to align with declarative model Three companion docs updated alongside detailed-design.md (which landed earlier in this PR): idea-honing.md: - Added a top-level Status note (2026-06-14) pointing readers to detailed-design.md as the source of truth, with explicit callouts for Q5 (master switch model — superseded by the declarative 3-state model) and Q5's operator-surface corollary (COMPRESSION ENABLE/DISABLE and COMPRESSION SWEEP direction=... are NOT part of v1). - Q1–Q4, Q6–Q16 are preserved as the historical walkthrough audit trail. Their decisions still hold under the current design (the inline "superseded by review" annotations from the original walkthrough remain). - Q3's body uses old-vocabulary 'compression-enabled no' but the decision (transparently decompress on load when the feature is off) is still in force; preserved as historical Q&A. The header note covers the vocabulary shift. summary.md: - Activation row: "Opt-in via compression-enabled yes" replaced with the new 3-state declarative model description. - PR #15 read-hot-gap historical bullet: operator-recovery affordance updated from "COMPRESSION SWEEP direction=decompress" to the current model: "CONFIG SET compression-master-switch decompression + compression-active-sweeper enabled". - All other historical bullets preserved unchanged. implementation/plan.md: - S2.9 description rewritten as "Master switch + sweeper (declarative model)" — describes the full declarative-model implementation (3-state master-switch enum + active-sweeper enum + interval + auto-retire on transitions into decompression). - The §4.5-style command-tree mention rewritten: COMPRESSION SWEEP [ASYNC] → COMPRESSION SWEEP FORCE. - S3.3 mention updated: compression-enabled → compression-master- switch. - Phase 1 gate criterion updated: compression-enabled yes → compression-master-switch compression + compression-active- sweeper enabled. detailed-design.md is the source of truth; these companion docs now describe the same model consistently. DESIGN_TODO.md is not edited — its 31 walkthrough threads pre-date the master-switch redesign and their resolutions still hold (e.g., R2.6.8 full-sync RDB uncompressed, the eligibility-predicate revisions, the COW-invariant audit, the §7.5 benchmark suite).
This was referenced Jun 14, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Lands the keyspace sweep state machine (S2.9) and its cron wiring (S2.10) — the v1 mechanism for retroactively converting the keyspace between compressed and uncompressed forms. Operator-driven via a new
COMPRESSION SWEEP [direction=compress|decompress]command.PR scope grew during review: in addition to the original sweep code, this branch carries a design-level correction (R2.1.6 added; R2.1.3 / R2.1.4 / R2.5.7 rewritten) that aligns
compression-enabledsemantics with the rest of Valkey's "stop doing X" master switches, plus two bug fixes uncovered along the way (discardTransientEntrywas leaking dict frame-refs;compressionWorkersStopwas leaking pinned values at shutdown).Master-switch semantics — P1 (R2.1.6)
The biggest design change in this PR is to drop the asymmetric auto-trigger that earlier drafts of R2.1.3 promised. Master-switch toggles now preserve in-memory state on both directions; conversion is operator-driven via explicit
COMPRESSION SWEEPcommands:no → yes(R2.1.3)yes → no(R2.1.4)COMPRESSION SWEEP direction=compressCOMPRESSION SWEEP direction=decompressThis matches the precedent of every other "stop doing X" toggle in the codebase (
appendonly,rdbcompression,notify-keyspace-events,replica-read-only,lazyfree-lazy-*,maxmemory-policy,protected-mode) — all of which preserve existing state on toggle. The auto-trigger affordance is moved to Appendix D as opt-in v2 configs (compression-auto-sweep-on-enable/compression-auto-sweep-on-disable), preserving the option for operators who want eager retroactive conversion without baking it into v1.The state machine itself is documented authoritatively in
detailed-design.md§3.4. Two states (IDLE/SCANNING) plus a direction property; per-cell transitions for every (state × trigger × master-switch) combination spelled out:IDLErequest(COMPRESS)from cmdIDLE.requested(COMPRESS)+OKreplyIDLErequest(COMPRESS)from cmdIDLEIDLErequest(DECOMPRESS)from cmdIDLE.requested(DECOMPRESS)+OKreplyIDLE.requested(D)IDLErequestedIDLE.requested(D)SCANNING(D)SCANNING(D)request(D)(same direction)SCANNING(D)+OKreply (idempotent)SCANNING(D1)request(D2), D1 ≠ D2SCANNING(D1)SCANNING(COMPRESS)SCANNING(COMPRESS)advanceScan(); on completion →IDLESCANNING(COMPRESS)IDLESCANNING(DECOMPRESS)SCANNING(DECOMPRESS)advanceScan()— runs regardless of master switchArchitecture
Mirrors @GilboaAWS's
compressionTrainCronpattern from PR #20: monotime-budgeted state machine spliced across cron ticks with a persistent(DB index, kvstore cursor)cursor. All new sweep logic lives insrc/compression_sweep.{c,h}— same separation ascompression_train.{c,h}.Pacing: per-tick wall-time budget computed from
compression-sweep-max-cpu-pctandserver.hz. Default 25% / hz=10 → 25 ms per tick. 10 µs floor.Two-mode transient-view drain (R2.5.7 amend)
compressionBeforeSleepnow selects between two modes per invocation based on whether an explicit decompress sweep is in flight:compressionSweepIsScanning() && compressionSweepCurrentDirection() == DECOMPRESS. Each side-map entry is finalized as RAW (compressed buffer released; temp sds stays installed). The iteration drain cooperates with the operator-initiated drain instead of fighting it — every read of a still-compressed key during the sweep saves the sweep one item of work.Mode is keyed on the in-flight sweep direction (operator's expressed intent), not on the master-switch state (ambiguous between "temporary disable" and "permanent disable").
New helper exposed:
compressionSweepCurrentDirection()— returnsCOMPRESSION_SWEEP_DIR_COMPRESS/COMPRESSION_SWEEP_DIR_DECOMPRESSwhile SCANNING,0while IDLE.Bug fixes (side effects of the refactor)
discardTransientEntrywaszfree'ing the saved compressed buffer without decrementing the dict frame-ref or reversing accountingcompression.ccompression_total_*_bytesdrift whenever the side-map drain hit the orphan branch (mutation/overwrite/expire while a transient view existed)releaseCompressedBufferhelper that does the full release path; routeddiscardTransientEntrythrough it.compressionPermanentlyDecompressalso refactored onto the helper.compressionWorkersStopleftover-drain waszfree'ing job structs without dropping the caller's pin (incrRefCount(job->value))compression_workers.cdecrRefCount(job->value)in both inbox and outbox leftover-drain loopsNeither bug was reachable on
unstablebefore this PR — both were latent and exposed only by the new tests. Ship-as-fixes-with-the-same-PR rather than carve out follow-ups.R2.x compliance
compressionEnqueueCandidatewhich holdsincrRefCount(value)for the worker (existing PR #19 contract)compressionBeforeSleepsignalModifiedKeyfrom compression pathsserver.dbNULL-guard (PR #25)advanceScan— sameif (db == NULL) continue;patterncompressionSweepCronreads against it line-for-lineFiles
src/compression_sweep.{c,h}src/commands/compression-sweep.jsonsrc/commands.defCOMPRESSION SWEEPsrc/compression.ccompressionSweepcommand handler (direction-conditional master-switch guard);compressionBeforeSleeptwo-mode drain;releaseCompressedBufferhelper;compressionCroncallscompressionSweepCron;discardTransientEntrybug fixsrc/compression_workers.ccompressionWorkersStopleak fixsrc/compression_workers.hsrc/config.capplyCompressionEnabledapply hook (auto-trigger removed); plain bool config registrationcmake/Modules/SourceFiles.cmake,src/Makefile.agents/planning/realtime-data-compression/design/detailed-design.md.agents/planning/realtime-data-compression/implementation/plan.mdsrc/unit/test_compression_sweep.cppTests
src/unit/test_compression_sweep.cpp— 11 cases:CompressDecompressRoundTrip: 100 eligible + 100 ineligible string values; compress sweep flips eligible to COMPRESSED; decompress sweep flips them all back; counters drain to 0.EmptyKeyspaceCompletes,NoActiveDictGracefulNoOp,SparseDbIteratesPastNullSlots(sparseserver.dbwith NULL slots — same regression class as PR fix(S1.2): NULL-guard server.db[j] iteration in compressionTrainCron #25 fixed incompressionTrainCron).SingleFlightRejectsConcurrentRequest: tight pacing forces multi-tick scan; second different-direction request rejected.SweepCurrentDirectionReportsState: validatescompressionSweepCurrentDirection().NoAutoTriggerOnEnable— toggling on does NOT compress existing values; explicit sweep does.CompressRequestClearedAtCronWhenDisabled— defensive: queue a COMPRESS request, then disable, cron clears request without entering SCANNING.DecompressSweepRunsWhenDisabled— operator's drain workflow after disable.InFlightCompressSweepAbortsOnDisable— toggle off mid-sweep aborts; state returns to IDLE.InFlightDecompressSweepContinuesOnDisable— toggle off mid-decompress-sweep continues to completion (drain is independent of master switch).BeforeSleepCooperatesWithDecompressSweep: materialize a transient view, request a paced decompress sweep, callcompressionBeforeSleep, verify the materialized robj is permanently RAW (val_ptr unchanged from materialize, NOT swapped back to compressed).Verified locally
make -j2 -C src SERVER_CFLAGS=-Werrorclean withBUILD_ZSTD=yes(default) andBUILD_ZSTD=no../runtest --single unit/type/compression— all green (including PR fix(S1.2): NULL-guard server.db[j] iteration in compressionTrainCron #25's regression test).CONFIG SET compression-enabled yesdoes NOT auto-trigger a sweep.COMPRESSION SWEEP direction=compressaccepted while enabled.COMPRESSION SWEEP direction=compressrejected while disabled (ERR compression is disabled; only direction=decompress is permitted).COMPRESSION SWEEP direction=decompressaccepted regardless of master switch.started/completedevents with direction; no spurious activity on toggle.Gtest unit tests not runnable locally (
libgtest-devnot installed); CI validates.Out of scope
mutexQueueinbox is unbounded.used_memorygrowth. v2 may addcompression_decompress_paused_oomINFO field plus a backoff cycle.Commits (9)
123ec4473d422ad87657d8844b9107271d4806a70d4e43386bc039releaseCompressedBufferhelper +discardTransientEntrybug fix + 7 new tests4d9180c0a46a470eeecompressionWorkersStopdrain (ASan leak) + harden in-flight-abort test to drain workers post-abort0c74747e2compression_sweep.h,compression_workers.h,compression.ccomments) to match P1 semanticsHistory can be left as-is for the audit trail or squashed before merge — your call.