docs(design): R2.5.6 — read-hot compressed-value gap acknowledgment#15
Merged
Conversation
…edgment
Surfaced during the Valkey-committee design walkthrough: the v1
eligibility filter (R2.2) gates compression decisions at
compression-attempt time but provides no inverse path. A key that was
cold when compressed and later becomes read-hot pays sustained sync
decompression CPU on every read.
Adds R2.5.6 to §2.5 documenting the gap explicitly:
- States the asymmetry plainly so operators know what to expect.
- Notes the existing pressure-relief for write-touched keys
(dbOverwrite + partial-write COW per R2.4.5/R2.7.6).
- Names the existing v1 mitigations: bounded per-read cost via
compression-max-value-size, observability via LATENCY HISTORY
decompress-sync, and operator-driven recovery via
COMPRESSION SWEEP direction=decompress.
- Defers auto-demotion to v2 with the rationale that any heuristic
would trade memory for read latency in a way that's better
decided by the operator (or by measurement-driven v2 work).
Adds matching Appendix D row "Read-hot compressed value auto-demotion"
positioned next to other reactive-runtime-behaviour entries (kill-
switch). Names the v2 mechanism (sweeper-based with hysteresis) and
the configs it would introduce (compression-promote-min-freq /
compression-promote-max-idle-seconds).
No code changes — design-doc-only.
ikolomi
added a commit
that referenced
this pull request
Jun 2, 2026
Reconcile the planning docs with the implementation-phase changes
that landed after the original design walkthrough.
- idea-honing.md Q9: add a "Superseded during implementation"
annotation pointing to the new training-knob model from PR #14
(compression-dict-min-training-keys / -max-training-keys /
compression-training-buffer-size). The original walkthrough text
is preserved below the annotation for historical record.
- detailed-design.md §7.1 transparency-mode example: replace
--compression-dict-first-training-keys-count (removed in PR #14)
with --compression-dict-min-training-keys.
- detailed-design.md §2.12 heading: "Advanced knobs (12)" → "(11)".
Off-by-one count introduced when PR #14 reshuffled the table.
- summary.md: add a new "What changed during implementation"
section between the walkthrough section and the plan summary,
capturing the three substantive post-walkthrough refinements
(PR #14 training rewrite, PR #15 R2.5.6 read-hot gap, PR #13
QSBR plumbing).
Doc-only; no code changes.
ikolomi
added a commit
that referenced
this pull request
Jun 14, 2026
Three companion docs updated alongside detailed-design.md (which
landed earlier in this PR):
idea-honing.md:
- Added a top-level Status note (2026-06-14) pointing readers to
detailed-design.md as the source of truth, with explicit
callouts for Q5 (master switch model — superseded by the
declarative 3-state model) and Q5's operator-surface corollary
(COMPRESSION ENABLE/DISABLE and COMPRESSION SWEEP direction=...
are NOT part of v1).
- Q1–Q4, Q6–Q16 are preserved as the historical walkthrough
audit trail. Their decisions still hold under the current
design (the inline "superseded by review" annotations from the
original walkthrough remain).
- Q3's body uses old-vocabulary 'compression-enabled no' but the
decision (transparently decompress on load when the feature is
off) is still in force; preserved as historical Q&A. The header
note covers the vocabulary shift.
summary.md:
- Activation row: "Opt-in via compression-enabled yes" replaced
with the new 3-state declarative model description.
- PR #15 read-hot-gap historical bullet: operator-recovery
affordance updated from "COMPRESSION SWEEP direction=decompress"
to the current model: "CONFIG SET compression-master-switch
decompression + compression-active-sweeper enabled".
- All other historical bullets preserved unchanged.
implementation/plan.md:
- S2.9 description rewritten as "Master switch + sweeper
(declarative model)" — describes the full declarative-model
implementation (3-state master-switch enum + active-sweeper
enum + interval + auto-retire on transitions into
decompression).
- The §4.5-style command-tree mention rewritten:
COMPRESSION SWEEP [ASYNC] → COMPRESSION SWEEP FORCE.
- S3.3 mention updated: compression-enabled → compression-master-
switch.
- Phase 1 gate criterion updated: compression-enabled yes →
compression-master-switch compression + compression-active-
sweeper enabled.
detailed-design.md is the source of truth; these companion docs
now describe the same model consistently. DESIGN_TODO.md is not
edited — its 31 walkthrough threads pre-date the master-switch
redesign and their resolutions still hold (e.g., R2.6.8 full-sync
RDB uncompressed, the eligibility-predicate revisions, the
COW-invariant audit, the §7.5 benchmark suite).
ikolomi
added a commit
that referenced
this pull request
Jun 14, 2026
…itch + sweeper model (#28) * design(§2.1): rewrite master-switch + sweeper surface as declarative model Replace the bool master switch (compression-enabled yes|no) with the 3-state enum (compression-master-switch: compression / decompression / off, default off) plus a 2-state sweeper enum (compression-sweeper: enabled / disabled, default disabled) plus a sweeper-interval knob (compression-sweeper-interval, default 0) plus a manual force command (COMPRESSION SWEEP FORCE). The operator declares the desired DB state; the sweeper drives the keyspace toward it. This matches Valkey's prevailing declarative config style (maxmemory + maxmemory-policy, save ..., appendonly) and makes the three mechanics orthogonal: 1. Master switch: governs future-write eligibility AND the transient view's drain mode. 2. Sweeper: drives the keyspace toward the declared state. 3. Transient view (R2.5.7, updated separately): drain mode is gated on master-switch state. R2.1.1 \u2014 master-switch enum semantics. The 3 values + what each means for writes / reads / dict lifecycle. R2.1.2 \u2014 sweeper enum semantics. enabled drives toward target state; disabled means no automatic work; with master=off the sweeper has no direction and idles. Decompression sweeps run on the main thread (no worker pool). R2.1.3 \u2014 sweeper-interval semantics. 0 = single pass after master-switch direction change, then idle. Positive N = sleep N seconds between passes. Default 0 minimizes CPU on idle keyspaces; operators with high churn opt in. R2.1.4 \u2014 COMPRESSION SWEEP FORCE command. One-shot pass regardless of sweeper config and interval. Uses master switch's current direction; rejected if master=off. Allowed even with sweeper=disabled (the operator's catch-up affordance). R2.1.5 \u2014 master-switch transitions including auto-retire of the active dict on transitions to decompression or off. The dict cannot accept new frame references in the new state, so retirement is safe (workers are protected by QSBR). With master=decompression + sweeper=enabled running to completion, the dict's memory is auto-reclaimed without operator intervention. R2.1.6 \u2014 configuration validation. All combinations allowed (operator owns controls + Valkey reports). Two non-functional combinations get one-time LL_WARNING logs: - master=off + sweeper=enabled (sweeper idles, documented) - master=compression + threads=0 (write enqueues + sweep enqueues both drop; sweeper skips passes with rate-limited warning) R2.1.7 \u2014 third state preserved (master=compression but no active dictionary). Same content as the prior R2.1.5; renumbered. Replaces the 5 R2.1.x requirements that were on unstable. Commit touches detailed-design.md only; code changes in PR-C. Plan reference: PR-B in the orthogonal-mechanics design rewrite. PR #26's R2.1.6 (master-switch precedent table) and §3.4 (sweep state machine table) are not part of unstable today and are not touched here. * design(§2.5): transient view drain mode gated on master switch state R2.5.7 previously described a single-mode drain: beforeSleep always restores the compressed form. With the new declarative model (compression-master-switch: compression / decompression / off), the drain mode must follow the master switch: - master = compression or off: restore mode (current behavior; compressed form preserved across the iteration boundary). - master = decompression: permanent-decompress mode (compressed buffer released; temp sds installed as the value's val_ptr; encoding stays RAW). The permanent-decompress branch exists because in decompression mode the operator's intent is "drain the DB". A transient view that restores the compressed form on every read-touched key would oscillate the same key between compressed and RAW every iteration — defeating the drain. This is the ONLY cross-mechanism dependency between master switch and transient view (per R2.1's "three orthogonal mechanics" framing — mechanics 1 and 3 share this one coupling, by design). The section calls this out explicitly so future readers know it's deliberate, not accidental. Both modes share the underlying release-and-decRef cleanup (releaseCompressedBuffer for the compressed buffer; same discardTransientEntry helper); only the restore-vs-leave-as-RAW step differs. Also updated R2.5.6 to reference the new master-switch + sweeper combination instead of "COMPRESSION SWEEP direction=decompress" as the operator's drain affordance. * design(§2.10, §2.12): INFO fields + config table for declarative model §2.10.1 INFO fields: - Renamed compression_enabled → compression_master_switch (enum, reports compression / decompression / off per R2.1.1). - Added compression_sweeper (enum, reports enabled / disabled per R2.1.2). - Added compression_sweeper_interval (seconds, reports the R2.1.3 setting). - Added compression_sweeper_state (enum, reports the runtime sweeper engine state: idle / scanning / sleeping / disabled). Distinct from compression_sweeper because the configured switch and the runtime engine state aren't always the same — e.g. with compression-sweeper=enabled but master=off, the engine is idle (no direction); with sweeper=enabled + master=compression actively running, the engine is scanning. - Existing fields (compression_state, dict counters, sweep back-pressure counters, ratio counters) all preserved unchanged. §2.12 config table: - Primary section grew from 5 → 6 knobs: • compression-master-switch (enum, default off) — replaces the former compression-enabled bool entry • compression-sweeper (enum, default disabled) — new • compression-threads kept; description annotated to flag "only relevant in master=compression" per R2.1.6 • size bounds and dict-size unchanged - Advanced section grew from 11 → 12: • compression-sweeper-interval (seconds, default 0) — new • all other advanced knobs unchanged Together these surfaces give the operator the declarative-model view: declare master state, optionally run the sweeper to converge, observe progress via the runtime sweeper_state and the back-pressure counters. * design(§4.5 + §1, §2, §7 sweep): COMPRESSION command surface + cleanup §4.5 COMPRESSION subcommand container: - Removed COMPRESSION ENABLE / COMPRESSION DISABLE rows. The new 3-state master switch (compression / decompression / off) doesn't map cleanly to enable/disable verbs; operators use CONFIG SET directly. Documented this rationale in the section text plus a reference to the precedent (maxmemory-policy, appendfsync are other tri-state Valkey configs without verb aliases). - Replaced "COMPRESSION SWEEP [direction=compress|decompress]" with "COMPRESSION SWEEP FORCE" (R2.1.4): direction is implicit from the master switch's current state; the FORCE keyword indicates it bypasses both the sweeper-enabled config and the periodic timer to do an immediate one-shot pass. - Kept all other rows unchanged (TRAIN, STATUS, DICT LIST/EXPORT/IMPORT/DROP, HELP). §1.3, §2.6, §7 cleanup of stale compression-enabled / SWEEP refs: - §1.3 Goals: "single master switch (compression-master-switch) and zero fixed cost when set to off". - R2.6.2 / R2.6.3 / R2.6.4 / R2.6.8: master-switch enum vocabulary. R2.6.2 covers "master ∈ {compression, off}" (load-keep-compressed path). R2.6.3 covers "master == decompression" (load-and-drain path). R2.6.4 references the master-switch config as the reject-on-corruption gate. R2.6.8 (full-sync RDB always uncompressed) phrased against the new enum. - R2.3.3 (dict-cap unblock paths): now references master-switch + sweeper for drain, and DICT DROP for explicit retirement. - §4.4 step 7 (cap interaction): same updates. - §2.10 R2.10.4 back-pressure remediation table: "Manual COMPRESSION SWEEP" → "Manual COMPRESSION SWEEP FORCE". - §7.1 transparency-mode test config: --compression-master-switch compression + --compression-sweeper enabled (replaces the --compression-enabled yes line). - §7.2 persistence test description: "load with compression-master-switch decompression". - §7.3 perf scenario: compares "off" vs "compression". - §7.5 benchmark driver: same. After these changes the doc has zero remaining references to compression-enabled or to bare COMPRESSION SWEEP (without FORCE). The legacy command names appear only in the §4.5 sentence that explicitly documents their absence in v1. * design(§2.1): R2.1.5 retire policy + rename compression-sweeper → compression-active-sweeper Two corrections from PR-B review: 1. R2.1.5 retire policy. Original PR-B text auto-retired the active dict on both 'compression → decompression' AND 'compression → off'. The latter contradicts the 'off freezes the DB compression state' semantic — retiring the dict starts it draining toward eventual free, which is not 'frozen'. Corrected to retire ONLY on transitions INTO 'decompression' (from 'compression' or from 'off'). New rule: retire the active dict on any transition INTO 'decompression'. All other transitions leave the registry untouched. Specifically: - compression → decompression: retire active dict (drain mode) - off → decompression: retire active dict if there happens to be one (drain mode) - compression → off: NO retire. State frozen. Operator can manually reclaim via COMPRESSION DICT DROP <id> if desired. - decompression → off: no action; active was already retired on the prior transition. - off → compression: no retire; reuse existing active dict if any (no retraining). - decompression → compression: no retire; system enters R2.1.7 third state until a new dict is trained. Effect: master=off truly freezes the registry; the dict stays ACTIVE and ready to resume compression if the operator flips back. Memory cost is bounded (~200 KB raw + digested) and operator-recoverable. 2. Rename compression-sweeper → compression-active-sweeper. Picks up Valkey's existing 'active-X' prefix convention for background-engine configs (matches `activedefrag`, `active-expire-effort`). Operator reading `compression-active-sweeper enabled` immediately recognizes this as the same family of automated background workers. Affected names: - compression-sweeper → compression-active-sweeper - compression-sweeper-interval → compression-active-sweeper-interval - compression_sweeper → compression_active_sweeper (INFO) - compression_sweeper_interval → compression_active_sweeper_interval - compression_sweeper_state → compression_active_sweeper_state Updated across §2.1, §2.3, §2.5, §2.10, §2.12, §3, §4.4, §4.5, §7.1 — every config-name and INFO-field reference. The English word 'sweeper' is preserved in prose where it refers to the engine concept (not the config-name identifier). * docs: update other planning docs to align with declarative model Three companion docs updated alongside detailed-design.md (which landed earlier in this PR): idea-honing.md: - Added a top-level Status note (2026-06-14) pointing readers to detailed-design.md as the source of truth, with explicit callouts for Q5 (master switch model — superseded by the declarative 3-state model) and Q5's operator-surface corollary (COMPRESSION ENABLE/DISABLE and COMPRESSION SWEEP direction=... are NOT part of v1). - Q1–Q4, Q6–Q16 are preserved as the historical walkthrough audit trail. Their decisions still hold under the current design (the inline "superseded by review" annotations from the original walkthrough remain). - Q3's body uses old-vocabulary 'compression-enabled no' but the decision (transparently decompress on load when the feature is off) is still in force; preserved as historical Q&A. The header note covers the vocabulary shift. summary.md: - Activation row: "Opt-in via compression-enabled yes" replaced with the new 3-state declarative model description. - PR #15 read-hot-gap historical bullet: operator-recovery affordance updated from "COMPRESSION SWEEP direction=decompress" to the current model: "CONFIG SET compression-master-switch decompression + compression-active-sweeper enabled". - All other historical bullets preserved unchanged. implementation/plan.md: - S2.9 description rewritten as "Master switch + sweeper (declarative model)" — describes the full declarative-model implementation (3-state master-switch enum + active-sweeper enum + interval + auto-retire on transitions into decompression). - The §4.5-style command-tree mention rewritten: COMPRESSION SWEEP [ASYNC] → COMPRESSION SWEEP FORCE. - S3.3 mention updated: compression-enabled → compression-master- switch. - Phase 1 gate criterion updated: compression-enabled yes → compression-master-switch compression + compression-active- sweeper enabled. detailed-design.md is the source of truth; these companion docs now describe the same model consistently. DESIGN_TODO.md is not edited — its 31 walkthrough threads pre-date the master-switch redesign and their resolutions still hold (e.g., R2.6.8 full-sync RDB uncompressed, the eligibility-predicate revisions, the COW-invariant audit, the §7.5 benchmark suite).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Surfaced during the Valkey-committee design walkthrough: the v1 eligibility filter (R2.2) gates compression decisions at compression-attempt time but provides no inverse path. A key that was cold when compressed and later becomes read-hot pays sustained sync decompression CPU on every read.
This PR makes the gap explicit in the design doc rather than leaving it implicit. No code changes.
What's in the design
Adds R2.5.6 to §2.5 (Decompression path)
States plainly:
dbOverwriteinstalls a fresh uncompressed robj; partial-write commands (APPEND,SETRANGE, bit operations, module DMA-write) decompress in place before mutating per the R2.4.5 audit list.compression-max-value-size(R2.5.4: 128 KiB → ~128 µs at ~1 µs/KB).LATENCY HISTORY decompress-sync(R2.10.2).COMPRESSION SWEEP direction=decompress.Adds an Appendix D row
Read-hot compressed value auto-demotion— placed next to the other reactive-runtime-behaviour entries (kill-switch). Describes the v2 mechanism:compression-promote-min-freq(LFU mode) /compression-promote-max-idle-seconds(LRU/noeviction) configs.Why defer auto-demotion to v2 rather than ship in v1
Verified