[design] Rewrite §2.1/§2.5/§2.10/§2.12/§4.5 for declarative master-switch + sweeper model#28
Merged
Merged
Conversation
…model
Replace the bool master switch (compression-enabled yes|no) with the
3-state enum (compression-master-switch: compression / decompression
/ off, default off) plus a 2-state sweeper enum
(compression-sweeper: enabled / disabled, default disabled) plus a
sweeper-interval knob (compression-sweeper-interval, default 0) plus
a manual force command (COMPRESSION SWEEP FORCE).
The operator declares the desired DB state; the sweeper drives the
keyspace toward it. This matches Valkey's prevailing declarative
config style (maxmemory + maxmemory-policy, save ..., appendonly)
and makes the three mechanics orthogonal:
1. Master switch: governs future-write eligibility AND the
transient view's drain mode.
2. Sweeper: drives the keyspace toward the declared state.
3. Transient view (R2.5.7, updated separately): drain mode is
gated on master-switch state.
R2.1.1 \u2014 master-switch enum semantics. The 3 values + what each
means for writes / reads / dict lifecycle.
R2.1.2 \u2014 sweeper enum semantics. enabled drives toward target
state; disabled means no automatic work; with master=off the
sweeper has no direction and idles. Decompression sweeps run
on the main thread (no worker pool).
R2.1.3 \u2014 sweeper-interval semantics. 0 = single pass after
master-switch direction change, then idle. Positive N = sleep
N seconds between passes. Default 0 minimizes CPU on idle
keyspaces; operators with high churn opt in.
R2.1.4 \u2014 COMPRESSION SWEEP FORCE command. One-shot pass
regardless of sweeper config and interval. Uses master switch's
current direction; rejected if master=off. Allowed even with
sweeper=disabled (the operator's catch-up affordance).
R2.1.5 \u2014 master-switch transitions including auto-retire of
the active dict on transitions to decompression or off. The dict
cannot accept new frame references in the new state, so retirement
is safe (workers are protected by QSBR). With
master=decompression + sweeper=enabled running to completion,
the dict's memory is auto-reclaimed without operator
intervention.
R2.1.6 \u2014 configuration validation. All combinations allowed
(operator owns controls + Valkey reports). Two non-functional
combinations get one-time LL_WARNING logs:
- master=off + sweeper=enabled (sweeper idles, documented)
- master=compression + threads=0 (write enqueues + sweep
enqueues both drop; sweeper skips passes with rate-limited
warning)
R2.1.7 \u2014 third state preserved (master=compression but no
active dictionary). Same content as the prior R2.1.5; renumbered.
Replaces the 5 R2.1.x requirements that were on unstable. Commit
touches detailed-design.md only; code changes in PR-C.
Plan reference: PR-B in the orthogonal-mechanics design rewrite.
PR #26's R2.1.6 (master-switch precedent table) and §3.4 (sweep
state machine table) are not part of unstable today and are not
touched here.
R2.5.7 previously described a single-mode drain: beforeSleep always
restores the compressed form. With the new declarative model
(compression-master-switch: compression / decompression / off), the
drain mode must follow the master switch:
- master = compression or off: restore mode (current behavior;
compressed form preserved across the iteration boundary).
- master = decompression: permanent-decompress mode (compressed
buffer released; temp sds installed as the value's val_ptr;
encoding stays RAW).
The permanent-decompress branch exists because in decompression mode
the operator's intent is "drain the DB". A transient view that
restores the compressed form on every read-touched key would
oscillate the same key between compressed and RAW every iteration —
defeating the drain.
This is the ONLY cross-mechanism dependency between master switch
and transient view (per R2.1's "three orthogonal mechanics" framing
— mechanics 1 and 3 share this one coupling, by design). The
section calls this out explicitly so future readers know it's
deliberate, not accidental.
Both modes share the underlying release-and-decRef cleanup
(releaseCompressedBuffer for the compressed buffer; same
discardTransientEntry helper); only the restore-vs-leave-as-RAW
step differs.
Also updated R2.5.6 to reference the new master-switch + sweeper
combination instead of "COMPRESSION SWEEP direction=decompress" as
the operator's drain affordance.
§2.10.1 INFO fields:
- Renamed compression_enabled → compression_master_switch (enum,
reports compression / decompression / off per R2.1.1).
- Added compression_sweeper (enum, reports enabled / disabled per
R2.1.2).
- Added compression_sweeper_interval (seconds, reports the
R2.1.3 setting).
- Added compression_sweeper_state (enum, reports the runtime
sweeper engine state: idle / scanning / sleeping / disabled).
Distinct from compression_sweeper because the configured switch
and the runtime engine state aren't always the same — e.g. with
compression-sweeper=enabled but master=off, the engine is idle
(no direction); with sweeper=enabled + master=compression
actively running, the engine is scanning.
- Existing fields (compression_state, dict counters, sweep
back-pressure counters, ratio counters) all preserved unchanged.
§2.12 config table:
- Primary section grew from 5 → 6 knobs:
• compression-master-switch (enum, default off) — replaces the
former compression-enabled bool entry
• compression-sweeper (enum, default disabled) — new
• compression-threads kept; description annotated to flag
"only relevant in master=compression" per R2.1.6
• size bounds and dict-size unchanged
- Advanced section grew from 11 → 12:
• compression-sweeper-interval (seconds, default 0) — new
• all other advanced knobs unchanged
Together these surfaces give the operator the declarative-model
view: declare master state, optionally run the sweeper to converge,
observe progress via the runtime sweeper_state and the back-pressure
counters.
§4.5 COMPRESSION subcommand container:
- Removed COMPRESSION ENABLE / COMPRESSION DISABLE rows. The new
3-state master switch (compression / decompression / off) doesn't
map cleanly to enable/disable verbs; operators use CONFIG SET
directly. Documented this rationale in the section text plus a
reference to the precedent (maxmemory-policy, appendfsync are
other tri-state Valkey configs without verb aliases).
- Replaced "COMPRESSION SWEEP [direction=compress|decompress]" with
"COMPRESSION SWEEP FORCE" (R2.1.4): direction is implicit from
the master switch's current state; the FORCE keyword indicates
it bypasses both the sweeper-enabled config and the periodic
timer to do an immediate one-shot pass.
- Kept all other rows unchanged (TRAIN, STATUS, DICT
LIST/EXPORT/IMPORT/DROP, HELP).
§1.3, §2.6, §7 cleanup of stale compression-enabled / SWEEP refs:
- §1.3 Goals: "single master switch (compression-master-switch)
and zero fixed cost when set to off".
- R2.6.2 / R2.6.3 / R2.6.4 / R2.6.8: master-switch enum vocabulary.
R2.6.2 covers "master ∈ {compression, off}" (load-keep-compressed
path). R2.6.3 covers "master == decompression" (load-and-drain
path). R2.6.4 references the master-switch config as the
reject-on-corruption gate. R2.6.8 (full-sync RDB always
uncompressed) phrased against the new enum.
- R2.3.3 (dict-cap unblock paths): now references master-switch
+ sweeper for drain, and DICT DROP for explicit retirement.
- §4.4 step 7 (cap interaction): same updates.
- §2.10 R2.10.4 back-pressure remediation table: "Manual
COMPRESSION SWEEP" → "Manual COMPRESSION SWEEP FORCE".
- §7.1 transparency-mode test config: --compression-master-switch
compression + --compression-sweeper enabled (replaces the
--compression-enabled yes line).
- §7.2 persistence test description: "load with
compression-master-switch decompression".
- §7.3 perf scenario: compares "off" vs "compression".
- §7.5 benchmark driver: same.
After these changes the doc has zero remaining references to
compression-enabled or to bare COMPRESSION SWEEP (without FORCE).
The legacy command names appear only in the §4.5 sentence that
explicitly documents their absence in v1.
…pression-active-sweeper
Two corrections from PR-B review:
1. R2.1.5 retire policy. Original PR-B text auto-retired the active
dict on both 'compression → decompression' AND 'compression → off'.
The latter contradicts the 'off freezes the DB compression state'
semantic — retiring the dict starts it draining toward eventual
free, which is not 'frozen'. Corrected to retire ONLY on
transitions INTO 'decompression' (from 'compression' or from
'off').
New rule: retire the active dict on any transition INTO
'decompression'. All other transitions leave the registry
untouched. Specifically:
- compression → decompression: retire active dict (drain mode)
- off → decompression: retire active dict if there
happens to be one (drain mode)
- compression → off: NO retire. State frozen. Operator can
manually reclaim via COMPRESSION DICT DROP <id> if desired.
- decompression → off: no action; active was already retired
on the prior transition.
- off → compression: no retire; reuse existing active dict if
any (no retraining).
- decompression → compression: no retire; system enters
R2.1.7 third state until a new dict is trained.
Effect: master=off truly freezes the registry; the dict stays
ACTIVE and ready to resume compression if the operator flips
back. Memory cost is bounded (~200 KB raw + digested) and
operator-recoverable.
2. Rename compression-sweeper → compression-active-sweeper.
Picks up Valkey's existing 'active-X' prefix convention for
background-engine configs (matches `activedefrag`,
`active-expire-effort`). Operator reading
`compression-active-sweeper enabled` immediately recognizes
this as the same family of automated background workers.
Affected names:
- compression-sweeper → compression-active-sweeper
- compression-sweeper-interval → compression-active-sweeper-interval
- compression_sweeper → compression_active_sweeper (INFO)
- compression_sweeper_interval → compression_active_sweeper_interval
- compression_sweeper_state → compression_active_sweeper_state
Updated across §2.1, §2.3, §2.5, §2.10, §2.12, §3, §4.4, §4.5,
§7.1 — every config-name and INFO-field reference. The English
word 'sweeper' is preserved in prose where it refers to the
engine concept (not the config-name identifier).
Three companion docs updated alongside detailed-design.md (which
landed earlier in this PR):
idea-honing.md:
- Added a top-level Status note (2026-06-14) pointing readers to
detailed-design.md as the source of truth, with explicit
callouts for Q5 (master switch model — superseded by the
declarative 3-state model) and Q5's operator-surface corollary
(COMPRESSION ENABLE/DISABLE and COMPRESSION SWEEP direction=...
are NOT part of v1).
- Q1–Q4, Q6–Q16 are preserved as the historical walkthrough
audit trail. Their decisions still hold under the current
design (the inline "superseded by review" annotations from the
original walkthrough remain).
- Q3's body uses old-vocabulary 'compression-enabled no' but the
decision (transparently decompress on load when the feature is
off) is still in force; preserved as historical Q&A. The header
note covers the vocabulary shift.
summary.md:
- Activation row: "Opt-in via compression-enabled yes" replaced
with the new 3-state declarative model description.
- PR #15 read-hot-gap historical bullet: operator-recovery
affordance updated from "COMPRESSION SWEEP direction=decompress"
to the current model: "CONFIG SET compression-master-switch
decompression + compression-active-sweeper enabled".
- All other historical bullets preserved unchanged.
implementation/plan.md:
- S2.9 description rewritten as "Master switch + sweeper
(declarative model)" — describes the full declarative-model
implementation (3-state master-switch enum + active-sweeper
enum + interval + auto-retire on transitions into
decompression).
- The §4.5-style command-tree mention rewritten:
COMPRESSION SWEEP [ASYNC] → COMPRESSION SWEEP FORCE.
- S3.3 mention updated: compression-enabled → compression-master-
switch.
- Phase 1 gate criterion updated: compression-enabled yes →
compression-master-switch compression + compression-active-
sweeper enabled.
detailed-design.md is the source of truth; these companion docs
now describe the same model consistently. DESIGN_TODO.md is not
edited — its 31 walkthrough threads pre-date the master-switch
redesign and their resolutions still hold (e.g., R2.6.8 full-sync
RDB uncompressed, the eligibility-predicate revisions, the
COW-invariant audit, the §7.5 benchmark suite).
e6b0892 to
2df82ea
Compare
This was referenced Jun 14, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Design-doc rewrite for the declarative master-switch model (replaces the bool
compression-enabledmodel and the in-flights2-sweepdesign that introduced master-switch coupling in the sweep state machine).This PR contains changes to four planning docs:
design/detailed-design.md(the source of truth — all design changes here)idea-honing.md(header note pointing todetailed-design.md; Q5 historical Q&A flagged as superseded)summary.md(activation row + one historical bullet updated)implementation/plan.md(S2.9 description rewritten + S3.3 / Phase-1 gate refs updated)No code changes. Implementation lands in PR-C (separate, depends on PR-B merging).
PR #26 will be closed as superseded once PR-C lands.
The new model — three orthogonal mechanics
compression-master-switch(enum:compression/decompression/off, defaultoff)compression-active-sweeper(enum:enabled/disabled, defaultdisabled) +compression-active-sweeper-interval(time, default0)Plus one operator command:
COMPRESSION SWEEP FORCE— one-shot pass; uses master switch's current direction; rejected ifmaster=off; allowed even withcompression-active-sweeper=disabled(the operator's catch-up affordance).Why declarative
This pattern matches Valkey's prevailing config style (
maxmemory+maxmemory-policy,save ...,appendonly). Operator declares a goal; the engine maintains it. Compared to the prior imperative model (compression-enabled yes/no+COMPRESSION SWEEP direction=compress|decompress), the declarative model gives the operator a smaller surface to learn (2 enum + 1 time config + 1 imperative escape hatch, vs 1 bool + 4 commands + cross-mechanism coupling).Why
compression-active-sweeperPicks up Valkey's existing
active-Xprefix convention for background-engine configs (activedefrag,active-expire-effort). Operator readingcompression-active-sweeper enabledimmediately recognizes this as the same family of automated background workers.Commits in this PR
bb524b296e00db2756master ∈ {compression, off}; permanent-decompress mode formaster == decompression. The only intentional cross-mechanism coupling in the design.bccd0367928e086dfaCOMPRESSION SWEEP [direction=…]removed; replaced byCOMPRESSION SWEEP FORCE. LegacyCOMPRESSION ENABLE/DISABLEdocumented as intentionally absent. Straycompression-enabledreferences swept.d6989e900compression-sweeper→compression-active-sweepercompression → offdoes NOT auto-retire (state frozen); only transitions INTOdecompressionretire the active dict. Rename matchesactivedefragprecedent.2df82eabcidea-honing.md+summary.md+plan.mdTotal: 4 files, +102 / −34.
Significant design choices documented
These were settled by discussion in the prior session and are now captured in the docs:
Master switch state semantics (R2.1.1):
compression: write-path eligible; transient view restores.decompression: write-path inert; transient view permanent-decompresses; active dict auto-retired.off: feature paused; DB state frozen; transient view restores.Auto-retire active dict only on transitions INTO
decompression(R2.1.5). The earlier formulation (auto-retire on bothcompression → decompressionANDcompression → off) contradicted the "off freezes the DB compression state" semantic — retiring starts the dict draining, which isn't frozen. Fixed: retire only when enteringdecompression. Theoffstate preserves the registry as-is; operator can manually reclaim viaCOMPRESSION DICT DROP <id>if desired.All six transitions documented explicitly:
compression → decompressioncompression → offoff → decompressiondecompression → offoff → compressiondecompression → compressionSweeper-interval default
0= no periodic re-runs (R2.1.3). Initial pass on direction change runs unconditionally; re-runs are operator opt-in. Justified by the cost analysis: scanning the full keyspace once is O(keyspace). For typical workloads (1M–10M keys) negligible; for very large workloads (100M+ keys) non-trivial. Default0ensures the feature has no surprising idle CPU baseline.COMPRESSION SWEEP FORCEallowed regardless ofcompression-active-sweeper(R2.1.4). This is the operator's catch-up affordance — they can leave the active sweeper disabled (no idle iteration) and runFORCEfrom a runbook or after observingcompression_candidates_dropped_totalclimb.Configuration validation: all combinations allowed; non-functional combos warn but don't reject (R2.1.6). Specifically:
master=off + compression-active-sweeper=enabled: sweeper has no direction; idles. Documented as no-op state.master=compression + compression-threads=0: write enqueues drop, sweeper passes drop. Logged atLL_WARNINGonce on transition into this state. The sweeper, when scheduled in this state, skips the pass with a rate-limited warning.Transient view's drain mode is gated on master switch — explicit single cross-mechanism dependency (R2.5.7). The doc calls this out as the only intentional coupling between the three orthogonal mechanics, with rationale (decompression drain would oscillate indefinitely if the transient view kept restoring).
What's NOT in this PR
compression.c,compression_workers.c,compression_sweep.c,config.c,db.c, plus tests. Significantly larger; will likely split into C1/C2/C3.§3.4sweeper architecture section. The R2.1 requirements describe the engine sufficiently for v1 review; can add a§3.4later if reviewers want a separate architecture-level treatment.DESIGN_TODO.md. The 31 walkthrough threads pre-date the master-switch redesign; their resolutions remain correct under the new model. No edit needed unless reviewers find a specific thread that conflicts.Verified
grep -an "compression-enabled" detailed-design.md→ 0 matches (clean).grep -an "COMPRESSION SWEEP"(excluding FORCE) → 0 matches in detailed-design.md (clean). One mention inidea-honing.mdis the legacy syntax preserved as historical Q&A; the header note flags it as superseded.grep -an "compression-sweeper"(without-active-) → 0 matches in detailed-design.md / summary.md / plan.md (clean).grep -an "COMPRESSION ENABLE\|COMPRESSION DISABLE"→ 1 match in §4.5 of detailed-design.md (the explanatory text documenting their intentional absence in v1).unstable(post-PR-A) → 1281 lines after PR-B (+59 net).PR #26 stays open during PR-A and PR-B review; closes when PR-C merges.