Inline compression: Phase 0 skeleton #3

Merged: ikolomi merged 5 commits into unstable from phase-0-skeleton on May 17, 2026

@ikolomi ikolomi commented May 12, 2026

Compilable skeleton that seals the interface contracts between the
subsystems of the real-time data compression feature so the remaining
implementation work can proceed in parallel.

Feature defaults off; zero behavior change versus pre-commit.

Contents
--------

Interface contract headers (new, under src/):
  - compression.h          public API surface: lifecycle, toggle,
                           read/write hot-path hooks, INFO, COMPRESSION
                           subcommand entry points
  - compression_registry.h dictionary registry + compressionDictPair
                           (ZSTD_CDict/DDict forward-declared as opaque)
  - compression_header.h   per-value 16-byte header + create/free helpers
  - compression_workers.h  worker pool + SPMC inbox / MPSC outbox contract
  - compression_train.h    bio training job submission + main-thread
                           completion callback
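
A minimal sketch of the opaque forward-declaration approach that compression_registry.h is described as using. The two typedefs match how zstd names its dictionary types; the field layout of compressionDictPair and the helpers dictPairReset, dictPairIsUsable and dictPairLookupStub are assumptions made for illustration, not the skeleton's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Forward declarations only: the struct bodies live inside libzstd, so this
 * compiles without <zstd.h> and without referencing any zstd symbol. */
typedef struct ZSTD_CDict_s ZSTD_CDict;
typedef struct ZSTD_DDict_s ZSTD_DDict;

/* Hypothetical field layout; the real compressionDictPair may differ. */
typedef struct compressionDictPair {
    uint32_t dict_id;  /* assumed identifier for this dictionary version */
    ZSTD_CDict *cdict; /* compression side, opaque to callers */
    ZSTD_DDict *ddict; /* decompression side, opaque to callers */
} compressionDictPair;

/* Put a pair into the "nothing built yet" state. */
static inline void dictPairReset(compressionDictPair *p) {
    assert(p != NULL);
    p->dict_id = 0;
    p->cdict = NULL;
    p->ddict = NULL;
}

/* A pair is usable only when both halves have been built. */
static inline int dictPairIsUsable(const compressionDictPair *p) {
    return p != NULL && p->cdict != NULL && p->ddict != NULL;
}

/* Phase 0 style stub: nothing is ever registered, so every lookup misses. */
static inline compressionDictPair *dictPairLookupStub(uint32_t dict_id) {
    (void)dict_id;
    return NULL;
}
```

Because callers only ever hold pointers to the opaque types, the header stays valid whether or not the library is linked.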

Feature-disabled stub implementations (new, under src/):
  - compression.c          COMPRESSION STATUS returns a static disabled
                           text block; all hot-path entry points pass
                           robjs through unchanged; enqueue is a no-op
                           sink; INFO section renders disabled fields
  - compression_header.c   working encode/decode of the 16-byte header
                           (four uint32 fields, the magic among them);
                           allocator helpers are stubs
  - compression_registry.c registry ops all return NULL / 0 / no-op
  - compression_workers.c  pool lifecycle + enqueue/drain are no-ops
  - compression_train.c    train triggers / sampling / completion are
                           no-ops with defensive ownership handling

Persistence reservation:
  - src/rdb.h              RDB_ENC_ZSTDDICT = 4 reserved with a comment
                           describing the planned on-disk layout.
                           RDB_VERSION is not bumped (no writer yet).

Configuration:
  - src/server.h           16 new compression_* fields on struct
                           valkeyServer matching the design's §2.12
                           primary+advanced tiers
  - src/config.c           16 new compression-* config knobs registered
                           with the defaults documented in the design.
                           All are MODIFIABLE_CONFIG except the
                           compression-cpulist affinity string, which
                           mirrors the existing bio-cpulist precedent.

Command surface:
  - src/commands/compression.json, compression-status.json,
    compression-help.json
    plus regenerated src/commands.def. STATUS returns a disabled-state
    INFO-formatted text block; HELP lists the Phase 0 surface; other
    COMPRESSION subcommands are intentionally absent in this milestone.

Event-loop integration (src/server.c):
  - compressionInit()       in InitServerLast after initIOThreads
  - compressionCron()       in serverCron right after databasesCron
  - compressionAfterSleep() at the end of afterSleep
  - INFO compression section in genValkeyInfoString

Build:
  - src/Makefile                    new BUILD_ZSTD flag (default yes).
                                    Compression *.o files added to
                                    ENGINE_SERVER_OBJ
  - cmake/Modules/SourceFiles.cmake same five sources added to
                                    VALKEY_SERVER_SRCS

CI + tests:
  - .github/workflows/ci.yml      new job test-ubuntu-latest-compression-off
                                  that builds with BUILD_ZSTD=no and runs
                                  the compression fixture, guarding the
                                  feature-off path against regressions
  - tests/unit/type/compression.tcl  8 assertions covering: STATUS returns
                                     disabled, HELP renders, INFO section
                                     present, all 16 config defaults as
                                     documented, basic and large STRING
                                     round-trip unchanged, encoding is not
                                     compressed, runtime toggle has no
                                     observable effect in Phase 0

Not in this commit
------------------
  - ZSTD library vendoring under deps/zstd/ (tracked as a follow-up; no
    zstd symbol is referenced from any file in this commit, so the build
    succeeds without the library)
  - Any hot-path code (eligibility predicate, encode, decode, sweep,
    drift retrain). These are the next-milestone deliverables
  - RDB read/write code for the reserved RDB_ENC_ZSTDDICT byte
  - Full `COMPRESSION DICT *` / SWEEP / TRAIN subcommand tree
  - `--compression` transparency test harness

Verification
------------
  make BUILD_ZSTD=yes     OK (default)
  make BUILD_ZSTD=no      OK
  ./runtest --single unit/type/compression   8/8 passed

Design-of-record and per-decision audit trail:
  .agents/planning/realtime-data-compression/design/detailed-design.md
  .agents/planning/realtime-data-compression/implementation/plan.md
  .agents/planning/realtime-data-compression/DESIGN_TODO.md
ikolomi added 4 commits May 13, 2026 10:50
Four design refinements from @ikolomi review of the Phase 0 skeleton.
Purely a contract / documentation change — no behavior change, and
the skeleton still compiles with BUILD_ZSTD=yes and BUILD_ZSTD=no
and still passes the 8/8 Tcl fixture.

1. compressionJob.src becomes typed `sds` (was `unsigned char *`);
   src_len dropped. Workers call sdslen(src) now. The length lives in
   the sds metadata and is stable across threads because the
   immutable-snapshot invariant (R2.4.4) guarantees the metadata bytes
   are not mutated while the worker holds the reference.

2. §4.6 picks up a rationale paragraph explaining why the main thread
   snapshots the active dict_id at enqueue (registry stays
   single-writer with no worker readers, refcount bookkeeping is
   paired cleanly, staleness is bounded and harmless) rather than
   letting the worker read the registry.

3. §4.6 picks up a rationale paragraph explaining why the worker
   produces a flat `dst` buffer rather than a compressed robj. This
   is §2.11 R2.11.4: robj manipulation in Valkey assumes
   single-threaded access (no atomics on refcount, shared-object
   singletons, LRU/LFU bit updates); moving robj work to workers
   would silently break those assumptions. Failed compressions cost
   less too — discard is zfree(dst) instead of tearing down a full
   robj.

4. Per-value header field rename:
       magic   → alg_magic    (algorithm tag doubling as magic)
       dict_id → alg_meta     (per-algorithm metadata)
   plus constant COMPRESSION_HEADER_MAGIC → COMPRESSION_ALG_ZSTD_MAGIC
   (ASCII "ZSTD" as a four-byte tag). Same 16-byte footprint. RDB
   encoding byte renamed RDB_ENC_ZSTDDICT → RDB_ENC_COMPRESSED;
   on-disk layout now [RDB_ENCVAL | alg_magic | alg_meta |
   uncompressed_len | compressed_len | payload]. This makes the
   in-memory + on-disk formats structurally extensible to LZ4 /
   snappy / hardware backends without another encoding-byte
   migration. v1 still only emits and accepts the ZSTD magic.

   createCompressedObject doc comment tightened to make the
   ownership-transfer / zero-copy contract explicit: the caller's
   zmalloc'd buffer becomes the robj's storage, no memcpy, and on
   validation failure the caller retains ownership. This prevents
   a Phase 1 implementer from quietly adding a memcpy on the hot
   install path.

Design-doc updates tracked: §2.6 R2.6.1, §4.2 (file table), §4.6
(compressionJob struct + two rationale paragraphs), §5.2 (header
layout + ownership contract + worker shrink decision), §5.3 (RDB
on-disk layout), Appendix B. Living companion documents
(idea-honing.md, research/persistence-and-replication.md,
implementation/plan.md §4.3) updated to match. The frozen PR-review
audit trail (DESIGN_TODO.md, pr-feedback.json) keeps the original
names since it captures text as reviewers wrote it at walkthrough
time.

Verification
------------
  make BUILD_ZSTD=yes                          OK
  make BUILD_ZSTD=no                           OK (after clean)
  ./runtest --single unit/type/compression     8/8 passed
The detailed design documents an SPMC inbox / MPSC outbox feeding the
compression worker pool but leaves three things unstated:

  - whether the queues are shared across workers or per-worker
  - how the queues are bounded
  - what happens when the enqueue rate exceeds worker throughput

This commit pins all three down and wires the operator-visible
observability so "compression isn't keeping up" can be diagnosed to
one of four distinct root causes without reading logs.

Design updates (detailed-design.md)
-----------------------------------

§4.6 gains three new paragraphs:

  - Shared queues, not per-worker — work-stealing falls out of a
    shared SPMC inbox; per-worker queues would need "pick a worker"
    logic that costs more on the main thread and balances worse when
    job costs vary. Same shape as src/io_threads.c.

  - Sizing — inbox capacity = max(256, 128 * compression-threads);
    outbox capacity equals inbox capacity. No new config knob in v1;
    the formula is computed at pool start. Exposing it as a knob later
    is a non-breaking addition.

  - Back-pressure policy per caller — write-path drops with a counter
    bump; sweeper pauses its iteration cursor (distinct from
    CPU-pacing sleeps); future multi-key fan-out drops; worker retries
    on outbox full (never discards completed work).
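
The sizing formula and the per-caller policy above can be sketched as follows. This is an illustration only: the enum names and both function names are hypothetical, and the real pool may structure the decision differently.

```c
#include <assert.h>
#include <stddef.h>

/* inbox capacity = max(256, 128 * compression-threads); the outbox uses the
 * same value. Computed once at pool start. */
static size_t compressionInboxCapacity(int threads) {
    assert(threads >= 0);
    size_t scaled = (size_t)128 * (size_t)threads;
    return scaled > 256 ? scaled : 256;
}

/* Hypothetical names for the four producers described in the design text. */
typedef enum {
    CALLER_WRITE_PATH,
    CALLER_SWEEPER,
    CALLER_MULTIKEY_FANOUT,
    CALLER_WORKER_OUTBOX
} compressionCaller;

typedef enum {
    BP_DROP,         /* discard the candidate and bump a counter */
    BP_PAUSE_CURSOR, /* sweeper keeps its cursor and resumes later */
    BP_RETRY         /* worker never discards completed work */
} backpressureAction;

/* What each caller does when its target queue is full. */
static backpressureAction compressionOnQueueFull(compressionCaller c) {
    switch (c) {
    case CALLER_WRITE_PATH: return BP_DROP;
    case CALLER_SWEEPER: return BP_PAUSE_CURSOR;
    case CALLER_MULTIKEY_FANOUT: return BP_DROP;
    case CALLER_WORKER_OUTBOX: return BP_RETRY;
    }
    return BP_DROP; /* unreachable for valid enum values */
}
```

With this formula the 256 floor applies for one or two threads; from three threads upward the capacity scales linearly.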

§2.10 gains R2.10.4 with a remediation table: one metric per distinct
root cause, each paired with the knob the operator should turn if the
metric climbs.

§5.6 gains four new INFO fields.

Observability fields
--------------------

Four new counters + one renamed/clarified gauge:

  - compression_candidates_pending           (gauge — inbox depth)
  - compression_candidates_dropped_total     (write-path drops)
  - compression_sweep_backpressure_total     (sweep paused: queue full)
  - compression_sweep_pacing_sleeps_total    (sweep paused: CPU pacing)
  - compression_outbox_backpressure_total    (worker retry on outbox)

Why sweep_backpressure_total and sweep_pacing_sleeps_total are split:
they have different remedies (more workers vs. looser pacing). A
single counter would hide which knob to turn.

Why write-path drops and future multi-key drops share one counter:
same remediation (more workers). Distinguishing them wouldn't drive a
different action. Splittable later, non-breaking.

Code changes
------------

src/compression.c:
  - Extracted compressionRenderFields() so COMPRESSION STATUS and
    the INFO # Compression section render from one source of truth.
    Previously infoCompression() emitted a 6-field subset and
    compressionStatus() emitted 18; the design (§4.5) requires them
    to be identical.
  - Full 22-field set now includes the four new back-pressure counters.
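
A small sketch of the single-source-of-truth idea, assuming a shared field renderer that both outputs call. The three-field table, the `name:value` line format, and the function names renderFields / renderStatus / renderInfoSection are stand-ins for illustration; the commit describes 22 fields and its own compressionRenderFields().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct compressionField {
    const char *name;
    long long value;
} compressionField;

/* Hypothetical subset; compression_enabled is an assumed field name. */
static const compressionField disabled_fields[] = {
    {"compression_enabled", 0},
    {"compression_candidates_pending", 0},
    {"compression_candidates_dropped_total", 0},
};

/* The one place that knows the field list. Returns bytes written. */
static size_t renderFields(char *buf, size_t cap) {
    assert(buf != NULL && cap > 0);
    size_t off = 0;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(disabled_fields) / sizeof(disabled_fields[0]); i++) {
        int n = snprintf(buf + off, cap - off, "%s:%lld\r\n",
                         disabled_fields[i].name, disabled_fields[i].value);
        if (n < 0 || (size_t)n >= cap - off) {
            buf[off] = '\0'; /* cut cleanly at the last complete line */
            break;
        }
        off += (size_t)n;
    }
    return off;
}

/* COMPRESSION STATUS body: the fields and nothing else. */
static size_t renderStatus(char *buf, size_t cap) {
    return renderFields(buf, cap);
}

/* INFO section: a section header followed by the same fields. */
static size_t renderInfoSection(char *buf, size_t cap) {
    static const char header[] = "# Compression\r\n";
    size_t hlen = sizeof(header) - 1;
    assert(buf != NULL && cap > hlen);
    memcpy(buf, header, hlen);
    return hlen + renderFields(buf + hlen, cap - hlen);
}
```

Since neither caller owns a field list, adding a field to the table changes both outputs at once, which is the property the new Tcl test guards.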

tests/unit/type/compression.tcl:
  - Added test: STATUS and INFO compression field sets are identical
    (prevents future drift between the two renderers).
  - Added test: the five back-pressure observability fields are
    present and emit 0 in the disabled state.

Verification
------------
  make BUILD_ZSTD=yes                          OK
  make BUILD_ZSTD=no                           OK (after clean)
  ./runtest --single unit/type/compression     10/10 passed (was 8/8)
Nine failing check runs on the Phase 0 PR all traced to three root
causes, addressed here:

(1) introspection.tcl CONFIG sanity (7 jobs)
----------------------------------------------
test-ubuntu-latest, test-ubuntu-latest-cmake-tls,
test-sanitizer-address, test-external-standalone,
test-external-cluster, test-external-nodebug, and code-coverage
all died at:

    [exception]: Executing test client: ERR CONFIG SET failed
      (possibly related to argument 'compression-cpulist')
      - can't set immutable config.

tests/unit/introspection.tcl has a long-standing "CONFIG sanity"
test that iterates every config from CONFIG GET * and tries CONFIG
SET on each, with a hand-maintained skip list for immutable configs.
The new compression-cpulist / compression_cpulist pair (IMMUTABLE,
following the bio-cpulist precedent in config.c) wasn't on the skip
list, so the test aborted the whole suite as soon as it reached the
`c` keyspace. Added both name forms to skip_configs, matching the
pattern used by server-cpulist / bio-cpulist / aof-rewrite-cpulist /
bgsave-cpulist.

Verified locally: ./runtest --single unit/introspection now
reports 117/117 passed (previously aborted mid-file).

(2) clang-format-check (1 job)
------------------------------
Seven files had formatting that differed from clang-format-18
output: src/compression.c, compression.h, compression_header.h,
compression_registry.h, compression_workers.h, rdb.h, server.h. I
don't have clang-format-18 installed locally, so I pulled the
base64-encoded diff from the CI log and applied it verbatim. Net
effect: multi-line string-literal continuations re-indented,
hand-aligned type columns collapsed to single space, declaration
alignments removed. No semantic change.

(3) Spellcheck (1 job)
----------------------
Eight typo hits across two sources:

  - .agents/planning/realtime-data-compression/DESIGN_TODO.md
    (6 hits: spliting, exicting, accomodate, nessesarily,
    backgorund, immidiately) — these are inside verbatim quoted
    PR-review comments from the design walkthrough. Editing them
    would rewrite the historical audit trail the previous commit
    (7c00e9f) explicitly preserved. Instead, added DESIGN_TODO.md
    and pr-feedback.json to .config/typos.toml extend-exclude —
    both are planning artifacts not shipped with the server, both
    contain quoted reviewer text, and future runs of the
    normalizer tool (normalize-pr-comments.py) will regenerate
    them verbatim from GitHub anyway.

  - summary.md and implementation/plan.md (2 hits of 'compileable'
    → 'compilable'): my own typos; fixed in place.

Verification
------------
  make BUILD_ZSTD=yes                                 OK
  ./runtest --single unit/type/compression            10/10 passed
  ./runtest --single unit/introspection               117/117 passed

The two remaining CI scopes (clang-format, spellcheck) cannot be
verified locally without installing clang-format-18 and typos, but
both fixes are exact-diff applications of the CI tools' own output.
createStringConfig("compression-cpulist", "compression_cpulist", ...)
registered the config under two names. Both names are literal dict
entries (see registerConfigValue in config.c); no underscore-to-hyphen
auto-normalization happens. CONFIG GET * returned both forms, so the
introspection.tcl CONFIG-sanity skip list had to list both.

The underscore alias on the existing *-cpulist family (server-cpulist,
bio-cpulist, aof-rewrite-cpulist, bgsave-cpulist) is purely backward
compatibility for pre-Valkey Redis config files that used the
underscore naming convention. That justification does not carry over
to a brand-new config that has never existed before.

Evidence for "modern Valkey convention is NULL alias":
  183 configs in config.c pass NULL as alias
    6 configs pass an explicit underscore alias
  The 6 are all Redis-era configs.

Fix:
  - config.c: alias arg for compression-cpulist becomes NULL
  - introspection.tcl: drop the now-nonexistent compression_cpulist
    from skip_configs (compression-cpulist remains)

Verified: CONFIG GET 'compression*cpulist' now returns one entry, not
two. Build clean; unit/type/compression and unit/introspection both
pass.
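
To illustrate why an explicit alias doubled the entries, here is a toy model of a config table in which the name and the alias are both stored as literal keys. It is not config.c's registerConfigValue; the configTable type and both helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 16

typedef struct configTable {
    const char *names[MAX_NAMES];
    size_t count;
} configTable;

/* Name and alias each become a separate literal entry; nothing rewrites
 * underscores to hyphens. A NULL alias adds only the primary name. */
static void registerConfig(configTable *t, const char *name, const char *alias) {
    assert(t != NULL && name != NULL);
    assert(t->count + (alias ? 2u : 1u) <= MAX_NAMES);
    t->names[t->count++] = name;
    if (alias) t->names[t->count++] = alias;
}

/* Counts entries shaped like the glob compression*cpulist. */
static size_t countCompressionCpulist(const configTable *t) {
    assert(t != NULL);
    size_t hits = 0;
    for (size_t i = 0; i < t->count; i++) {
        const char *n = t->names[i];
        size_t len = strlen(n);
        if (len >= 18 && strncmp(n, "compression", 11) == 0 &&
            strcmp(n + len - 7, "cpulist") == 0) {
            hits++;
        }
    }
    return hits;
}
```

Registering with the underscore alias yields two matches in this model and a NULL alias yields one, mirroring the before and after of the CONFIG GET check.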
Comment thread src/compression.c

@GilboaAWS GilboaAWS May 13, 2026

Maybe we should rename this file to compression_manager.c instead? It handles the init and offers the API that both sides use to do their work.

Comment thread src/compression_header.h Outdated
#include <stddef.h>
#include <stdint.h>

#define COMPRESSION_HEADER_MAGIC 0x5A444943u /* "ZDIC" */

Why is the magic needed here? To determine if it's a compressed value?

@ikolomi ikolomi merged commit 8d3859b into unstable May 17, 2026
80 checks passed
ikolomi added a commit that referenced this pull request May 24, 2026
Addresses 5 of 6 review comments on the QSBR design. Comment #6
(`compressionJob.key` extra-lookup concern) is explicitly deferred to
a follow-up PR per reviewer guidance.

Comment #1 (line 428) and #5 (line 544) — drop language-comparison
framing:
  Removed all references to Rust / `Arc<T>` / "memory-safe languages"
  / `shared_ptr` from §4.4 intro, the "Why QSBR" bullet list, and the
  §4.6 "Why the worker loads the active dict itself" paragraph. The
  rationale now stands on its own technical merit (decoupling the
  registry from worker hot paths; minimal worker contract;
  safe-directional failure modes) rather than via comparison to
  another language's type system. C with explicit protocols is the
  right tool for this problem; the comparison added rhetorical
  weight without adding signal.

Comment #2 (line 326) — duplication with R2.11.4:
  §3.3 Separation invariants restated the worker contract that
  R2.11.4 already specifies authoritatively. Slimmed the §3.3 bullet
  to a one-liner that points at §2.11 R2.11.4 and §4.4. Eliminates
  drift risk between the two places.

Comment #3 (line 439) — bound the retiring list, block on cap:
  Added new step 7 to the QSBR section explaining the cap interaction
  with R2.3.3. The retiring list is a subset of `dicts[]`, capped at
  `compression-dict-max-versions`. When grace-barrier draining cannot
  keep up (worker starvation, persistent `frame_refs > 0`), the cap
  is reached and BOTH training AND promotion are refused per R2.3.3:
  `LL_WARNING` log entry, `compression_dict_cap_reached` set in INFO,
  operator intervention required (raise cap or run COMPRESSION SWEEP).

Comment #4 (line 449) — grace-barrier wake-up via cond_broadcast:
  The original step 6 proposed enqueueing barrier jobs into the SPMC
  inbox to force idle workers to advance generations. This doesn't
  actually work: under work-stealing semantics a single fast worker
  can drain all barrier jobs while siblings stay asleep on the cond
  var. Rewrote step 6 to use a wake-all primitive built on
  `pthread_cond_broadcast`, and added a "Wake-all primitive"
  paragraph to §4.6 that describes extending `mutexqueue.h` with two
  new APIs: a broadcast wake-all (for QSBR grace barriers, config
  changes, etc.) and a shutdown-signal variant (for pool teardown).
  Step 6 cross-references §4.6 for the mechanism.

Comment #6 (line 513) — DEFERRED:
  Reviewer flagged that `compressionJob.key` (a `robj *` carried in
  the job) implies the main thread does an additional lookup at
  install time, doubling the per-write lookup cost. The reviewer
  explicitly tagged this as "follow up PR" — addressing it would
  require a redesign of the install-side data flow and is out of
  scope for the QSBR design change. Tracked as an open item; will
  be addressed before code lands for the install path (S2.7 in
  the implementation plan).
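
The wake-all idea from Comment #4 can be sketched with an epoch counter guarded by a mutex and a condition variable. This is a generic pthread pattern under assumed names (wakeAllGate and its functions are hypothetical); it is not the `mutexqueue.h` extension itself, whose API is not shown in this commit message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

typedef struct wakeAllGate {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    uint64_t epoch; /* bumped once per wake-all request */
} wakeAllGate;

static void wakeAllGateInit(wakeAllGate *g) {
    assert(g != NULL);
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->epoch = 0;
}

/* Main thread: advance the epoch and wake every sleeper. A queued barrier
 * job could be consumed by one fast worker; a broadcast reaches all of them. */
static uint64_t wakeAllGateBroadcast(wakeAllGate *g) {
    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    uint64_t e = ++g->epoch;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
    return e;
}

/* Worker: sleep until the epoch differs from the one last observed, then
 * return the new epoch. The loop absorbs spurious wakeups. */
static uint64_t wakeAllGateWait(wakeAllGate *g, uint64_t seen) {
    assert(g != NULL);
    pthread_mutex_lock(&g->lock);
    while (g->epoch == seen) pthread_cond_wait(&g->cond, &g->lock);
    uint64_t e = g->epoch;
    pthread_mutex_unlock(&g->lock);
    return e;
}
```

Each worker remembers the last epoch it saw, so a broadcast that arrives while a worker is busy is not lost: the next wait returns immediately.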
ikolomi added a commit that referenced this pull request May 27, 2026
Removes one of the two time-based hot-key skip knobs. The eligibility
predicate's LRU/noeviction branch now compares against a single
threshold (`compression-min-idle-seconds`) instead of two.

Why this matters
----------------

The dual-knob surface — `compression-settle-seconds` ("recent-write
protection") and `compression-min-idle-seconds` ("recent-access
protection") — was introduced via the Thread #18 walkthrough resolution
on the theory that operators benefit from being able to express two
different intents.

In v1 reality this is documentation theater. Both knobs compare against
the same metric (`lru_idle_secs(o)`) because Valkey's `robj->lru` field
is touched on every read AND every write — gated only by
`LOOKUP_NOTOUCH` and fork. v1 cannot distinguish read-recency from
write-recency from this single signal. The math always reduces to:

    eligible iff lru_idle_secs(o) >= max(settle, min_idle)

Setting `settle=10, min_idle=120` is identical to the single-knob
`min_idle=120`. Tested every scenario I could construct (different
time scales, sweep cron sequences, heterogeneous workloads, forward-
compat with v2) — none give the dual surface operationally distinct
behavior in v1.
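
The reduction can be checked mechanically. The sketch below compares the two-check form against a single check on max(settle, min_idle) over a small range; all function names are hypothetical, and the idle value stands in for `lru_idle_secs(o)`.

```c
#include <assert.h>
#include <stdint.h>

/* Dual-knob form: both thresholds compared against the same idle metric. */
static int eligibleDualKnob(uint64_t idle, uint64_t settle, uint64_t min_idle) {
    return idle >= settle && idle >= min_idle;
}

/* Single-knob form. */
static int eligibleSingleKnob(uint64_t idle, uint64_t threshold) {
    return idle >= threshold;
}

static uint64_t maxU64(uint64_t a, uint64_t b) {
    return a > b ? a : b;
}

/* Exhaustively compares the two forms for all values in [0, limit] and
 * returns how many (idle, settle, min_idle) triples disagree. */
static unsigned countMismatches(uint64_t limit) {
    assert(limit <= 256);
    unsigned mismatches = 0;
    for (uint64_t idle = 0; idle <= limit; idle++)
        for (uint64_t settle = 0; settle <= limit; settle++)
            for (uint64_t min_idle = 0; min_idle <= limit; min_idle++)
                if (eligibleDualKnob(idle, settle, min_idle) !=
                    eligibleSingleKnob(idle, maxU64(settle, min_idle)))
                    mismatches++;
    return mismatches;
}
```

The equivalence holds because `a >= x && a >= y` is the same condition as `a >= max(x, y)` when both comparisons read the same value.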

The only non-trivial argument for keeping both was forward-compat: if
v2 adds a per-object write-time field, the dual knob becomes
meaningful. But adding a config in v2 is non-breaking; existing
operators on v1 see no change. Removing later is harder than adding
later.

Plus the dual surface is an active footgun: operators tuning the two
knobs differently expecting different effects get a confusing
no-difference outcome. PR #1 Thread #3 specifically pushed back on
"too many knobs" — that pressure applies here.

Per YAGNI, ship v1 with the single knob. v2 reintroduces a
write-time-specific knob non-breakingly when per-object write-time
tracking lands.

Code changes
------------

- src/server.h: remove `compression_settle_seconds` field.
- src/config.c: remove the `createIntConfig` registration.
- src/compression.c: drop the second `idle_secs >= settle` check
  in `compressionIsEligible`'s LRU branch. Updated the comment block
  to reflect single-signal reality.
- src/unit/test_compression_eligibility.cpp:
    - Drop `LruRejectsBetweenSettleAndMinIdle` (test of dual-knob
      max-wins behavior — no longer applicable).
    - Replace `LruRejectsRecentTouch` / `LruAcceptsBeyondBothThresholds`
      / `LruZeroThresholdsAcceptImmediately` with single-knob
      equivalents (`LruRejectsRecentTouch`, `LruAcceptsBeyondThreshold`,
      `LruAtThresholdAcceptsBoundary`, `LruZeroThresholdAcceptsImmediately`).
    - Drop `compression-settle-seconds` from `LfuTimeKnobsAreInactive`
      and rename to `LfuTimeKnobIsInactive`.
- tests/unit/type/compression.tcl: drop the
  `compression-settle-seconds` config-default assertion; update
  comment from "Advanced (11)" to "Advanced (10)".

Doc changes
-----------

- detailed-design.md §2.2 R2.2 predicate: hot_key_check helper now has
  one comparison in the LRU/noeviction branch instead of two. The
  rationale paragraph below the predicate explains the v1 single-
  signal reality and the YAGNI motivation for dropping the second
  knob; future v2 reintroduction noted.
- detailed-design.md §2.12 advanced config table: 11 → 10 knobs;
  `compression-settle-seconds` row removed; `compression-min-idle-
  seconds` description simplified.
- detailed-design.md §7.1 transparency-mode harness config: drop the
  `--compression-settle-seconds 0` line so the harness doesn't pass
  an unknown option.
- idea-honing.md Q6 baseline filter bullet: collapse the two-bullet
  LRU branch into a single bullet; add an _italicized rationale
  paragraph_ explaining why the second knob was dropped (preserves
  the historical thinking for future readers).
- idea-honing.md Q6 consolidated predicate: matches detailed-design.md.
- idea-honing.md Q6 config table: drop the `compression-settle-seconds`
  row.
- idea-honing.md Q6a answer: rewrite to reflect single-knob reality
  with reference to "S2.2 implementation review" so future readers
  can trace this refinement chain (Thread #18 → Thread #19 → S2.2
  refinement).
- idea-honing.md §7.1 harness config: drop the
  `--compression-settle-seconds 0` line.
- implementation/plan.md S2.2 description: simplify to "policy-aware
  hot-key skip" + the actual operator-facing knobs.
- summary.md: update the eligibility table row + walkthrough-
  highlights bullet to reflect the policy-aware single-knob outcome.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally
unchanged — they capture decisions at a point in time. The
walkthrough Thread #18/#19 resolutions stand as written; only the
implementation interpretation in the live design docs is refined.

Verified locally
----------------

- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes (the Tcl
  fixture's config-default assertion was updated in lockstep with the
  C-side removal, so the integration test catches any drift between
  src/config.c and tests/unit/type/compression.tcl).

Not verified locally (CI will validate):
- gtest unit tests (no libgtest-dev locally).

Test count delta
----------------
S2.2 gtest: 16 tests → 14 tests (dropped 2, simplified 2 to remove
the dual-knob exercise paths).
ikolomi added a commit that referenced this pull request May 27, 2026
* docs(design): align eligibility predicate; document policy-aware hot-key skip

Two corrections discovered while preparing S2.2 implementation:

1. The two design docs had drifted on minor wording (idea-honing.md
   said `obj->encoding == RAW` and `obj->refcount != SHARED`;
   detailed-design.md said `OBJ_ENCODING_RAW` / `OBJ_SHARED_REFCOUNT`).
   Predicates now match exactly across both docs, using the actual C
   constants from src/server.h. Per the agreed convention: keep the
   exact predicate inline in both docs (different audiences both need
   it readable in place) rather than a cross-reference.

2. The Thread #18/#19 walkthrough resolutions made a claim that
   doesn't match the existing Valkey source: "the existing LRU field
   already provides the signal for every eviction policy" / time-based
   checks "work uniformly across LRU, LFU, and noeviction" /
   `estimateObjectIdleTime()` is "a reliable read-hotness signal
   universally."

   Reading src/lrulfu.h:
     - LRU and noeviction policies: robj->lru is seconds-based.
       lru_idle_secs(o) returns real seconds. Time-based thresholds
       work as designed.
     - LFU policy: robj->lru encodes 16 bits "last eval time in
       minutes" + 8 bits approximate freq counter. There is no
       per-second access timestamp. lru_idle_secs(o) would
       misinterpret the bits. The function `estimateObjectIdleTime()`
       referenced by the design does not exist in current Valkey;
       the closest available helper is `lrulfu_getIdleness()` which
       returns `UINT8_MAX - freq` in LFU mode (a 0..255 freq-derived
       heuristic, NOT seconds).

   The fix is policy-aware checks, not policy-uniform:
     - LRU and noeviction: apply settle-seconds AND min-idle-seconds
       against lru_idle_secs(o).
     - LFU: skip the time-based thresholds entirely (the metric is
       wrong unit). Apply compression-lfu-threshold against the freq
       counter — already in the predicate as the LFU branch.

   The dual-knob operator surface (`compression-settle-seconds` and
   `compression-min-idle-seconds`) is preserved across modes; in
   LRU/noeviction both knobs apply to the same metric (since
   robj->lru is touched on every read AND write — v1 cannot
   distinguish source) so the effective threshold is `max(settle,
   min_idle)`. Operators get to express two intents.

   The Thread #18/#19 resolutions stand as written in DESIGN_TODO.md
   (audit trail; the decision to use the existing LRU field rather
   than add a new write-time field is unchanged); only the
   implementation interpretation in the live design docs is refined.

Updated:
  - detailed-design.md §2.2 R2.2 predicate + new explanatory
    paragraph below it.
  - detailed-design.md §2.12 config table: settle-seconds and
    min-idle-seconds descriptions now correctly note "Inactive in LFU
    mode."
  - idea-honing.md Q6 baseline filter bullet (rewritten as
    "policy-aware" with sub-bullets per policy).
  - idea-honing.md Q6 consolidated predicate (now identical to
    detailed-design.md §2.2).
  - idea-honing.md Q6a answer: rewritten with the policy-aware
    framing; cross-references "S2.2 implementation review" so future
    readers can trace this refinement.

* Inline compression: S2.2 — eligibility predicate

Implements R2.2 / Q6 `compressionIsEligible(robj *o, const sds key)`,
replacing the Phase 0 stub that returned 0 for every value.

The predicate has six gates, evaluated in cheapest-first order so the
master switch short-circuits early when the feature is disabled:

  1. Master switch (server.compression_enabled).
  2. Type + encoding gate. STRING values only (R2.2, Q6c). Of the four
     string encodings, only OBJ_ENCODING_RAW is a candidate; the other
     three are rejected:
       - INT      — already memory-optimal.
       - EMBSTR   — ≤44 B, header overhead erases any savings (Threads
                    #17 / #21 explicitly excluded).
       - COMPRESSED — defense-in-depth no-op for double-compress.
  3. Refcount gate. Shared RESP constants are never installed in a db
     (lookupKey asserts this); we mirror the assertion as a safety
     check.
  4. Size bounds. compression-min-value-size (lower) prevents wasting
     CPU on values too small to recoup the per-value header (~16 B).
     compression-max-value-size (upper, 0 = disabled) caps worst-case
     sync-decompression latency on the main thread (~1 µs/KB).
  5. Hot-key skip — POLICY-AWARE per the corrected R2.2:
       - LRU and noeviction: robj->lru is seconds-based. Apply both
         compression-settle-seconds (recent-write proxy) and
         compression-min-idle-seconds (recent-read proxy) against
         lru_getIdleSecs(o->lru). v1 cannot distinguish source
         (robj->lru is touched on read AND write); the dual surface
         lets operators express two intents that share an underlying
         signal. Effective threshold is max(settle, min_idle).
       - LFU: robj->lru encodes a freq counter (no per-second
         timestamp). Time-based knobs are inactive in this mode.
         Apply compression-lfu-threshold against the freq counter via
         lfu_getFrequency(), which mirrors the standard Valkey
         decay-on-read pattern (objectGetLFUFrequency in src/object.c).
         The signature is `robj *o` (not `const robj *`) because of
         this in-place decay.
  6. Incompressible-keys retry guard. Stubbed as "always retry-eligible"
     in S2.2; S2.3 lands the side hashtable and wires
     compressionRetryEligible(key) here.
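
The gate ordering above can be sketched as follows. This is an illustration against hypothetical stand-in types, not the code in src/compression.c: the `sketch_*` structs, the `SKETCH_SHARED_REFCOUNT` sentinel, and the inclusive size bounds (exact min and exact max accepted) are all assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real Valkey definitions. */
enum sketch_type { SKETCH_STRING, SKETCH_OTHER };
enum sketch_encoding { SKETCH_RAW, SKETCH_INT, SKETCH_EMBSTR, SKETCH_COMPRESSED };
#define SKETCH_SHARED_REFCOUNT INT_MAX /* assumed sentinel for shared objects */

struct sketch_obj {
    enum sketch_type type;
    enum sketch_encoding encoding;
    int refcount;
    size_t len;         /* value length in bytes */
    uint64_t idle_secs; /* consulted only in LRU/noeviction mode */
    uint8_t lfu_freq;   /* consulted only in LFU mode */
};

struct sketch_cfg {
    int enabled;
    int lfu_mode;
    size_t min_value_size;
    size_t max_value_size; /* 0 disables the upper bound */
    uint64_t settle_seconds;
    uint64_t min_idle_seconds;
    uint8_t lfu_threshold;
};

/* Gate 6 placeholder: always retry-eligible, as stubbed in S2.2. */
int sketch_retry_eligible(void) { return 1; }

/* Returns 1 if the value passes all six gates, 0 otherwise. */
int sketch_is_eligible(const struct sketch_cfg *c, const struct sketch_obj *o) {
    if (!c->enabled) return 0;                                           /* 1 */
    if (o->type != SKETCH_STRING || o->encoding != SKETCH_RAW) return 0; /* 2 */
    if (o->refcount == SKETCH_SHARED_REFCOUNT) return 0;                 /* 3 */
    if (o->len < c->min_value_size) return 0;                            /* 4 */
    if (c->max_value_size != 0 && o->len > c->max_value_size) return 0;
    if (c->lfu_mode) {                                                   /* 5 */
        if (o->lfu_freq >= c->lfu_threshold) return 0; /* time knobs unused */
    } else {
        if (o->idle_secs < c->settle_seconds) return 0;
        if (o->idle_secs < c->min_idle_seconds) return 0;
    }
    return sketch_retry_eligible();                                      /* 6 */
}
```

With `settle_seconds = 10` and `min_idle_seconds = 120` in LRU mode, a value idle for 60 seconds is rejected and one idle for 300 seconds is accepted. In LFU mode the same value is accepted whenever its counter is below the threshold, regardless of the idle time.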

Test coverage in src/unit/test_compression_eligibility.cpp (16 tests,
auto-discovered by src/unit/Makefile's `wildcard *.cpp`):

  - Master switch off / on.
  - Each rejection branch:
      * non-STRING type
      * INT, EMBSTR, COMPRESSED encodings
      * shared refcount
      * size below min, above max
  - Size-bound boundary cases (at exact min, at exact max,
    max=0 disables upper bound).
  - LRU branch:
      * recent touch (idle < settle) — rejected
      * idle between settle and min_idle (max wins) — rejected
      * cold key (idle >= max(settle, min_idle)) — accepted
      * zero thresholds — accept immediately
  - noeviction policy: same code path as LRU per R2.2.
  - LFU branch:
      * freq at threshold — rejected (>= comparison)
      * freq above threshold — rejected
      * freq below threshold — accepted
      * time-based knobs are inactive even at INT_MAX values.

The fixture saves and restores `server.compression_*` and
`server.maxmemory_policy`, and re-syncs lrulfu's cached
`is_using_lfu_policy` boolean via `lrulfu_updateClockAndPolicy()` on
both setup and teardown so tests don't leak policy state into each
other.

Verified locally:
  - `make -j` builds clean.
  - `./runtest --single unit/type/compression` 10/10 passes (the Phase 0
    integration fixture exercises feature-off semantics; with
    compression-enabled still 0 by default, eligibility is never
    consulted in the integration server).

Not verified locally (CI will validate):
  - gtest unit tests on Linux/macOS/32-bit (no libgtest-dev locally).

S2.3 (incompressible-keys hashtable) wires the last branch and ships
its own gtest coverage. After that, S2.4–S2.10 wire the rest of the
hot path.

* docs(design): rewrite eligibility predicate's hot-key check in branched form

Per review feedback during S2.2 review: the previous form encoded the
policy split using short-circuit booleans —

  && (lfu_mode  || lru_idle_secs(obj) >= compression-settle-seconds)
  && (lfu_mode  || lru_idle_secs(obj) >= compression-min-idle-seconds)
  && (!lfu_mode || lfu_freq(obj) < compression-lfu-threshold)

— which is logically correct but reads awkwardly. Three lines mention
`lfu_mode` (twice plain, once negated); the reader has to mentally
short-circuit twice to see that lines 1 and 2 fire only in LRU/noeviction
and line 3 fires only in LFU. It also looks at first glance like the
predicate might be using `compression-lfu-threshold` as an LRU-mode
threshold.

Replaced with a branched helper that mirrors the if/else branches of
the C implementation:

  && hot_key_check(obj)                       // policy-aware

  where hot_key_check(obj) is:
      if lfu_mode:
          lfu_freq(obj) < compression-lfu-threshold
      else:
          lru_idle_secs(obj) >= compression-settle-seconds
          AND lru_idle_secs(obj) >= compression-min-idle-seconds

Same behavior; the implementation in src/compression.c
(`compressionIsEligible()`) already uses this exact branching shape —
the docs now match it visually.
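
The "same behavior" claim can be checked mechanically. The sketch below writes both forms as plain C functions over integer inputs and counts disagreements across a small exhaustive grid; the function names and the grid bound are illustrative choices, not part of src/compression.c.

```c
#include <assert.h>
#include <stdint.h>

/* Previous documented form: three short-circuit clauses. */
int sketch_hot_key_flat(int lfu_mode, uint32_t idle, uint32_t freq,
                        uint32_t settle, uint32_t min_idle, uint32_t lfu_thr) {
    return (lfu_mode || idle >= settle)
        && (lfu_mode || idle >= min_idle)
        && (!lfu_mode || freq < lfu_thr);
}

/* Branched form: one branch per eviction-policy family. */
int sketch_hot_key_branched(int lfu_mode, uint32_t idle, uint32_t freq,
                            uint32_t settle, uint32_t min_idle, uint32_t lfu_thr) {
    if (lfu_mode) return freq < lfu_thr;
    return idle >= settle && idle >= min_idle;
}

/* Count inputs on which the two forms disagree, with every numeric
 * argument ranging over 0..limit-1 and lfu_mode over {0, 1}. */
long sketch_count_mismatches(uint32_t limit) {
    long mismatches = 0;
    for (int lfu = 0; lfu <= 1; lfu++)
        for (uint32_t idle = 0; idle < limit; idle++)
            for (uint32_t freq = 0; freq < limit; freq++)
                for (uint32_t settle = 0; settle < limit; settle++)
                    for (uint32_t min_idle = 0; min_idle < limit; min_idle++)
                        for (uint32_t thr = 0; thr < limit; thr++)
                            if (sketch_hot_key_flat(lfu, idle, freq, settle, min_idle, thr) !=
                                sketch_hot_key_branched(lfu, idle, freq, settle, min_idle, thr))
                                mismatches++;
    return mismatches;
}
```

Calling `sketch_count_mismatches(6)` walks 15,552 input combinations and returns 0, which matches the Boolean identity: when `lfu_mode` is true the first two clauses are vacuously true, and when it is false the third one is.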

Updated:
  - detailed-design.md §2.2 R2.2 predicate.
  - idea-honing.md Q6 consolidated predicate.

Both docs were already aligned (per the previous predicate-alignment
commit); they remain identical with this rewrite. The explanatory
paragraph below the §2.2 predicate (LRU vs LFU lru-field encoding)
already covers the rationale and is unchanged.

* Inline compression: drop compression-settle-seconds knob (YAGNI)

Removes one of the two time-based hot-key skip knobs. The eligibility
predicate's LRU/noeviction branch now compares against a single
threshold (`compression-min-idle-seconds`) instead of two.

Why this matters
----------------

The dual-knob surface — `compression-settle-seconds` ("recent-write
protection") and `compression-min-idle-seconds` ("recent-access
protection") — was introduced via the Thread #18 walkthrough resolution
on the theory that operators benefit from being able to express two
different intents.

In v1 reality this is documentation theater. Both knobs compare against
the same metric (`lru_idle_secs(o)`) because Valkey's `robj->lru` field
is touched on every read AND every write — gated only by
`LOOKUP_NOTOUCH` and fork. v1 cannot distinguish read-recency from
write-recency from this single signal. The math always reduces to:

    eligible iff lru_idle_secs(o) >= max(settle, min_idle)

Setting `settle=10, min_idle=120` is identical to the single-knob
`min_idle=120`. Tested every scenario I could construct (different
time scales, sweep cron sequences, heterogeneous workloads, forward-
compat with v2) — none give the dual surface operationally distinct
behavior in v1.
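
The reduction can be illustrated with a small sketch that compares the dual-knob check against a single check at `max(settle, min_idle)` for every idle value in a range. The function names are hypothetical, and the idle time is modeled as a plain integer and not read from `robj->lru`.

```c
#include <assert.h>
#include <stdint.h>

/* Dual-knob form: both thresholds compared against the same idle time. */
int sketch_eligible_dual(uint64_t idle, uint64_t settle, uint64_t min_idle) {
    return idle >= settle && idle >= min_idle;
}

/* Single-knob form. */
int sketch_eligible_single(uint64_t idle, uint64_t threshold) {
    return idle >= threshold;
}

uint64_t sketch_max_u64(uint64_t a, uint64_t b) {
    return a > b ? a : b;
}

/* Returns 1 if the two forms agree for every idle value in [0, idle_limit]
 * when the single threshold is max(settle, min_idle), 0 otherwise. */
int sketch_forms_agree(uint64_t settle, uint64_t min_idle, uint64_t idle_limit) {
    uint64_t threshold = sketch_max_u64(settle, min_idle);
    for (uint64_t idle = 0; idle <= idle_limit; idle++) {
        if (sketch_eligible_dual(idle, settle, min_idle) !=
            sketch_eligible_single(idle, threshold)) {
            return 0;
        }
    }
    return 1;
}
```

For example, `sketch_forms_agree(10, 120, 300)` returns 1: with `settle=10, min_idle=120` the dual form accepts exactly the idle values that a lone `min_idle=120` accepts.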

The only non-trivial argument for keeping both was forward-compat: if
v2 adds a per-object write-time field, the dual knob becomes
meaningful. But adding a config in v2 is non-breaking; existing
operators on v1 see no change. Removing later is harder than adding
later.

Plus the dual surface is an active footgun: operators tuning the two
knobs differently expecting different effects get a confusing
no-difference outcome. PR #1 Thread #3 specifically pushed back on
"too many knobs" — that pressure applies here.

Per YAGNI, ship v1 with the single knob. v2 reintroduces a
write-time-specific knob non-breakingly when per-object write-time
tracking lands.

Code changes
------------

- src/server.h: remove `compression_settle_seconds` field.
- src/config.c: remove the `createIntConfig` registration.
- src/compression.c: drop the second `idle_secs >= settle` check
  in `compressionIsEligible`'s LRU branch. Updated the comment block
  to reflect single-signal reality.
- src/unit/test_compression_eligibility.cpp:
    - Drop `LruRejectsBetweenSettleAndMinIdle` (test of dual-knob
      max-wins behavior — no longer applicable).
    - Replace `LruRejectsRecentTouch` / `LruAcceptsBeyondBothThresholds`
      / `LruZeroThresholdsAcceptImmediately` with single-knob
      equivalents (`LruRejectsRecentTouch`, `LruAcceptsBeyondThreshold`,
      `LruAtThresholdAcceptsBoundary`, `LruZeroThresholdAcceptsImmediately`).
    - Drop `compression-settle-seconds` from `LfuTimeKnobsAreInactive`
      and rename to `LfuTimeKnobIsInactive`.
- tests/unit/type/compression.tcl: drop the
  `compression-settle-seconds` config-default assertion; update
  comment from "Advanced (11)" to "Advanced (10)".

Doc changes
-----------

- detailed-design.md §2.2 R2.2 predicate: hot_key_check helper now has
  one comparison in the LRU/noeviction branch instead of two. The
  rationale paragraph below the predicate explains the v1 single-
  signal reality and the YAGNI motivation for dropping the second
  knob; future v2 reintroduction noted.
- detailed-design.md §2.12 advanced config table: 11 → 10 knobs;
  `compression-settle-seconds` row removed; `compression-min-idle-
  seconds` description simplified.
- detailed-design.md §7.1 transparency-mode harness config: drop the
  `--compression-settle-seconds 0` line so the harness doesn't pass
  an unknown option.
- idea-honing.md Q6 baseline filter bullet: collapse the two-bullet
  LRU branch into a single bullet; add an _italicized rationale
  paragraph_ explaining why the second knob was dropped (preserves
  the historical thinking for future readers).
- idea-honing.md Q6 consolidated predicate: matches detailed-design.md.
- idea-honing.md Q6 config table: drop the `compression-settle-seconds`
  row.
- idea-honing.md Q6a answer: rewrite to reflect single-knob reality
  with reference to "S2.2 implementation review" so future readers
  can trace this refinement chain (Thread #18 → Thread #19 → S2.2
  refinement).
- idea-honing.md §7.1 harness config: drop the
  `--compression-settle-seconds 0` line.
- implementation/plan.md S2.2 description: simplify to "policy-aware
  hot-key skip" + the actual operator-facing knobs.
- summary.md: update the eligibility table row + walkthrough-
  highlights bullet to reflect the policy-aware single-knob outcome.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally
unchanged — they capture decisions at a point in time. The
walkthrough Thread #18/#19 resolutions stand as written; only the
implementation interpretation in the live design docs is refined.

Verified locally
----------------

- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes (the Tcl
  fixture's config-default assertion was updated in lockstep with the
  C-side removal, so the integration test catches any drift between
  src/config.c and tests/unit/type/compression.tcl).

Not verified locally (CI will validate):
- gtest unit tests (no libgtest-dev locally).

Test count delta
----------------
S2.2 gtest: 16 tests → 14 tests (dropped 2, simplified 2 to remove
the dual-knob exercise paths).

* Cleanup: untrack proposal-issue.md; mark S2.2 complete in plan.md

Two small fixes to the previous commit's collateral:

1. proposal-issue.md was inadvertently committed via `git add -A` in
   the previous commit. The file is a working draft of the upstream
   issue (already tracked in the valkey-io issue tracker) and doesn't
   belong in the planning directory. Removing.

2. plan.md still showed S2.2 as `[ ]`. Marked it `[x]`, matching the
   S2.1 convention (`[x]` once the task ships); on merge to unstable
   the marking becomes definitive.