Skip to content

Inline compression: drop dict-scoped retry guard (S2.3 superseded)#10

Merged
ikolomi merged 1 commit into
unstablefrom
ikolomi/drop-incompressible-keys
May 28, 2026
Merged

Inline compression: drop dict-scoped retry guard (S2.3 superseded)#10
ikolomi merged 1 commit into
unstablefrom
ikolomi/drop-incompressible-keys

Conversation

@ikolomi

@ikolomi ikolomi commented May 28, 2026

Copy link
Copy Markdown
Owner

Drops the incompressibleKeys{} side hashtable design from S2.3 and the compression-retry-interval config knob; refines the existing drift signal in R2.3.5 to include rejected attempts in the compression_live_ratio_10m rolling average. No new config knob.

Why this matters

Under a fixed dict and a fixed workload, the probability that any given write at key K produces compressible bytes is constant — time- based throttling between retries doesn't shift this probability, only spaces attempts. CPU bounding is already provided at the sweep level (compression-sweep-max-cpu-pct); per-key throttling would be redundant and would add correctness burden (every mutating code path — dbOverwrite, APPEND, SETRANGE, BITOP, BITFIELD, module DMA writes — would need a Clear integration point to avoid suppressing compression of legitimately-changed values).

The right trigger for "the rejection might now compress" is dict change, which is handled at the system level. Updated definition: compression_live_ratio_10m is now the rolling ratio over compression attempts (successful + rejected) in the 10 min window, weighted by uncompressed bytes. Each successful compression contributes its actual compressed/uncompressed ratio. Each rejection contributes its actual measured ratio (typically in [0.9, 1.05] since the net-savings guard rejected it for being too close to 1.0). Worker errors are excluded — they're tracked separately via compression_errors_total.

Both forms of degradation feed the metric uniformly:

  • workload-content drift among compressible values, and
  • workload composition drift (more values compressing poorly enough to fail the net-savings guard).

A single drift trigger threshold catches both scenarios. Operators reason about ONE metric; the design has ONE config knob (compression-dict-drift-ratio) for the threshold.

This is a deliberate simplification over an earlier design (PR #10) which proposed a per-key incompressibleKeys side hashtable with compression-retry-interval time fallback. That approach added a module + a config knob + integration points in every mutating code path, but the underlying assumption (waiting between retries improves the per-attempt hit rate) is incorrect for fixed-distribution workloads under a fixed dict. The simpler model — retry on every sweep tick, let the drift signal handle systemic dict-fit issues — is correct and operationally cleaner.

Drift comparison direction fixed

R2.3.5's existing trigger had the comparison backwards. The ratio convention is compressed/uncompressed (lower = more savings, per R2.10.1). The previous formulation —

fires when live_ratio < drift_ratio × post_training_ratio

— fires when live is BETTER (lower) than 70% of post-training, i.e., when compression IMPROVES over post-training. That's not drift.

Corrected to:

fires when live_ratio > post_training_ratio / drift_ratio

With drift_ratio = 0.7 (70%): fires when live exceeds post-training by a factor of 1/0.7 ≈ 1.43 — i.e., compression efficiency has degraded by about 43%. Default value preserved; only the comparison direction changed.

Why "real measured ratio" not "1.0" for rejections --------------------------------------------------

An earlier draft of this commit substituted ratio = 1.0 for every rejected attempt, on the theory that we store such values uncompressed so the effective storage savings is zero. That was reviewed and revised: the metric's purpose is drift detection (workload- compressibility under this dict), not effective-storage tracking. A rejected value at 0.91 (just above the 90% net-savings threshold) and a truly incompressible value at 1.05 carry different information about dict-workload fit. Substituting 1.0 for both throws away that distinction; using the actual measured ratio preserves it. The data is right there — the worker reports the ratio when it returns, we just record what was measured.

Code changes

  • src/compression.c (compressionIsEligible):
    • Drop branch 6 (the S2.2-stub "always retry-eligible" comment + the S2.3-anticipated lookup that never landed).
    • Drop the key parameter from the signature — no longer needed now that no branch consumes it. Trivial to re-add when a future branch (e.g., S2.7 write-path hook) wants the key for logging.
    • Update doc comment to reflect the simpler predicate.
  • src/compression.h: matching signature change.
  • src/server.h: drop compression_retry_interval field.
  • src/config.c: drop compression-retry-interval registration. Advanced knobs count goes 10 → 9.
  • src/unit/test_compression_eligibility.cpp: drop key parameter from all test calls.
  • tests/unit/type/compression.tcl: drop the compression-retry-interval config-default assertion; update comment from "Advanced (10)" to "Advanced (9)".

Doc changes

  • detailed-design.md §2.2:
    • Predicate: drop retry_eligible(obj) line.
    • Rationale paragraph rewritten to use the probability argument: time-throttling doesn't change per-attempt success probability, only spaces attempts; sweep pacing already bounds CPU. Per-key tracking is functionless and adds integration burden.
  • detailed-design.md §2.3 R2.3.5 drift triggers:
    • Single trigger formulation, comparison direction fixed (< → >). Definition of compression_live_ratio_10m made explicit: rolling ratio over attempts (successful + rejected) using actual measured ratios; worker errors excluded.
  • detailed-design.md §2.12 advanced config table:
    • compression-retry-interval removed (10 → 9).
    • compression-dict-drift-ratio description updated to reflect corrected formula.
  • detailed-design.md §6.6 Net-savings guard failure: rewritten with the probability-argument rationale and real-ratio framing.
  • detailed-design.md §7.1 transparency-mode harness config: drop --compression-retry-interval 0 line.
  • idea-honing.md Q6 baseline filter "Skip post-compression" bullet: matching rewrite (probability argument, real-ratio framing).
  • idea-honing.md Q6 consolidated predicate: drop retry_eligible(obj) line and the where retry_eligible(obj) is... block.
  • idea-honing.md Q6 config table: drop compression-retry-interval row.
  • idea-honing.md Q6b answer: matching rewrite with reference to the S2.3 implementation review (PR Inline compression: drop dict-scoped retry guard (S2.3 superseded) #10 design discussion) so the trace is preserved.
  • idea-honing.md §7.1 harness: drop the same flag.
  • summary.md: update eligibility row + walkthrough-highlights bullet to note the post-walkthrough refinement.
  • plan.md S2 row description: drop "incompressible-keys hashtable" from S2's scope summary.
  • plan.md S2.2: drop the trailing "incompressible-keys hashtable check" claim.
  • plan.md S2.3: marked "(REMOVED)" with explanation; struck-through.
  • plan.md S1.4: extended description to include the rolling-ratio semantics extension (rejections contribute their actual measured ratio) — the drift-detection work absorbed from removed S2.3.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally unchanged — Thread #20's resolution stands as written; only the implementation interpretation in the live design docs is refined.

Branch ikolomi/s2-incompressible (the abandoned S2.3 implementation) should be closed/deleted on GitHub.

Verified locally

  • make -j builds clean.
  • ./runtest --single unit/type/compression 10/10 passes.

Not verified locally (CI will validate):

  • gtest unit tests on Linux/macOS/32-bit (no libgtest-dev locally).

Drops the `incompressibleKeys{}` side hashtable design from S2.3 and
the `compression-retry-interval` config knob; refines the existing
drift signal in R2.3.5 to include rejected attempts in the
`compression_live_ratio_10m` rolling average. No new config knob.

Why this matters
----------------

Under a fixed dict and a fixed workload, the probability that any
given write at key K produces compressible bytes is constant — time-
based throttling between retries doesn't shift this probability, only
spaces attempts. CPU bounding is already provided at the sweep level
(`compression-sweep-max-cpu-pct`); per-key throttling would be
redundant and would add correctness burden (every mutating code path
— `dbOverwrite`, APPEND, SETRANGE, BITOP, BITFIELD, module DMA writes
— would need a `Clear` integration point to avoid suppressing
compression of legitimately-changed values).

The right trigger for "the rejection might now compress" is **dict
change**, which is handled at the system level. Updated definition:
`compression_live_ratio_10m` is now the rolling ratio over compression
*attempts* (successful + rejected) in the 10 min window, weighted by
uncompressed bytes. Each successful compression contributes its actual
`compressed/uncompressed` ratio. Each rejection contributes its actual
measured ratio (typically in `[0.9, 1.05]` since the net-savings guard
rejected it for being too close to 1.0). Worker errors are excluded —
they're tracked separately via `compression_errors_total`.

Both forms of degradation feed the metric uniformly:
  - workload-content drift among compressible values, and
  - workload composition drift (more values compressing poorly enough
    to fail the net-savings guard).

A single drift trigger threshold catches both scenarios. Operators
reason about ONE metric; the design has ONE config knob
(`compression-dict-drift-ratio`) for the threshold.

This is a deliberate simplification over an earlier design (PR #10)
which proposed a per-key `incompressibleKeys` side hashtable with
`compression-retry-interval` time fallback. That approach added a
module + a config knob + integration points in every mutating code
path, but the underlying assumption (waiting between retries
improves the per-attempt hit rate) is incorrect for fixed-distribution
workloads under a fixed dict. The simpler model — retry on every
sweep tick, let the drift signal handle systemic dict-fit issues — is
correct and operationally cleaner.

Drift comparison direction fixed
--------------------------------

R2.3.5's existing trigger had the comparison backwards. The ratio
convention is `compressed/uncompressed` (lower = more savings, per
R2.10.1). The previous formulation —

    fires when live_ratio < drift_ratio × post_training_ratio

— fires when live is BETTER (lower) than 70% of post-training, i.e.,
when compression IMPROVES over post-training. That's not drift.

Corrected to:

    fires when live_ratio > post_training_ratio / drift_ratio

With drift_ratio = 0.7 (70%): fires when live exceeds post-training
by a factor of 1/0.7 ≈ 1.43 — i.e., compression efficiency has
degraded by about 43%. Default value preserved; only the comparison
direction changed.

Why "real measured ratio" not "1.0" for rejections
--------------------------------------------------

An earlier draft of this commit substituted `ratio = 1.0` for every
rejected attempt, on the theory that we store such values uncompressed
so the effective storage savings is zero. That was reviewed and
revised: the metric's purpose is **drift detection** (workload-
compressibility under this dict), not effective-storage tracking. A
rejected value at 0.91 (just above the 90% net-savings threshold) and
a truly incompressible value at 1.05 carry different information about
dict-workload fit. Substituting 1.0 for both throws away that
distinction; using the actual measured ratio preserves it. The data
is right there — the worker reports the ratio when it returns, we
just record what was measured.

Code changes
------------

- src/compression.c (`compressionIsEligible`):
    - Drop branch 6 (the S2.2-stub "always retry-eligible" comment +
      the S2.3-anticipated lookup that never landed).
    - Drop the `key` parameter from the signature — no longer needed
      now that no branch consumes it. Trivial to re-add when a future
      branch (e.g., S2.7 write-path hook) wants the key for logging.
    - Update doc comment to reflect the simpler predicate.
- src/compression.h: matching signature change.
- src/server.h: drop `compression_retry_interval` field.
- src/config.c: drop `compression-retry-interval` registration.
  Advanced knobs count goes 10 → 9.
- src/unit/test_compression_eligibility.cpp: drop `key` parameter
  from all test calls.
- tests/unit/type/compression.tcl: drop the
  `compression-retry-interval` config-default assertion; update
  comment from "Advanced (10)" to "Advanced (9)".

Doc changes
-----------

- detailed-design.md §2.2:
    - Predicate: drop `retry_eligible(obj)` line.
    - Rationale paragraph rewritten to use the probability argument:
      time-throttling doesn't change per-attempt success probability,
      only spaces attempts; sweep pacing already bounds CPU. Per-key
      tracking is functionless and adds integration burden.
- detailed-design.md §2.3 R2.3.5 drift triggers:
    - Single trigger formulation, comparison direction fixed
      (< → >). Definition of `compression_live_ratio_10m` made
      explicit: rolling ratio over attempts (successful + rejected)
      using actual measured ratios; worker errors excluded.
- detailed-design.md §2.12 advanced config table:
    - `compression-retry-interval` removed (10 → 9).
    - `compression-dict-drift-ratio` description updated to reflect
      corrected formula.
- detailed-design.md §6.6 Net-savings guard failure: rewritten with
  the probability-argument rationale and real-ratio framing.
- detailed-design.md §7.1 transparency-mode harness config: drop
  `--compression-retry-interval 0` line.
- idea-honing.md Q6 baseline filter "Skip post-compression" bullet:
  matching rewrite (probability argument, real-ratio framing).
- idea-honing.md Q6 consolidated predicate: drop `retry_eligible(obj)`
  line and the `where retry_eligible(obj) is...` block.
- idea-honing.md Q6 config table: drop `compression-retry-interval`
  row.
- idea-honing.md Q6b answer: matching rewrite with reference to the
  S2.3 implementation review (PR #10 design discussion) so the trace
  is preserved.
- idea-honing.md §7.1 harness: drop the same flag.
- summary.md: update eligibility row + walkthrough-highlights bullet
  to note the post-walkthrough refinement.
- plan.md S2 row description: drop "incompressible-keys hashtable"
  from S2's scope summary.
- plan.md S2.2: drop the trailing "incompressible-keys hashtable
  check" claim.
- plan.md S2.3: marked "(REMOVED)" with explanation; struck-through.
- plan.md S1.4: extended description to include the rolling-ratio
  semantics extension (rejections contribute their actual measured
  ratio) — the drift-detection work absorbed from removed S2.3.

Audit-trail files (DESIGN_TODO.md, pr-feedback.json) intentionally
unchanged — Thread #20's resolution stands as written; only the
implementation interpretation in the live design docs is refined.

Branch ikolomi/s2-incompressible (the abandoned S2.3 implementation)
should be closed/deleted on GitHub.

Verified locally
----------------

- `make -j` builds clean.
- `./runtest --single unit/type/compression` 10/10 passes.

Not verified locally (CI will validate):
- gtest unit tests on Linux/macOS/32-bit (no libgtest-dev locally).
@ikolomi ikolomi force-pushed the ikolomi/drop-incompressible-keys branch from b412122 to 0bb3556 Compare May 28, 2026 17:34
@ikolomi ikolomi merged commit a6fa453 into unstable May 28, 2026
79 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant