Skip to content

[S2.8] Read-path skeleton: LOOKUP_READ_BYTES flag + beforeSleep hook (1/3)#22

Merged
ikolomi merged 3 commits into
unstablefrom
ikolomi/s2-read-path-skeleton
Jun 8, 2026
Merged

[S2.8] Read-path skeleton: LOOKUP_READ_BYTES flag + beforeSleep hook (1/3)#22
ikolomi merged 3 commits into
unstablefrom
ikolomi/s2-read-path-skeleton

Conversation

@ikolomi

@ikolomi ikolomi commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Summary

PR 1 of 3 in the S2.8 split (read-path hook for the transient-decompression model — R2.5.7 + Appendix E).

This CR ships the skeleton only: the LOOKUP_READ_BYTES flag declaration, the compressionBeforeSleep() lifecycle hook (paired with compressionAfterSleep()), and the beforeSleep wiring. No behavior changecompressionBeforeSleep() is an empty stub today.

The intent is to vet the API surface and event-loop hook in a small, focused review before the substantive logic ships in PR 2.

What's in this CR

File Change
src/server.h Declare LOOKUP_READ_BYTES (1 << 5) next to the existing LOOKUP_* flags. Documented semantics: caller intends to read value bytes; the lookup helper transparently decompresses if the value is compressed. Callers that only check existence or read metadata (OBJECT ENCODING, DEBUG OBJECT, eviction sampler) MUST omit the flag.
src/compression.h Declare compressionBeforeSleep() next to compressionAfterSleep(). Symmetric naming — umbrella for "compression work at the beforeSleep boundary."
src/compression.c Empty compressionBeforeSleep() with a TODO(S2.8-activate) marker describing PR 2's activation work.
src/server.c Call compressionBeforeSleep() early in beforeSleep, mirroring where compressionAfterSleep() lives in afterSleep.

Net diff: 4 files, +43 lines.

What's NOT in this CR

  • Side-map data structure + management (the per-server transient-view registry) — PR 2.
  • Actual decompression call from inside lookupKey — PR 2.
  • LOOKUP_READ_BYTES annotations on the ~30 byte-reading lookupKey* callers — PR 2.
  • Explicit objectGetUncompressedView calls in rewriteAppendOnlyFileRio (AOF rewrite child) and the RDB save path used for replication full-sync (R2.6.8) — PR 3.

The split rationale is documented in the SESSION_CHECKPOINT for this work; option C (3-way split) was chosen to keep each CR small.

Verified

  • make -j2 -C src SERVER_CFLAGS=-Werror → clean (no warnings).
  • ./runtest --single unit/type/compression → 10/10 pass.

Naming choice — compressionBeforeSleep() vs. compressionRestoreTouchedKeys()

Briefly considered the more descriptive name compressionRestoreTouchedKeys(), but went with compressionBeforeSleep() for symmetry with compressionAfterSleep(). The lifecycle-hook name keeps the public API stable as future work (drift-trigger, training submission) potentially absorbs the same hook; descriptive helper names can live as static functions inside compression.c if the body grows large enough to warrant it.

Skeleton CR for the S2.8 transient-decompression model (R2.5.7 +
Appendix E in detailed-design.md). Establishes the API surface and
lifecycle hook; no behavior change yet.

Changes:
- src/server.h: declare LOOKUP_READ_BYTES (1 << 5). Caller passes it
  on lookupKey* when it intends to read value bytes; lookups that
  only check existence or read encoding metadata (OBJECT ENCODING,
  DEBUG OBJECT, eviction sampler) MUST omit the flag so the
  truthful "compressed" encoding stays visible.
- src/compression.h: declare compressionBeforeSleep() — symmetric to
  compressionAfterSleep(). Umbrella for "compression work at
  beforeSleep time"; today only restoration, future-proof for
  drift-trigger / training-submission work that may want the same
  hook.
- src/compression.c: empty compressionBeforeSleep() with TODO(S2.8-
  activate) marker describing the activation work that lands in PR 2
  of the S2.8 split.
- src/server.c: wire compressionBeforeSleep() early in beforeSleep,
  after trySendPollJobToIOThreads (mirror to where compressionAfter
  Sleep lives in afterSleep).

Out of scope for this CR (PR 1 of 3):
- Side-map data structure + management — lands in PR 2 (activate)
  alongside the actual decompression call from inside lookupKey.
- LOOKUP_READ_BYTES flag annotations on the ~30 byte-reading lookup
  callers — also PR 2.
- Explicit objectGetUncompressedView calls in AOF rewrite child
  (rewriteAppendOnlyFileRio) and RDB save for replication full-sync
  (R2.6.8) — PR 3.

Verified locally:
  make -j2 -C src SERVER_CFLAGS=-Werror   → clean
  ./runtest --single unit/type/compression → 10/10 pass
@ikolomi ikolomi requested a review from GilboaAWS June 8, 2026 15:45
ikolomi added 2 commits June 8, 2026 19:08
clang-format-18 reformatted my multi-line `\`-continuation comment
into a hard-to-read indented form. Switched to a block comment
above the `#define` — clang-format leaves it alone, reads better.

No functional change.
Adds a deferred-pointer-capture sub-section to Appendix E covering
sites that stash a pointer derived from val_ptr into a structure
that survives across event-loop boundaries. The original Appendix E
sweep only covered synchronous byte readers (sites matching
sdslen(objectGetVal(o))) and missed deferred-capture sites.

Findings:

- One unsafe site in core: _addBulkStrRefToBufferOrList (the bulk-
  reply zero-copy path used by tryAvoidBulkStrCopyToReply). Captures
  bulkStrRef = {.obj, .str = objectGetVal(obj)} for the IO thread to
  dereference at write time. Under naive transient view this is a
  use-after-free at IO-thread write time when beforeSleep frees the
  temp sds before the IO thread runs.

- Module API (VM_StringPtrLen/StringDMA/RetainKey) is governed by a
  pre-existing API contract ("pointer valid until next yield");
  modules respecting the contract are safe.

- All other deferred-capture-shaped sites (replication feed, AOF
  feed, addReplyBulkSds/CBuffer/CString, dumpCommand, migrateCommand,
  AOF rewrite child, RDB save, lazy free) are confirmed safe — they
  either copy bytes or operate on argv/synthetic robjs.

Resolution (lands in PR 2 of S2.8 split):

- Add transientViewActive(obj) — O(1) side-map presence check.
- Extend isCopyAvoidPreferred() to return 0 when transientViewActive
  is true. Forces the memcpy reply path for transient values; bytes
  land in c->reply (independent of val_ptr), safe to free temp sds
  at beforeSleep. Cost: one memcpy bounded by compression-max-value-
  size (128 KiB → ~30 µs).

R2.5.7 in §2.5 gets a one-paragraph pointer to E.7. The TODO comment
in compressionBeforeSleep() is extended to call out the force-copy
work as a PR 2 requirement.

Skeleton CR (this branch) is unaffected — compressionBeforeSleep is
still a no-op stub; no robj is ever in transient state today.
@ikolomi ikolomi merged commit 2297e27 into unstable Jun 8, 2026
79 of 80 checks passed
ikolomi added a commit that referenced this pull request Jun 9, 2026
The S2.8 forward-reference in plan.md previously named the
out-of-process explicit-decompression work as 'PR 3 of 3 — NEXT'.
This PR closes that out: PR #24 IS that work. Update the entry to
reference PR #24 and capture the as-implemented scope (which paths,
which deferred to S3.1/S3.2 for R2.6.1 disk-RDB compressed encoding,
which already covered by PR #23's transient-view hook).

S2.8 (Read-path hook) is now fully landed across PRs #21, #22, #23,
and #24. Next on the @ikolomi track: S2.9, S2.10, S2.11. Next on the
@GilboaAWS track: S1.2 (training) — now safe to merge because S2.8
PR 3 means the forked-child paths (AOF rewrite, RDB save) no longer
panic when they encounter compressed values.
ikolomi added a commit that referenced this pull request Jun 9, 2026
* [S2.8] Out-of-process explicit decompression (3/3)

PR 3 of 3 in the S2.8 split. Wires explicit objectGetUncompressedView
calls into the two paths that bypass lookupKey() and thus do not
benefit from PR #23's transient-view hook:

  - rioWriteBulkObject (aof.c) — AOF rewrite child iterates kvstore
    via kvstoreIteratorNext; AOF on-disk format is uncompressed RESP
    per R2.6.5.
  - rdbSaveStringObject (rdb.c) — same iteration pattern; per R2.6.5
    (AOF preamble within RDB) and R2.6.8 (full-sync replication RDB),
    uncompressed bytes are mandatory.

Both today panic on OBJ_ENCODING_COMPRESSED (rioWriteBulkObject:
serverPanic("Unknown string encoding"); rdbSaveStringObject:
serverAssertWithInfo on sdsEncodedObject). The crash is latent only
because no dictionary training has merged yet (S1.2 pending) — once
training lands, BGSAVE/BGREWRITEAOF on a compressed key crashes the
server. PR 3 is a prerequisite for S1.2 keeping unstable continuously
safe.

R2.6.1 (compressed disk RDB encoding via RDB_ENC_COMPRESSED + AUX
dict entries) is intentionally deferred to S3.1/S3.2 (@GilboaAWS).
Until that lands, the conservative choice for ALL RDB targets is
uncompressed, matching pre-feature behavior. Disk RDBs pay the
decompression cost once at save time and re-acquire compression at
load time via the post-S1.2 sweeper.

Dict lifetime safety in the forked child: registry is a fork-time
snapshot; DDicts are immutable; child uses its own copy of the
registry pointers. No interaction with the parent's QSBR generation
counters.

Replication feed (feedReplicationBufferWithObject) does NOT need
wiring — already verified to operate on argv arguments and synthetic
SELECT robjs only, never on kvstore values.

DUMP/RESTORE/MIGRATE (R2.6.7) go through lookupKey() which now
decompresses transparently per S2.8 PR 2; nothing to change.

Tests deferred to the post-S1.2 transparency Tcl harness — that's
where real compressed values exist and BGSAVE/BGREWRITEAOF
round-trips can be exercised end-to-end.

Verified locally: make -j2 -C src SERVER_CFLAGS=-Werror clean both
with BUILD_ZSTD=yes (default) and BUILD_ZSTD=no; compression Tcl
tests 10/10 pass.

* docs(planning): mark S2.8 fully complete via PR #24 (3/3)

The S2.8 forward-reference in plan.md previously named the
out-of-process explicit-decompression work as 'PR 3 of 3 — NEXT'.
This PR closes that out: PR #24 IS that work. Update the entry to
reference PR #24 and capture the as-implemented scope (which paths,
which deferred to S3.1/S3.2 for R2.6.1 disk-RDB compressed encoding,
which already covered by PR #23's transient-view hook).

S2.8 (Read-path hook) is now fully landed across PRs #21, #22, #23,
and #24. Next on the @ikolomi track: S2.9, S2.10, S2.11. Next on the
@GilboaAWS track: S1.2 (training) — now safe to merge because S2.8
PR 3 means the forked-child paths (AOF rewrite, RDB save) no longer
panic when they encounter compressed values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant