Skip to content

feat(lite): default append pipelining with durability-gated acks#289

Merged
shikhar merged 9 commits intomainfrom
gh-48
Mar 6, 2026
Merged

feat(lite): default append pipelining with durability-gated acks#289
shikhar merged 9 commits intomainfrom
gh-48

Conversation

@shikhar
Copy link
Member

@shikhar shikhar commented Mar 3, 2026

Default append pipelining in s2-lite and remove the S2LITE_PIPELINE gate (issue #48).

This PR now includes the full durability/ack rework needed to make pipelining safe by default, plus broader wiring and test coverage.

What changed

  • Decouple append submission from durability:
    • submit write batches with await_durable=false
    • track in-flight appends by SlateDB write seqnum
    • advance ack/tail/follower visibility only when durable seq advances past each batch
  • Add a shared backend DurabilityNotifier:
    • single db.subscribe() task fan-outs durability progress to waiter callbacks
    • replaces per-streamer durability watchers
  • Keep DB write submission serialized while allowing multiple in-flight appends awaiting durability.
  • Keep trim/fence sequencing safe by deriving emitted write metadata from CommandState applied points.
  • Replace the env gate with explicit in-flight budgeting:
    • remove S2LITE_PIPELINE
    • add --append-inflight-bytes (default 128MiB)
    • use a backend-wide semaphore so metered append bytes are budgeted across streams
  • Remove README text that documented pipelining as disabled by default.

Tests

  • Added streamer unit tests for durability-gated ack release and durable-seq jumps.
  • Added append-session integration coverage asserting ack/tail monotonicity and read order under pipelining.

Dependency updates

  • tokio 1.49 -> 1.50
  • lockfile updates including SlateDB 0.11.1

Refs #48

@shikhar shikhar force-pushed the gh-48 branch 3 times, most recently from 73b3ef5 to e8db6a1 Compare March 4, 2026 21:43
@shikhar shikhar changed the title feat(lite): enable safe append pipelining by default feat(lite): default append pipelining with durability-gated acks Mar 4, 2026
@shikhar shikhar marked this pull request as ready for review March 4, 2026 23:16
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 4, 2026

Greptile Summary

This PR enables append pipelining by default in s2-lite, completing the work tracked in issue #48. It decouples DB write submission (now using await_durable: false) from durability acknowledgement, introduces a shared DurabilityNotifier that fans out SlateDB's watch-channel status to per-append callbacks, and replaces the per-streamer semaphore with a backend-wide one — allowing backpressure to be budgeted across all streams via the new --append-inflight-bytes flag (default 128 MiB).

Key points from the review:

  • Architecture is sound: serialised write submission + parallel durability waiting is the correct design for pipelining safety; CommandState applied-point tracking correctly gates trim/fence writes.
  • DurabilityNotifier implementation is clean: callbacks are always fired outside the lock, waiter insertion maintains sorted order, and all edge cases (already-durable, already-closed, out-of-order subscriptions) are covered by tests.
  • Implicit semaphore capacity invariant: the old append_permit code clamped num_permits to the semaphore's max capacity to prevent acquire_many from blocking forever. The new code drops this clamp, relying implicitly on RECORD_BATCH_MAX bounding incoming batch sizes and on the semaphore being created with at least RECORD_BATCH_MAX capacity. A debug_assert! would make this assumption explicit.
  • --append-inflight-bytes lacks an env = attribute: unlike other LiteArgs fields, there is no env-var alias, which may surprise operators who previously configured behaviour via S2LITE_PIPELINE.
  • Redundant log line: args.append_inflight_bytes is logged twice in server.rs; the first (message-less) info! call appears to be a leftover.
  • Test coverage is thorough: unit tests for DurabilityNotifier, new Streamer unit tests for durability-gated ack ordering and durable-seq jumps, and an integration test asserting ack/tail monotonicity and read order under full pipelining.

Confidence Score: 4/5

  • PR is safe to merge; the pipelining logic is correct and well-tested, with only minor style concerns remaining.
  • The core durability-gating design is sound: writes are serialised, acks are only released after the DB confirms durability, and the notifier correctly fans out to all waiters. The implicit semaphore capacity invariant (removed upper-bound clamp) is not a practical risk given server-side batch size validation, but it is undocumented. The missing env-var alias and duplicate log line are cosmetic. No correctness or security issues found.
  • lite/src/backend/streamer.rs (implicit permit-cap invariant) and lite/src/server.rs (missing env var, duplicate log)

Important Files Changed

Filename Overview
lite/src/backend/durability_notifier.rs New file implementing a shared fan-out notifier for DB durability progress; callbacks are correctly called outside the lock, sorted waiter insertion is well-tested, and the run_notifier task terminates cleanly when the DB closes.
lite/src/backend/streamer.rs Core pipelining rework: decouples DB write submission from durability, tracking in-flight appends with a VecDeque; semaphore permit acquisition removed its upper-bound clamp, now relying implicitly on RECORD_BATCH_MAX to prevent acquire_many deadlock.
lite/src/backend/core.rs Backend now creates a shared semaphore (clamped to at least RECORD_BATCH_MAX) and DurabilityNotifier, passing both to each spawned Streamer; straightforward refactor.
lite/src/server.rs Replaces S2LITE_PIPELINE env-var gate with --append-inflight-bytes CLI flag (default 128MiB); contains a duplicate log of append_inflight_bytes and lacks an env= attribute for operator convenience.
lite/tests/backend/data_plane/append.rs New integration test validates ack/tail monotonicity and read ordering under pipelining across 32 sequential appends; thorough and well-structured.

Sequence Diagram

sequenceDiagram
    participant Client
    participant StreamerClient
    participant Streamer
    participant DB as SlateDB
    participant DurabilityNotifier

    Client->>StreamerClient: append_permit(input)
    StreamerClient->>StreamerClient: acquire_many(metered_size) on shared semaphore
    StreamerClient-->>Client: AppendPermit

    Client->>StreamerClient: permit.submit()
    StreamerClient->>Streamer: Message::Append (via msg_tx)
    Streamer->>Streamer: handle_append → sequence_records
    Streamer->>Streamer: push db_submit_append future → db_writes_pending

    Streamer->>DB: write_with_options(wb, await_durable=false)
    DB-->>Streamer: WriteHandle { seqnum: db_seq }
    Streamer->>Streamer: push InFlightAppend { db_seq, records } → inflight_appends
    Streamer->>DurabilityNotifier: subscribe(db_seq, callback)

    Note over Streamer: Multiple in-flight appends allowed here

    DB->>DurabilityNotifier: db.subscribe() watch fires (durable_seq advances)
    DurabilityNotifier->>DurabilityNotifier: notify_waiters → fire callbacks
    DurabilityNotifier->>Streamer: Message::DurabilityStatus(Ok(durable_seq)) (via msg_tx)

    Streamer->>Streamer: on_db_durable_seq_advanced(durable_seq)
    Streamer->>Streamer: pop inflight_appends → stable_pos, follow_tx.send, pending_appends.on_stable

    Streamer-->>Client: AppendAck (via oneshot reply_tx)
    StreamerClient->>StreamerClient: drop(sema_permit) — releases semaphore
Loading

Last reviewed commit: 7a8e490

shikhar and others added 5 commits March 4, 2026 20:25
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@shikhar shikhar merged commit f23d225 into main Mar 6, 2026
16 checks passed
@shikhar shikhar deleted the gh-48 branch March 6, 2026 01:20
@s2-release-plz s2-release-plz bot mentioned this pull request Mar 5, 2026
shikhar pushed a commit that referenced this pull request Mar 6, 2026
## 🤖 New release

* `s2-lite`: 0.29.19 -> 0.29.20 (✓ API compatible changes)
* `s2-sdk`: 0.24.6 -> 0.24.7 (✓ API compatible changes)
* `s2-cli`: 0.29.19 -> 0.29.20

<details><summary><i><b>Changelog</b></i></summary><p>

## `s2-lite`

<blockquote>

## [0.29.20] - 2026-03-06

### Features

- Default append pipelining with durability-gated acks
([#289](#289))

### Bug Fixes

- Keep follow sessions alive across dormancy
([#301](#301))

### Miscellaneous Tasks

- Dep updates ([#299](#299))

<!-- generated by git-cliff -->
</blockquote>

## `s2-sdk`

<blockquote>

## [0.24.7] - 2026-03-06

### Miscellaneous Tasks

- Dep updates ([#299](#299))

<!-- generated by git-cliff -->
</blockquote>

## `s2-cli`

<blockquote>

## [0.29.20] - 2026-03-06

### Bug Fixes

- Handle Ctrl+C during bench catchup
([#297](#297))

<!-- generated by git-cliff -->
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: s2-release-plz[bot] <262023388+s2-release-plz[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant