Skip to content

fix(lite): keep follow sessions alive across dormancy#301

Merged
shikhar merged 10 commits intomainfrom
codex/fix-streamer-follow-dormancy
Mar 6, 2026
Merged

fix(lite): keep follow sessions alive across dormancy#301
shikhar merged 10 commits intomainfrom
codex/fix-streamer-follow-dormancy

Conversation

@shikhar
Copy link
Member

@shikhar shikhar commented Mar 6, 2026

Summary

  • track active follow subscribers inside the streamer actor so queued follow requests cannot lose to dormancy
  • prioritize pending work over dormancy in the streamer select loop
  • add a paused-time regression that advances past DORMANT_TIMEOUT and proves an unbounded follow session survives idle periods

Testing

  • just fmt
  • cargo test -p s2-lite

Closes #300

Keep streamer dormancy from tearing down unbounded live-read followers while a follow subscription is active.

Wrap follow receivers with a drop guard so the streamer wakes up when the last follower disconnects, and add a regression test for the dormant follower path.
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR fixes a bug where a lite streamer would exit due to the dormancy timeout while active follow subscribers were still connected. It introduces a FollowGuard RAII type and FollowReceiver wrapper so the run-loop can gate the dormancy branch on active_followers == 0, and a FollowerDropped wake-up message to re-arm the timer as soon as the last subscriber disconnects.

Key changes:

  • Added AtomicUsize active_followers on Streamer and Arc-shared into each FollowGuard
  • FollowGuard::new increments the counter; FollowGuard::drop decrements it and sends Message::FollowerDropped when it reaches zero
  • FollowReceiver wraps broadcast::Receiver with a FollowGuard, so the lifecycle of the guard is tied to the subscriber lifetime
  • The tokio::select! dormancy branch gains an if active_followers == 0 precondition, preventing idle shutdown while followers are alive
  • run is refactored into run_with_dormant_timeout to accept an injectable timeout, enabling the new deterministic regression test
  • Two style-level suggestions were left: (1) the regression test uses real wall-clock sleep delays instead of tokio's paused-time facilities, which could be flaky under CI load; (2) Ordering::Relaxed on the atomic operations is technically sound due to the happens-before chain provided by the mpsc channel, but Acquire/Release would be more self-documenting

Confidence Score: 4/5

  • Safe to merge; the fix is logically correct with only minor style concerns around test timing and memory ordering semantics.
  • The core logic is sound: the RAII guard pattern cleanly tracks follower lifetime, the FollowerDropped wake-up correctly re-arms the dormancy timer, and the regression test validates the intended behavior. The only concerns are non-blocking style suggestions (real-time test delays and Relaxed ordering that works but is implicit). No logic errors or race conditions were found.
  • No files require special attention; both comments are style suggestions on lite/src/backend/streamer.rs.

Important Files Changed

Filename Overview
lite/src/backend/streamer.rs Adds active_followers (AtomicUsize), FollowGuard RAII type, and FollowReceiver wrapper to keep streamers alive while followers are connected; two style-level suggestions (test timing, memory ordering) but no logic bugs.

Sequence Diagram

sequenceDiagram
    participant C as Client (read.rs)
    participant SC as StreamerClient
    participant S as Streamer (run loop)
    participant FG as FollowGuard

    C->>SC: follow(start_seq_num)
    SC->>S: Message::Follow
    S->>FG: FollowGuard::new() [active_followers += 1]
    S-->>SC: Ok(FollowReceiver { _guard, rx })
    SC-->>C: Ok(FollowReceiver)

    note over S: dormancy select branch<br/>disabled (active_followers != 0)

    C->>C: follow_rx.recv() [streaming records ...]

    C->>FG: drop(FollowReceiver)
    FG->>FG: fetch_sub → active_followers = 0
    FG->>S: Message::FollowerDropped (wake-up)
    note over FG: guard dropped

    S->>S: receive FollowerDropped (no-op handler)
    S->>S: loop top: load active_followers == 0
    S->>S: reset dormancy timer (dormant_timeout)
    note over S: dormancy branch now enabled
    S->>S: dormancy fires → break (streamer exits)
Loading

Last reviewed commit: 6bac2e8

shikhar added 9 commits March 5, 2026 20:35
Add a paused-time regression that exercises Backend::read through the full follow path and verifies an unbounded session survives streamer dormancy before receiving new data.
Remove the actor-level dormancy test and its helper scaffolding now that the paused-time Backend::read regression covers the real user-visible failure mode.
Drop the now-unused run_with_dormant_timeout helper after moving dormancy coverage to the integration-level follow test.
Move the unbounded follow dormancy regression into backend::read tests so it can advance past streamer::DORMANT_TIMEOUT directly instead of hardcoding 61 seconds, and drop the duplicated integration test.
Use explicit acquire/release semantics for the follow subscriber counter so the synchronization does not rely on the mpsc wake-up path being inferred.
Drop the separate follow subscriber counter and decide dormancy expiry directly from follow_tx.receiver_count() when the timeout fires.
Serialize follow subscription lifetime through the streamer actor so dormancy checks cannot race queued follow requests. Increment the follower count when processing Follow, decrement it on a FollowerDropped message sent from the receiver drop guard.
Bias the streamer select loop so pending write completions and inbound messages are handled before the dormancy timeout branch. This prevents queued follow requests from losing to dormancy once follower lifetime is tracked inside the actor.
Yield once after advancing past DORMANT_TIMEOUT in the paused-time read regression so the streamer can take the dormancy branch before the follow-up append arrives. This makes the test fail reliably without the fix.
@shikhar shikhar changed the title fix(lite): keep active follow streamers alive fix(lite): keep follow sessions alive across dormancy Mar 6, 2026
@shikhar shikhar merged commit df66927 into main Mar 6, 2026
17 checks passed
@shikhar shikhar deleted the codex/fix-streamer-follow-dormancy branch March 6, 2026 02:22
@s2-release-plz s2-release-plz bot mentioned this pull request Mar 6, 2026
shikhar pushed a commit that referenced this pull request Mar 6, 2026
## 🤖 New release

* `s2-lite`: 0.29.19 -> 0.29.20 (✓ API compatible changes)
* `s2-sdk`: 0.24.6 -> 0.24.7 (✓ API compatible changes)
* `s2-cli`: 0.29.19 -> 0.29.20

<details><summary><i><b>Changelog</b></i></summary><p>

## `s2-lite`

<blockquote>

## [0.29.20] - 2026-03-06

### Features

- Default append pipelining with durability-gated acks
([#289](#289))

### Bug Fixes

- Keep follow sessions alive across dormancy
([#301](#301))

### Miscellaneous Tasks

- Dep updates ([#299](#299))

<!-- generated by git-cliff -->
</blockquote>

## `s2-sdk`

<blockquote>

## [0.24.7] - 2026-03-06

### Miscellaneous Tasks

- Dep updates ([#299](#299))

<!-- generated by git-cliff -->
</blockquote>

## `s2-cli`

<blockquote>

## [0.29.20] - 2026-03-06

### Bug Fixes

- Handle Ctrl+C during bench catchup
([#297](#297))

<!-- generated by git-cliff -->
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: s2-release-plz[bot] <262023388+s2-release-plz[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Streamer dormancy timeout kills active read session followers

1 participant