
fix(runtimed): prevent pipe mode stream corruption by buffering outgoing frames (#613)#616

Merged
rgbkrk merged 2 commits into main from fix/stream-split on Mar 8, 2026

Conversation


@rgbkrk rgbkrk commented Mar 8, 2026

Fixes #613.

Bug

In pipe mode (#608), ReceiveFrontendSyncMessage wrote sync frames directly to client.stream inside the select! command handler. If the daemon was sending data at the same time, select! would drop the pending socket read and the write would land mid-frame, corrupting the framing. The daemon would then read payload bytes as a length prefix, producing bogus sizes like frame too large: 1154398000 bytes.
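The corruption can be reproduced in miniature with plain byte buffers. This is an illustrative sketch, not the daemon's actual wire format: it assumes a generic 4-byte big-endian length prefix and shows how a write interleaved into a half-written frame makes the reader treat payload bytes as a length prefix.

```rust
use std::convert::TryInto;

// Length-prefixed framing: 4-byte big-endian length, then payload.
fn write_frame(buf: &mut Vec<u8>, payload: &[u8]) {
    buf.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    buf.extend_from_slice(payload);
}

fn read_frame(buf: &[u8], pos: &mut usize) -> Vec<u8> {
    let len = u32::from_be_bytes(buf[*pos..*pos + 4].try_into().unwrap()) as usize;
    *pos += 4;
    let payload = buf[*pos..*pos + len].to_vec();
    *pos += len;
    payload
}

fn main() {
    let daemon_payload = b"daemon sync data"; // 16 bytes
    let mut stream: Vec<u8> = Vec::new();

    // One side begins writing a frame: prefix plus half the payload...
    stream.extend_from_slice(&(daemon_payload.len() as u32).to_be_bytes());
    stream.extend_from_slice(&daemon_payload[..8]);
    // ...and an interleaved write lands in the middle of it:
    write_frame(&mut stream, b"relay frame");
    // The original frame is then finished:
    stream.extend_from_slice(&daemon_payload[8..]);

    // The reader consumes 16 bytes after the first prefix, swallowing the
    // interleaved frame's prefix and part of its payload:
    let mut pos = 0;
    let first = read_frame(&stream, &mut pos);
    assert_ne!(first.as_slice(), &daemon_payload[..]);

    // The next 4 bytes it treats as a length prefix are raw payload bytes,
    // which decode to a bogus, enormous frame size:
    let bogus = u32::from_be_bytes(stream[pos..pos + 4].try_into().unwrap());
    assert!(bogus as usize > stream.len());
    println!("bogus frame length: {} bytes", bogus);
}
```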

Fix

Buffer outgoing pipe frames in a VecDeque and flush them at the top of the loop before entering select!. Writes only happen when no read is pending.

loop {
    // 1. Flush queued pipe frames (safe — no pending read)
    while let Some(frame) = pending_pipe_frames.pop_front() {
        send_typed_frame(&mut client.stream, AutomergeSync, &frame).await?;
    }

    // 2. Drain pending broadcasts
    // 3. select! { commands, socket read }
    //    - Commands queue frames instead of writing directly
    //    - Socket reads are uninterrupted
}
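The queue-and-flush pattern above can be sketched with std only. The event enum and frame helpers below are stand-ins, not the real runtimed types: command handlers push frames onto a VecDeque, and the top of the loop drains it while no read is in flight, so every frame reaches the stream whole and in order.

```rust
use std::collections::VecDeque;
use std::convert::TryInto;

// Stand-ins for the two select! arms the relay loop multiplexes over.
enum Event {
    Command(Vec<u8>), // a frontend sync message to forward down the pipe
    SocketReadable,   // the daemon has data for us to read
}

fn write_frame(stream: &mut Vec<u8>, payload: &[u8]) {
    stream.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    stream.extend_from_slice(payload);
}

fn read_frames(stream: &[u8]) -> Vec<Vec<u8>> {
    let mut frames = Vec::new();
    let mut pos = 0;
    while pos < stream.len() {
        let len = u32::from_be_bytes(stream[pos..pos + 4].try_into().unwrap()) as usize;
        pos += 4;
        frames.push(stream[pos..pos + len].to_vec());
        pos += len;
    }
    frames
}

fn main() {
    let events = vec![
        Event::Command(b"sync-1".to_vec()),
        Event::SocketReadable,
        Event::Command(b"sync-2".to_vec()),
    ];
    let mut pending: VecDeque<Vec<u8>> = VecDeque::new();
    let mut stream: Vec<u8> = Vec::new();

    for event in events {
        // 1. Flush queued frames before waiting again. No read is pending
        //    here, so each frame hits the stream whole.
        while let Some(frame) = pending.pop_front() {
            write_frame(&mut stream, &frame);
        }
        // 2. Handle the next event (the select! stand-in).
        match event {
            // Command handlers queue frames instead of writing directly.
            Event::Command(frame) => pending.push_back(frame),
            // The socket read runs with no competing write.
            Event::SocketReadable => {}
        }
    }
    // Final flush so nothing queued is lost on loop exit.
    while let Some(frame) = pending.pop_front() {
        write_frame(&mut stream, &frame);
    }

    // The stream parses as well-formed frames, in order.
    assert_eq!(read_frames(&stream), vec![b"sync-1".to_vec(), b"sync-2".to_vec()]);
}
```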

Full peer mode (runtimed-py) is unaffected — its writes go through sync_to_daemon() which owns the read/write sequence.

Test plan

  • cargo test -p runtimed --lib — 234 passed
  • cargo test -p runtimed --test '*' — 15 integration tests
  • cargo test -p notebook --lib — 126 passed
  • Run All on notebook with matplotlib output — outputs appear
  • Individual cell execution — still works
  • Session restore — no connection errors

@rgbkrk force-pushed the fix/stream-split branch from b8e8fbc to 20aeea7 on March 8, 2026 at 14:17
In pipe mode (#608), the Automerge sync path doesn't deliver output
changes — the daemon's sync state tracks the relay peer, not the WASM
peer, so all sync frames arrive with changed=false. materializeCells
never runs after execution, and outputs never render.

Re-enable the broadcast-driven output path (appendOutput via onOutput
callback). The broadcast pipeline works correctly — outputs arrive,
blob manifests resolve, and the external store updates.

No duplicate risk: since sync frames have changed=false,
materializeCells doesn't run after execution, so there's only one
source of output updates (broadcasts).

The proper fix is to align the sync states so the daemon talks directly to the WASM peer through the pipe (skipping do_initial_sync in pipe mode). Tracked as a follow-up.
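The no-duplicate argument can be modeled in a few lines. All names below are hypothetical illustrations, not the actual codebase API: with pipe-mode sync frames pinned to changed=false, the broadcast callback is the only path that can render an output.

```rust
// Hypothetical model of the two paths that could render an output, and the
// changed flag that gates one of them.
#[derive(Debug, PartialEq)]
enum OutputSource {
    SyncMaterialize, // materializeCells: runs only when a sync frame has changed=true
    Broadcast,       // appendOutput via the onOutput callback
}

fn sources_for(sync_changed: bool, broadcast_delivered: bool) -> Vec<OutputSource> {
    let mut sources = Vec::new();
    if sync_changed {
        sources.push(OutputSource::SyncMaterialize);
    }
    if broadcast_delivered {
        sources.push(OutputSource::Broadcast);
    }
    sources
}

fn main() {
    // Pipe mode: sync frames always arrive with changed=false, so re-enabling
    // broadcasts leaves exactly one source of output updates, with no duplicates.
    assert_eq!(sources_for(false, true), vec![OutputSource::Broadcast]);
}
```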
@rgbkrk force-pushed the fix/stream-split branch from 0c40de3 to a0a6332 on March 8, 2026 at 14:41
@rgbkrk merged commit 4cc7770 into main on Mar 8, 2026
23 of 24 checks passed
@rgbkrk deleted the fix/stream-split branch on March 8, 2026 at 15:22

Development

Merging this pull request closes: fix(runtimed): pipe mode stream corruption — relay writes to daemon socket during pending read (#613)
