Skip to content

Local-First Automerge Migration Plan #540

@rgbkrk

Description

@rgbkrk

Local-First Automerge Migration Plan

Migrate the notebook from RPC-with-optimistic-UI to true local-first Automerge ownership.

Status

Phase Status Notes
0: Optimistic mutations ✅ Done PR #542 merged
1.1–1.3: Eliminate NotebookState dual-write ✅ Done PR #544 merged
1.4: Delegate save-to-disk to daemon ✅ Done PR #545 merged
2.0-pre: runtimed-wasm bindings ✅ Done PR #552 merged
2: Frontend Automerge doc ✅ Done PR #553 merged — useAutomergeNotebook via runtimed-wasm, old useNotebook removed
2D: Dead code removal ✅ Done PR #554 merged — dead commands removed, notebook:updated emissions cleaned up
3: Authority boundary hardening ⬜ Not started Formalize writer roles per field
4: Optimize Tauri sync relay ⬜ Not started Binary IPC, reduce overhead

🎉 The migration is complete. The notebook is local-first. Cells mutate instantly in WASM, sync flows to the daemon via Automerge, execution works, widgets work. Everything below is post-migration cleanup and future improvements.


Problem

The current architecture maintains four copies of the notebook:

React useState ←→ invoke() ←→ NotebookState ←→ SyncClient AutoCommit ←→ Daemon AutoCommit
     (1)                          (2)                  (3)                     (4)
  1. React cells stateuseState<NotebookCell[]> in useNotebook.ts:286
  2. NotebookStatenbformat::v4::Notebook struct in crates/notebook/src/notebook_state.rs:152
  3. Sync client's AutoCommit — local Automerge replica in NotebookSyncClient (crates/runtimed/src/notebook_sync_client.rs:433)
  4. Daemon's NotebookDoc — canonical Automerge doc in NotebookRoom (crates/runtimed/src/notebook_sync_server.rs:452)

Cell mutations like addCell and deleteCell use blocking RPC — the frontend awaits invoke(), which round-trips through copies (2), (3), and (4) before the frontend updates copy (1). The user pays full IPC + sync latency for operations that don't need backend involvement.

What's already right

  • Outputs flow through Automerge, not RPC. Frontend onOutput is intentionally () => {} (App.tsx:271-276). Outputs arrive via notebook:updated Automerge sync.
  • Daemon reads source from Automerge for execution. ExecuteCell reads from the doc (notebook_sync_server.rs:1850), not from whatever the frontend sent.
  • Source uses Automerge Text CRDT. Character-level merging via update_text() (Myers diff).
  • Python agents are full Automerge peers. The runtimed Python package (crates/runtimed-py) connects via the same Unix socket, holds its own AutoCommit doc, and syncs bidirectionally. Agents can create_cell, set_source, append_source, delete_cell — all as CRDT operations that merge with concurrent user edits.

What's wrong

  • NotebookState is redundant. It shadows the sync client's Automerge doc. The receiver loop at lib.rs:420-424 overwrites it on every peer update.
  • Frontend has no Automerge doc. It receives materialized CellSnapshot[] and can't do local CRDT mutations.
  • addCell and deleteCell block on RPC for things the frontend can determine locally (UUID generation, "last cell" validation).
  • Full-state replacement on sync. notebook:updated does setCells(newCells), clobbering any in-flight optimistic state, which forces the blocking RPC pattern.

Peers to account for

Peer Connection Automerge doc? Writes
Frontend (React) Tauri IPC ❌ No — receives CellSnapshot[] Cells, source, metadata via invoke()
Tauri process In-process ✅ Via NotebookSyncClient Relays frontend writes
Daemon (runtimed) Unix socket server ✅ Canonical NotebookDoc Outputs, execution_count, kernel status
Python agent (runtimed package) Unix socket client ✅ Own AutoCommit Cells, source (create/edit/delete)
Additional windows Unix socket client ✅ Via NotebookSyncClient Same as frontend

Python agents are first-class peers. An agent calling session.append_source(cell_id, text) while a user types in the same cell produces concurrent Text CRDT operations that Automerge merges automatically. The migration must preserve this — agents should never need to change their API.


Target architecture

┌──────────────────────────────┐
│  Frontend                    │
│  Local AutoCommit doc        │
│  React state derived from doc│
│  All cell CRUD is local      │
└──────────┬───────────────────┘
           │ Binary Automerge sync messages
           │ (Tauri events or custom protocol)
┌──────────▼───────────────────┐
│  Tauri Process (thin relay)  │
│  Forwards sync messages      │
│  Handles: save, format, OS   │
└──────────┬───────────────────┘
           │ Unix socket (v2 typed frames)
┌──────────▼───────────────────┐
│  Daemon                      │
│  Canonical AutoCommit doc    │
│  Writes: outputs, exec_count │
│  Kernel, env management      │
└──────────┬───────────────────┘
           │ Unix socket (v2 typed frames)
┌──────────▼───────────────────┐
│  Python Agent                │
│  Own AutoCommit doc          │
│  Writes: cells, source       │
│  Reads: outputs, status      │
└──────────────────────────────┘

All peers hold Automerge docs. Mutations are local-first. Sync is automatic and bidirectional.


Phase 0: Make existing mutations optimistic ✅

Goal: Eliminate user-visible latency for addCell and deleteCell without touching the sync layer.

Effort: Small (days). Risk: Low — notebook:updated full-replace acts as convergence.

0.1 — Optimistic deleteCell

Current (useNotebook.ts:477-485):

const deleteCell = useCallback(async (cellId: string) => {
  try {
    await invoke("delete_cell", { cellId });
    setCells((prev) => prev.filter((c) => c.id !== cellId));
    setDirty(true);
  } catch (e) { ... }
}, []);

Change to:

const deleteCell = useCallback((cellId: string) => {
  // Validate locally — don't delete the last cell
  setCells((prev) => {
    if (prev.length <= 1) return prev;
    return prev.filter((c) => c.id !== cellId);
  });
  setDirty(true);
  // Fire-and-forget sync to backend
  invoke("delete_cell", { cellId }).catch((e) =>
    logger.error("[notebook] delete_cell sync failed:", e)
  );
}, []);
  • Move "last cell" check to frontend
  • setCells() runs before invoke()
  • invoke() is fire-and-forget (catch-only)
  • Backend delete_cell command becomes infallible from frontend's perspective

0.2 — Optimistic addCell

Current (useNotebook.ts:451-475):

const addCell = useCallback(async (cellType, afterCellId?) => {
  const newCell = await invoke<NotebookCell>("add_cell", { cellType, afterCellId });
  setCells((prev) => { /* insert newCell */ });
  ...
}, []);

Change to:

const addCell = useCallback((cellType, afterCellId?) => {
  const cellId = crypto.randomUUID();
  const newCell: NotebookCell = {
    id: cellId,
    cell_type: cellType,
    source: "",
    outputs: [],
    execution_count: null,
  };
  setCells((prev) => { /* insert newCell at position */ });
  setFocusedCellId(cellId);
  setDirty(true);
  // Fire-and-forget — backend uses provided cellId
  invoke("add_cell", { cellId, cellType, afterCellId }).catch((e) =>
    logger.error("[notebook] add_cell sync failed:", e)
  );
  return newCell;
}, []);
  • Generate UUID on frontend via crypto.randomUUID()
  • Construct NotebookCell locally
  • setCells() runs before invoke()
  • Update add_cell Tauri command signature to accept cell_id: String parameter
  • Backend uses provided ID instead of generating one

0.3 — Verify convergence

  • Confirm notebook:updated event still reconciles any divergence
  • Test: add cell while daemon is disconnected → cell persists after reconnect
  • Test: delete cell while daemon is disconnected → deletion syncs on reconnect
  • Test: two windows delete the same cell concurrently → both converge

Phase 1.1–1.3: Eliminate NotebookState dual-write ✅

Goal: Remove the redundant nbformat::Notebook struct from the Tauri process as a dual-write target. The NotebookSyncHandle becomes the single source of truth for all cell and metadata operations.

Effort: Medium (1-2 weeks). Risk: Medium — many Tauri commands reference NotebookState.

What was done

All ~25 call sites in crates/notebook/src/lib.rs that dual-wrote to both NotebookState and the sync handle were migrated:

  • Cell mutations (update_cell_source, add_cell, delete_cell) — sync handle only, no more NotebookState write.
  • Cell reads (load_notebook) — reads from handle.get_cells(), falls back to NotebookState when daemon disconnected.
  • Path reads (has_notebook_path, get_notebook_path, detect_pyproject, detect_pixi_toml, detect_environment_yml, detect_deno_config) — read from new context.path field.
  • All 16 metadata commands — read/write via sync handle get_metadata/set_metadata. Read commands fall back to NotebookState when disconnected.
  • Format cell — reads source from sync handle, writes formatted result back.
  • Receiver loop sync-back removed — the block that overwrote NotebookState cells/metadata from Automerge on every peer update is gone.
  • WindowNotebookContext gained path: Arc<Mutex<Option<PathBuf>>> and working_dir: Option<PathBuf>. notebook_state retained for save (Phase 1.4).

Additional fixes discovered during QA

  • Trust signature round-tripapprove_notebook_trust/verify_notebook_trust now use raw JSON read/write (get_raw_metadata_additional, set_raw_trust_in_metadata) to preserve trust_signature/trust_timestamp which aren't modeled in the typed RuntMetadata struct.
  • Runtime detectionget_runtime_from_sync now matches NotebookState::get_runtime() semantics: ks.language == "python" check and language_info fallback added.
  • Initial metadata in handshake — New initial_metadata field on Handshake::NotebookSync sends the kernelspec with the connection handshake so the daemon has it before auto-launching. Fixes Deno notebooks getting Python kernels on File → New Notebook As → Deno.
  • save_notebook_as stale cells — Refreshes NotebookState from Automerge before serializing (same pattern as save_notebook's local fallback).
  • push_metadata_to_sync removed — Was clobbering dependency changes by pushing stale NotebookState metadata to the sync handle on save.
  • add_cell with disconnected daemon — Now returns Err("Not connected to daemon") instead of a ghost cell.

Remaining NotebookState usage (Phase 1.4 scope)

Consumer Why it still uses NotebookState
save_notebook Serializes to .ipynb (refreshes from Automerge first)
save_notebook_as Same, plus updates path
clone_notebook_to_path Clones notebook struct
initialize_notebook_sync First-peer population reads cells from NotebookState (loaded from disk)
reconnect_to_daemon Passes NotebookState to initialize_notebook_sync
Disconnected fallbacks ~8 metadata read commands fall back to NotebookState when daemon is down
  • Route all cell reads through handle.get_cells() instead of state.lock()
  • Route all cell writes through sync handle commands
  • Route all metadata reads/writes through sync handle
  • Remove receiver loop sync-back
  • Add path and working_dir to WindowNotebookContext

Phase 1.4: Delegate save-to-disk to daemon ✅

Goal: Move notebook save-to-disk from the Tauri process to the daemon, eliminating the last major NotebookState consumer.

What was done

  • SaveNotebook request now accepts path: Option<String> for save-as support and returns NotebookSaved { path: String } with the daemon-normalized absolute path (.ipynb appended if needed).
  • save_notebook_to_disk() refactored to accept optional target path. Relative paths are rejected (daemon CWD is unpredictable as a launchd service).
  • Format-on-save moved to daemon. New format_notebook_cells() function in notebook_sync_server.rs runs ruff (Python) or deno fmt (Deno) on all code cells, writes formatted source back to Automerge, and broadcasts changes to all peers. Client-side formatting loop removed from both save_notebook and save_notebook_as.
  • Local NotebookState::serialize() fallback removed. Daemon save is now required — disconnected daemon returns a clear error.
  • save_notebook_as uses the daemon-returned path as canonical for window title, NotebookState.path, context.path, and room reconnection.
  • Python bindings: session.save(path=None) and await async_session.save(path=None) added to both Session and AsyncSession.

Known issue: save-as loses outputs

When saving an untitled notebook via Save As, outputs from the current session are lost in the saved file. This happens because save-as creates a new room (new notebook_id derived from the new path), and the daemon's save_notebook_to_disk merges with the existing file (which doesn't exist yet for a new path). The outputs exist in the old room's Automerge doc but not the new one. This may be acceptable since save-as triggers a new kernel anyway, but worth evaluating.

Remaining NotebookState usage

NotebookState is no longer used for persistence. Remaining 13 call sites are:

  • Initial loadingcreate_window_context, initialize_notebook_sync (first-peer population from disk)
  • Path/dirty trackingsave_notebook (path check), save_notebook_as (path + dirty update)
  • Cloneclone_notebook_to_path (local serialization for "Make a Copy")
  • Reconnectreconnect_to_daemon (passes state to re-initialize sync)
  • Disconnected fallbacks — 8 metadata read commands fall back to NotebookState when daemon is down

Full removal of the NotebookState struct is deferred — it still serves as the disconnected-daemon fallback and the initial notebook loading path. These will naturally go away as Phase 2 gives the frontend its own Automerge doc.

  • Add NotebookRequest::SaveToDisk variant to the protocol
  • Handle in handle_notebook_request: call existing save_notebook_to_disk, return success/failure
  • Frontend's save command sends request to daemon via sync handle
  • Move format-on-save (ruff/deno fmt) to the daemon save path
  • For save_notebook_as: frontend handles file dialog, sends new path to daemon, daemon writes, Tauri uses daemon-returned path
  • Remove NotebookState serialization from crates/notebook/src/lib.rs
  • Python bindings for save(path=None)

Phase 2.0-pre: runtimed-wasm bindings ✅

Goal: Ship the WASM bindings that solve the string→Text CRDT type mismatch. Pure additive — no changes to existing code.

  • crates/runtimed-wasm crate with NotebookHandle wasm-bindgen exports
  • wasm-pack build output at apps/notebook/src/wasm/runtimed-wasm/ (345KB gzip)
  • 18 Rust unit tests
  • 15 Deno smoke tests (cell CRUD, sync, concurrent merges, Text CRDT merge)
  • CI: wasm-pack build + Deno test step in build-linux
  • biome configured to exclude generated WASM output
  • CI green, merged to main

Phase 2: Frontend Automerge doc — PR #553

Draft PR #547 was abandoned — it accumulated layers of fixes for the JS string→Text mismatch we didn't understand yet. PR #553 is a clean rewrite using runtimed-wasm.

Goal: The frontend owns a local Automerge document. All document mutations happen instantly on the local doc. React state is derived from it. The Tauri process becomes a sync relay.

Effort: Medium (1-2 weeks). Risk: Low — the WASM eliminates the type mismatch that blocked us. Widgets confirmed working.

Strategy: Feature flag toggle. Build useAutomergeNotebook alongside useNotebook, controlled by localStorage or ?automerge=true URL param. Fresh hook using NotebookHandle from runtimed-wasm — no @automerge/automerge JS dependency.

Status: ✅ Complete. useAutomergeNotebook is the only path. useNotebook.ts and useNotebookDispatch.ts removed. Widgets working.

End state vs pragmatic step: The WASM is the pragmatic unblock. The destination is the frontend owning a proper JS Automerge doc with presence, cursors, and ecosystem plugins (@automerge/codemirror). Once the WASM path is stable and the Tauri relay is simplified, switching back to @automerge/automerge JS with ImmutableString for non-text fields is a well-scoped change. We'll be off the broken path and into the paved one.

Hard-won lessons (from QA and debugging)

Lesson Detail
🔴 JS Automerge string→Text CRDT mismatch is the root cause of phantom cells When JS does d.cells.push({ id: "cell-1", ... }) inside Automerge.change(), ALL string fields become Object(Text) CRDTs. But Rust NotebookDoc::add_cell() creates id, cell_type, execution_count as scalar Str (via doc.put()) and only source as ObjType::Text. The Rust read_str() helper sees Object(Text) where it expects ScalarValue::Str and returns None. The cell IS in the doc, sync worked, but get_cells() can't read it. This is not a bug in Automerge — it's a JS API design choice. The fix: use runtimed-wasm which calls the same Rust NotebookDoc code, so all field types match.
Sync needs multiple roundtrips The Automerge sync protocol is not one-shot. generateSyncMessage / receiveSyncMessage must be called in a loop until both sides return null messages.
Frontend peer state must not exist before GetDocBytes If frontend_peer_state is initialized at task startup, daemon sync acks during cell population buffer stale messages for a peer that doesn't exist yet. Fix: start as None, only init inside GetDocBytes.
@automerge/automerge JS gotchas (for future reference if we return to pure JS) v2 requires import { next as Automerge } for updateText/splice. List mutations use proxy methods (d.cells as any).insertAt(). Scalar strings return as RawString objects (use String() for comparison). Non-text fields need new Automerge.ImmutableString() to avoid the Text CRDT issue.

What to reuse from PR #547

The Rust-side relay infrastructure is sound and reusable. The JS hook needs a clean rewrite.

Keep (from PR #547's Rust changes):

  • raw_sync_tx / connect_split_with_raw_sync() — daemon sync message forwarding
  • get_automerge_doc_bytes command — exports Tauri-side doc as bytes
  • send_automerge_sync command — receives frontend sync messages
  • frontend_peer_state — separate sync state for the frontend peer (deferred init on GetDocBytes)
  • automerge:from-daemon Tauri event — relay to frontend

Rewrite (fresh hook using runtimed-wasm) ✅:

  • useAutomergeNotebook — loads NotebookHandle from WASM, not @automerge/automerge
  • Cell mutations: handle.add_cell(), handle.delete_cell(), handle.update_source() — no Automerge.change(), no proxy methods, no RawString
  • Sync: handle.generate_sync_message() / handle.receive_sync_message() — same relay, different WASM
  • Materialization: handle.get_cells_json() → parse → cellSnapshotsToNotebookCells() → React state
  • Shared CellSnapshot type extracted to materialize-cells.ts (used by both old and new hooks during migration)
  • Widgets working — ipywidgets comm messages flow correctly through the new path

Dropped (bandaid ripped):

  • useNotebook.ts — deleted, useAutomergeNotebook is the only hook
  • useNotebookDispatch.ts — deleted, no more feature flag toggle
  • @automerge/automerge npm dependency — no longer needed
  • All RawString / ImmutableString / proxy method workarounds

✅ Resolved: Phantom cell bug — root cause: JS string→Text CRDT mismatch

Symptom: Frontend loads 1 cell from doc bytes. After daemon sync, a phantom cell appears that doesn't exist in any Rust-side doc. Execution fails with "Cell not found."

Root cause: When JS Automerge creates cells via d.cells.push({ id: "cell-1", ... }) inside Automerge.change(), ALL string fields become Object(Text) CRDTs. But Rust NotebookDoc::add_cell() creates id, cell_type, execution_count as scalar Str. The Rust read_str() helper sees Object(Text) where it expects ScalarValue::Str → returns None. The cell IS in the doc but can't be read.

Not a bug in Automerge — it's a JS API design choice. The fix: runtimed-wasm uses the same Rust code, so field types match.

What we ruled out along the way: race conditions, stale peer state, JS v3 vs v2, initial syncToBackend corruption, sync message decode failures, version mismatch. All were symptoms of the type mismatch.

Approach String types Result
Rust↔Rust (Python bindings) Both scalar Str ✅ Works
JS↔JS (compat test) Both Text ✅ Works (consistent)
JS↔Rust (frontend↔relay) JS Text, Rust expects Str ❌ Phantom cells
WASM↔Rust (Spike C) Both scalar Str ✅ Works

Spikes:

  • A (Python bindings) ✅ — confirmed relay architecture is sound
  • B (Deno FFI) — skipped
  • C (Custom WASM) ✅crates/runtimed-wasm built, 18 Rust + 15 Deno tests pass
  • D (Tauri test window) — planned as verification step
  • E (Minimal repro) — confirmed the string→Text root cause

Sub-PR 2D — Post-migration cleanup — PR #554

Done in PR #554:

Still to do (next PR):

Task Priority Effort Notes
Remove NotebookState cell mutation methods 🟡 Short-term Small update_cell_source, add_cell, delete_cell, find_cell_index on the struct — no longer called
Remove cell_snapshot_to_nbformat helper 🟡 Short-term Small Only used by the removed notebook:updated emission path — verify no other callers
Reduce NotebookSyncClient to pure relay 🟠 Medium-term Medium Remove its local AutoCommit doc — it only needs to forward frames between daemon and frontend
Remove NotebookState struct entirely 🟠 Medium-term Medium After relay simplification — save_notebook and session restore need alternative paths
Extract shared notebook-doc crate 🟠 Medium-term Medium Eliminate WASM copy drift — both runtimed and runtimed-wasm depend on one source

Output streaming: onOutput stays no-opped

onOutput: () => {} — wiring appendOutput causes duplicate outputs. Broadcast appends an output, then Automerge sync replaces all cells with doc state that already has the output. Without a stable output ID to deduplicate, both appear.

Future fix: Add a sequence number or content hash to broadcast output messages so the frontend can skip outputs already present in Automerge state. This is a new feature, not a migration task.

Build pipeline note

crates/runtimed-wasm/src/notebook_doc.rs is currently a copy of crates/runtimed/src/notebook_doc.rs with daemon-only methods removed. When notebook_doc.rs changes (rare — usually daemon-side output operations), the WASM copy must be updated and wasm-pack build re-run. Future cleanup: extract a shared notebook-doc crate that both depend on.

Future improvements (post-migration)

Item Category Notes
Output dedup with broadcast IDs Performance Fix onOutput latency without duplicates
splice_source WASM export Editor integration Character-level CodeMirror edits instead of full-string update_source
@automerge/codemirror evaluation Editor integration Needs either JS Automerge with ImmutableString or custom CM extension calling our WASM splice
Presence / collaborative cursors Collaboration Separate spike — evaluate @automerge/automerge JS with ImmutableString for non-text fields
Profile materialization on large notebooks Performance get_cells_json() + parse on 100 / 500 / 1000 cells
Selective re-materialization Performance Only re-read changed cells instead of full get_cells_json()

Phase 3: Authority boundary hardening

Goal: Formalize which fields each peer can write, so CRDT convergence matches user expectations.

Effort: Small (days). Risk: Low.

Writer roles

Field Writer(s) Rationale
cells list (add/delete/reorder) Frontend, Python agents User/agent intent, instant feedback
cells[i].source (Text CRDT) Frontend, Python agents User types, agent streams code
cells[i].cell_type Frontend, Python agents Toggle code↔markdown
cells[i].outputs Daemon only Kernel produces these
cells[i].execution_count Daemon only Kernel assigns these
metadata.notebook_metadata Frontend, Python agents Dependency management
metadata.runtime Frontend User selects runtime

Conflict scenarios

Scenario Resolution
Two frontends add cell at same position Automerge list CRDT — both cells appear, deterministic order
User and agent edit same cell source concurrently Automerge Text CRDT — character-level merge
Two windows delete same cell Automerge — double-delete is idempotent
Agent appends source while user edits beginning Text CRDT merges cleanly — append is position-independent
Daemon writes outputs while frontend edits source Different fields — no conflict

clear_outputs ownership

Today the frontend clears outputs locally (clearCellOutputs in useNotebook.ts:441) AND the daemon clears via Automerge. In the new model:

  • Daemon owns output clearing — it clears when execution starts
  • Frontend stops rendering stale outputs when it sees execution_started broadcast
  • Remove frontend-side clearCellOutputs mutation on the Automerge doc
  • Frontend can show a visual indicator (dimming) while waiting for new outputs

Phase 4: Optimize Tauri sync relay (optional, future)

Goal: Make the Tauri relay as thin and fast as possible. Tauri stays in the path — the frontend should never speak directly to the Unix socket for security reasons.

Effort: Small. Risk: Low.

Why Tauri stays in the loop

The daemon Unix socket is unauthenticated — any process that can reach it can read/write any notebook. In production this is fine because only local processes connect. But letting the webview open a raw socket (even localhost WebSocket) would let a compromised renderer talk to the daemon without Tauri's mediation. Tauri should own the connection lifecycle: auth, handshake, connection teardown on window close.

Binary IPC precedent

We already pass binary data through Tauri events for widget buffers (useDaemonKernel.ts:316-321) — number[][] round-tripped via JSON. This works but is not ideal for Automerge sync messages which are opaque binary blobs that don't benefit from JSON encoding.

Optimization options

Approach Tradeoff
Tauri Channel<Vec<u8>> Tauri v2 channels support streaming binary data. Use a Rust-side channel to push sync messages to the frontend as raw bytes, and a command to send bytes back. Avoids JSON base64/array encoding overhead.
Tauri custom protocol Register an automerge:// protocol handler. Frontend fetch()es sync messages as binary responses. Good for large payloads but adds HTTP framing overhead for small, frequent sync messages.
Base64 in events Simplest. Encode sync messages as base64 strings in Tauri events. ~33% overhead but Automerge sync messages are typically small (sub-KB for incremental changes). Good enough to start.

Recommendation: start with base64-in-events in Phase 2 (simplest, unblocks everything), then migrate to Channel<Vec<u8>> if profiling shows encoding overhead matters.

The blob store already has its own HTTP server (http://127.0.0.1:{blobPort}/blob/{hash}) for large output payloads, so sync messages only carry manifest hashes, not raw output data. This keeps sync message sizes small.

  • Benchmark sync message sizes in practice (expect sub-KB for source edits, larger for cell add with initial content)
  • If base64 overhead is measurable: migrate to Tauri Channel<Vec<u8>> for binary streaming
  • Profile end-to-end latency: frontend Automerge change → Tauri relay → daemon apply → broadcast → other peer receives

Key files reference

Frontend

File Role
apps/notebook/src/hooks/useNotebook.ts Current notebook state hook (to be replaced)
apps/notebook/src/hooks/useDaemonKernel.ts Daemon kernel execution, broadcasts
apps/notebook/src/App.tsx Top-level component, wires hooks together

Tauri process (crates/notebook)

File Role
crates/notebook/src/lib.rs Tauri commands, sync initialization
crates/notebook/src/notebook_state.rs NotebookState — the struct to eliminate

Daemon (crates/runtimed)

File Role
crates/runtimed/src/notebook_doc.rs NotebookDoc — Automerge doc wrapper, schema
crates/runtimed/src/notebook_sync_client.rs NotebookSyncClient — local Automerge peer
crates/runtimed/src/notebook_sync_server.rs NotebookRoom, daemon-side sync loop
crates/runtimed/src/connection.rs v2 typed frame protocol

Python agent

File Role
crates/runtimed-py/src/session.rs Session — Python API, full Automerge peer
crates/runtimed-py/src/async_session.rs AsyncSession — async variant
python/runtimed/src/runtimed/_mcp_server.py MCP server — AI agent tools

Non-goals

  • Frontend direct access to the daemon socket. The daemon's Unix socket is unauthenticated — any process that connects can read/write any notebook room. Tauri must mediate all daemon communication so the webview renderer never holds a raw socket. This is a security boundary, not a performance optimization to remove later.
  • Replacing the daemon's Automerge with automerge-repo. The daemon's custom sync protocol (v2 typed frames multiplexing sync + requests + broadcasts) is well-suited to the single-doc-per-room model. automerge-repo is designed for multi-doc repos with discovery — unnecessary complexity here.
  • Moving Automerge to the Python agent. The Python agent already has a full Automerge peer via the Rust NotebookSyncClient. No JS Automerge needed on the Python side.
  • Real-time collaborative cursors. Desirable but separate concern. Can be implemented later via Automerge ephemeral messages or the existing broadcast channel.
  • Operational transform for CodeMirror. The @automerge/codemirror plugin exists but is a Phase 2+ consideration. Initial implementation can use updateText on every change event.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions