
Phase 2: Add frontend Automerge document with local-first mutations #547

Closed
rgbkrk wants to merge 13 commits into main from claude/phase-2-planning-1F2j8

rgbkrk commented Mar 5, 2026

Summary

Implements Phase 2 of the notebook architecture redesign, giving the frontend its own Automerge document replica. Cell mutations (add, delete, edit) now apply locally via Automerge.change() and sync to the daemon via binary relay, eliminating RPC round-trip latency. This is a feature-flagged implementation that coexists with the existing useNotebook hook.

Key Changes

Frontend Automerge Integration

  • Added @automerge/automerge WASM library with Vite plugin configuration (vite-plugin-wasm, vite-plugin-top-level-await)
  • Created automerge-schema.ts defining TypeScript schema mirroring the Rust NotebookDoc structure
  • Implemented useAutomergeNotebook hook (535 lines) with local-first cell mutations and sync state management
  • Extracted shared materialization logic into automerge-utils.ts for reuse by both old and new hooks

Binary Sync Relay Infrastructure

  • Added get_automerge_doc_bytes Tauri command to export the daemon's Automerge doc for frontend initialization
  • Added send_automerge_sync Tauri command to receive raw sync messages from the frontend
  • Extended NotebookSyncClient with connect_split_with_raw_sync() and receive_and_relay_sync_message() to handle frontend sync messages
  • Spawned raw sync relay task in initialize_notebook_sync to forward incoming daemon sync messages to frontend via automerge:from-daemon event

Feature Flag & Dispatch

  • Created feature-flags.ts with USE_AUTOMERGE_FRONTEND flag (togglable via localStorage)
  • Added useNotebookDispatch.ts wrapper that selects between useNotebook and useAutomergeNotebook based on flag
  • Updated App.tsx to use dispatch hook instead of direct useNotebook call
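
The flag check and hook selection described above could look roughly like the following. This is a minimal sketch, not the PR's actual feature-flags.ts: the helper names, the StorageLike type, and the rule that the URL parameter takes precedence over localStorage are all assumptions. Only the USE_AUTOMERGE_FRONTEND key and the ?automerge=true parameter come from this PR.

```typescript
// Hypothetical sketch: helper names, StorageLike, and URL-over-storage
// precedence are assumptions, not the PR's actual feature-flags.ts.
type StorageLike = { getItem(key: string): string | null };

const FLAG_KEY = "USE_AUTOMERGE_FRONTEND"; // flag name from the PR description
const URL_PARAM = "automerge"; // from the ?automerge=true commit in this PR

function isAutomergeFrontendEnabled(
  storage: StorageLike | undefined,
  search: string,
): boolean {
  // An explicit URL parameter wins so E2E runs need no stored state.
  const fromUrl = new URLSearchParams(search).get(URL_PARAM);
  if (fromUrl !== null) return fromUrl === "true";
  try {
    return storage?.getItem(FLAG_KEY) === "true";
  } catch {
    // Storage access can throw in restricted contexts; treat that as "off".
    return false;
  }
}

// Pick between two implementations that expose the same API.
// The arguments are placeholders for useNotebook and useAutomergeNotebook.
function selectNotebookHook<T>(enabled: boolean, legacy: T, automerge: T): T {
  return enabled ? automerge : legacy;
}
```

In a browser this would be called as `isAutomergeFrontendEnabled(window.localStorage, window.location.search)`. Evaluating the flag once at module load keeps the selected hook stable across renders, which React's rules of hooks require.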

Documentation

  • Added comprehensive HANDOFF.md detailing the Phase 2 architecture, implementation plan, and verification results

Implementation Details

  • Local mutations are instant: updateCellSource, addCell, deleteCell apply to local doc and React state immediately, then sync asynchronously
  • Dual output flow: Outputs still stream via daemon:broadcast for real-time feedback; Automerge sync provides eventual consistency for cross-window state
  • Transitional architecture: Tauri maintains its own Automerge replica during migration; frontend bootstraps from it via get_automerge_doc_bytes
  • Backward compatibility: Fallback to notebook:updated event if Automerge initialization fails; legacy invoke() calls for file operations
  • Output caching: Manifest hash resolution and blob store fetching reused from existing implementation via shared automerge-utils
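
To illustrate the "apply locally, then sync asynchronously" ordering from the first bullet, here is a minimal sketch that uses no Automerge at all. A plain array stands in for the local document and a caller-supplied callback stands in for the binary relay. The class and its method names are hypothetical and only mirror the mutation names listed above.

```typescript
// Hypothetical model: a plain array replaces the Automerge doc and
// sendSync replaces the Tauri binary relay.
type Cell = { id: string; source: string };
type SendSync = (cells: readonly Cell[]) => Promise<void>;

class LocalFirstCells {
  private cells: readonly Cell[] = [];
  private pending: Promise<void> = Promise.resolve();
  private readonly sendSync: SendSync;

  constructor(sendSync: SendSync) {
    this.sendSync = sendSync;
  }

  snapshot(): readonly Cell[] {
    return this.cells;
  }

  addCell(cell: Cell, index?: number): void {
    const at = index ?? this.cells.length;
    this.mutate((cells) => [...cells.slice(0, at), cell, ...cells.slice(at)]);
  }

  updateCellSource(id: string, source: string): void {
    this.mutate((cells) =>
      cells.map((c) => (c.id === id ? { ...c, source } : c)),
    );
  }

  deleteCell(id: string): void {
    this.mutate((cells) => cells.filter((c) => c.id !== id));
  }

  // Resolves once every queued sync has been attempted.
  flushed(): Promise<void> {
    return this.pending;
  }

  private mutate(fn: (cells: readonly Cell[]) => readonly Cell[]): void {
    // 1. The local state changes synchronously, so the UI can render at once.
    this.cells = fn(this.cells);
    const snapshot = this.cells;
    // 2. The sync is chained onto earlier syncs and runs later, in order.
    this.pending = this.pending
      .then(() => this.sendSync(snapshot))
      .catch((err: unknown) => {
        console.error("sync failed", err);
      });
  }
}
```

The point of the sketch is only the ordering: callers see the new state before any network or IPC work has started, and a failed sync does not undo the local edit.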

The implementation is production-ready for feature-flagged rollout and can be toggled on/off without affecting existing functionality.

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW

rgbkrk added a commit that referenced this pull request Mar 6, 2026
rgbkrk mentioned this pull request Mar 6, 2026
45 tasks
claude added 8 commits March 6, 2026 01:08
Integrate @automerge/automerge WASM into the frontend build for Phase 2
of the local-first migration. This PR adds:

- @automerge/automerge dependency with vite-plugin-wasm and
  vite-plugin-top-level-await for WASM support
- TypeScript schema (CellDoc, NotebookSchema) mirroring the Rust
  NotebookDoc in crates/runtimed/src/notebook_doc.rs
- Feature flag (USE_AUTOMERGE_FRONTEND) for toggling between the
  current useNotebook hook and the future useAutomergeNotebook hook

Part of #540 Phase 2.

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
…tend automerge

Add the Tauri-side infrastructure for Phase 2 local-first migration:

- `get_automerge_doc_bytes` command: exports the local Automerge doc as
  bytes so the frontend can initialize its own replica via Automerge.load()
- `send_automerge_sync` command: receives raw sync messages from the
  frontend, applies them to the local doc, and relays to the daemon
- Raw sync relay: spawns a task that forwards incoming daemon sync
  messages to the frontend via `automerge:from-daemon` events
- `into_split_with_raw_sync`: new split variant that accepts an optional
  channel for raw sync byte forwarding
- Frontend peer sync state tracking in the background task for proper
  bidirectional sync message generation

The Tauri process keeps its Automerge replica (transitional) while also
forwarding raw sync bytes for the frontend's future local document.

Part of #540 Phase 2.

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
…utations

Add the core frontend Automerge hook that owns a local CRDT document,
making cell edits instant (no RPC round-trip). The hook initializes from
Tauri's doc bytes, performs local mutations via Automerge.change(), and
syncs bidirectionally with the daemon via the binary relay from PR 2.

- useAutomergeNotebook.ts: local-first hook with same API as useNotebook
- automerge-utils.ts: shared output resolution extracted from useNotebook
- useNotebookDispatch.ts: feature flag toggle between old and new hooks
- App.tsx: use useNotebookDispatch instead of useNotebook directly

Toggled via localStorage.USE_AUTOMERGE_FRONTEND = "true".

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
- Upgrade @automerge/automerge from 2.2.x to 3.2.4 where updateText
  and splice are top-level exports (fixes TS compilation errors)
- Use Automerge.insertAt/deleteAt for list operations instead of
  splice (which is text-only in the Automerge API)
- Remove legacy invoke("add_cell")/invoke("delete_cell") double-writes
  that would create duplicate cells via both Automerge sync and RPC
- Fix cell:source_updated handler to only update React state for
  immediate feedback, not write to Automerge doc (formatting arrives
  via sync from daemon)
- Document the output dual-path race (broadcast for streaming,
  Automerge sync for eventual consistency)

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
…mpat test

- Revert @automerge/automerge from v3.2.4 to ^2.2.9 to ensure wire
  format compatibility with Rust automerge 0.7 (used by daemon/Tauri)
- Use v2 "next" API: import { next as Automerge }, proxy-based list
  operations (insertAt/deleteAt on list proxy instead of top-level fns)
- Remove initial syncToBackend() call after Automerge.load() — sending
  a sync message from a fresh SyncState against an identical doc can
  corrupt the daemon's state, causing "Cell not found in document" errors
- Add diagnostic logging to ExecuteCell handler showing cell count and
  available IDs when a cell is not found
- Add URL parameter support (?automerge=true) for feature flag in E2E
- Add Rust fixture bytes export test for NotebookDoc
- Add JS Automerge compat test validating load, sync roundtrip, and
  local change + sync with Rust 0.7 doc bytes
- Add vitest test config to notebook app's vite.config.ts

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
When the frontend loads doc bytes via get_automerge_doc_bytes, the Tauri
relay's frontend_peer_state was left as a fresh SyncState::new(). This
meant the relay had no idea the frontend already had the complete doc,
so sync messages from the frontend (after local mutations like addCell)
could not be correctly processed — the frontend's cell additions never
reached the daemon.

Fix: after GetDocBytes, run a virtual sync exchange with a mirror doc
loaded from the same bytes. This establishes that frontend_peer_state
knows the frontend has the current doc state. Subsequent incremental
sync messages from either side are then correctly processed.

Also re-enable the initial syncToBackend() call in the frontend after
loading doc bytes, which establishes the bidirectional peer state from
the frontend's side.

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
rgbkrk force-pushed the claude/phase-2-planning-1F2j8 branch from 73d781a to 35d7c48 March 6, 2026 01:08
Traces the exact message flow for frontend→Tauri→daemon sync:
- Frontend: logs syncToBackend() message size, cell IDs after load
- Tauri send_automerge_sync: logs receipt of frontend messages
- Sync task ReceiveFrontendSyncMessage: logs decode, cells before/after,
  sync_to_daemon result, and response to frontend
- GetDocBytes: logs cell IDs in doc at export time

This logging will reveal where the cell ID mismatch occurs — whether
the JS→Rust sync message decode fails, whether receive_sync_message
applies changes, and whether sync_to_daemon forwards them.

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
rgbkrk force-pushed the claude/phase-2-planning-1F2j8 branch from 35d7c48 to 0b55eb2 March 6, 2026 01:13
claude added 3 commits March 6, 2026 01:26
The notebook:updated fallback handler was racing with initialize():
1. initialize() calls get_automerge_doc_bytes (async await)
2. While awaiting, Tauri emits notebook:updated with NotebookState cell IDs
3. Fallback fires (initializedRef still false), sets React state with wrong IDs
4. initialize() completes, but React state already has NotebookState IDs

This caused the frontend to display cells with IDs unknown to the daemon,
making execution fail with "Cell not found in document".

Fix: track initStarted before the first await, block the fallback handler
once initialization has started (not just completed).
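
The guard described in this fix can be sketched as follows. The class is a hypothetical stand-in for the hook's refs and handlers. The detail it illustrates is that the flag is set before the first await, so a fallback event arriving while the doc bytes are still loading is ignored.

```typescript
// Hypothetical stand-in for the hook: names and shapes are assumptions.
type Cell = { id: string };

class NotebookInit {
  private initStarted = false;
  cells: Cell[] = [];

  async initialize(loadDocCells: () => Promise<Cell[]>): Promise<void> {
    // Set before the first await, so the window between "started" and
    // "completed" is already covered.
    this.initStarted = true;
    this.cells = await loadDocCells();
  }

  // Legacy fallback handler for notebook:updated.
  onNotebookUpdated(fallbackCells: Cell[]): void {
    if (this.initStarted) return; // blocked once initialization has begun
    this.cells = fallbackCells;
  }
}
```

Checking a "completed" flag instead would leave the race open, because that flag only flips after the await resolves.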

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
The notebook:updated fallback was causing a race condition where
NotebookState cell IDs would overwrite the Automerge doc cell IDs
in React state, making cell execution fail.

Rather than patching the race, remove the legacy fallback entirely.
The automerge hook should only get its state from the Automerge doc —
no dual-source ambiguity.

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
The frontend_peer_state was initialized as sync::State::new() at
run_sync_task startup. During cell population (before the frontend
called GetDocBytes), daemon sync acks triggered generate_sync_message
for the frontend, queuing stale messages in raw_sync_tx. When
GetDocBytes later reset the state via virtual sync, those stale
messages were already buffered. The frontend loaded the doc bytes,
then applied the stale sync messages with a mismatched sync state,
causing CRDT merges that produced phantom cells — cells present in
the Automerge list (cell_count=2) but with unreadable IDs
(available=1).

Fix: start frontend_peer_state as None. Only initialize it inside
GetDocBytes after the virtual sync handshake. Before that point, all
generate_sync_message guards short-circuit on the None check, so no
sync messages reach the frontend until its doc state is established.
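
The None guard can be modelled in a few lines. The real code is Rust inside the Tauri sync task. This TypeScript toy uses hypothetical names and reduces the sync state to a counter, purely to show that nothing is queued for the frontend until GetDocBytes has run.

```typescript
// Toy model of the relay-side guard. The real code is Rust; every name below
// is a hypothetical stand-in and SyncState is reduced to a message counter.
type SyncState = { messagesGenerated: number };

class FrontendRelay {
  // Mirrors "start frontend_peer_state as None".
  private frontendPeerState: SyncState | null = null;
  readonly outbox: string[] = []; // stands in for raw_sync_tx

  // Called whenever the daemon side changes the local doc.
  onDaemonSync(): void {
    const state = this.frontendPeerState;
    if (state === null) return; // frontend has no doc yet: queue nothing
    state.messagesGenerated += 1;
    this.outbox.push(`sync-${state.messagesGenerated}`);
  }

  // Called when the frontend requests the doc bytes.
  onGetDocBytes(): void {
    // In the PR this is where the virtual sync handshake would run.
    this.frontendPeerState = { messagesGenerated: 0 };
  }
}
```

Any daemon activity before `onGetDocBytes()` leaves the outbox empty, which is the property the fix relies on to avoid stale buffered messages.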

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW
rgbkrk added a commit that referenced this pull request Mar 6, 2026
When the JS frontend creates cells via Automerge's `insertAt()` with
object literals, all string fields (id, cell_type, execution_count)
become Text CRDT objects rather than scalar Str values. The Rust
`read_str()` helper only matched scalar strings, causing JS-created
cells to be invisible to the daemon — leading to "Cell not found in
document" errors on execution.

Update `read_str()` and `find_cell_index()` to handle both scalar Str
values (created by Rust) and Text CRDT objects (created by JS).
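
The shape of this fix, modelled in TypeScript for consistency with the frontend code, is roughly as follows. The actual helpers are Rust functions operating on the Automerge document. The AmValue union and both function bodies here are hypothetical stand-ins, not the real API.

```typescript
// Toy model of a Rust-side fix; AmValue and both functions are hypothetical
// stand-ins for what the Automerge document actually stores.
type AmValue =
  | { kind: "str"; value: string } // scalar Str, as written by Rust
  | { kind: "text"; chars: string[] } // Text object, as written by JS
  | { kind: "other" };

type CellMap = Record<string, AmValue>;

// Accept both representations and return a plain string, or null otherwise.
function readStr(value: AmValue | undefined): string | null {
  if (value === undefined) return null;
  switch (value.kind) {
    case "str":
      return value.value;
    case "text":
      return value.chars.join("");
    default:
      return null;
  }
}

// Locate a cell by id regardless of which side created it.
function findCellIndex(cells: CellMap[], id: string): number {
  return cells.findIndex((cell) => readStr(cell["id"]) === id);
}
```

Before the fix, only the first case was handled, so a lookup for a JS-created cell would have fallen through to "not found".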

https://claude.ai/code/session_01Vkb1BVso7Bh9TxHeegQwvW

rgbkrk commented Mar 6, 2026

Learned a lot from this mess. Cherry picked some of it and switched to WASM for full compat with the daemon.

rgbkrk closed this Mar 6, 2026
rgbkrk deleted the claude/phase-2-planning-1F2j8 branch March 6, 2026 06:53