MEMORY/RUNTIME: persistent agent workshop — total agent control over context

## Thesis
The agent should own its context end-to-end. Today RLM is a paper-spec one-shot (Algorithm 1 from Zhang/Kraska/Khattab, arXiv:2512.24601, implemented in `crates/tui/src/rlm/mod.rs:5-13`): the tool is invoked, a sandboxed Python REPL is born and dies within that call, and one synthesized string comes back. Per the Mismanaged Geniuses Hypothesis (Zhang 2026), the leverage is not the model but the management substrate around it.

Promote the REPL from a per-call sandbox to a *persistent workshop*: lives across turns, accumulates state, and exposes context-control primitives that let the agent build its own context piece by piece. Combined with cache discipline (#528-533), offload-by-default (sibling issue), reasoning-capture (#544), and the memory cluster (#534-539), the agent gets total control over what's resident, what's in the workshop, what's archived — and can branch back through any of it as needed.

## Current behavior
- `crates/tui/src/rlm/mod.rs:21` already runs a long-lived `python3 -u` subprocess over a single stdin/stdout pipe, but only within a single tool call. The subprocess dies when `rlm` returns.
- `crates/tui/src/rlm/bridge.rs` exposes the seam: `RpcDispatcher` handles the `Llm`, `LlmBatch`, `Rlm`, and `RlmBatch` request types. Adding a new helper means adding an RPC variant, a Python wrapper, and a dispatch handler.
- `crates/tui/src/session_manager.rs:174-177` already has `save_checkpoint` writing full conversation state to `~/.deepseek/sessions/checkpoints/latest.json`. The pattern exists; it just doesn't include REPL state.
- `crates/tui/src/cycle_manager.rs:12-16` archives prior cycles to JSONL so future tooling can search them. The pattern for "transient → archived" is already established.
- Existing REPL helpers (per `crates/tui/src/rlm/prompt.rs:18-26`): `context`/`ctx`, `llm_query`, `llm_query_batched`, `rlm_query`, `rlm_query_batched`, `SHOW_VARS`, `FINAL`, `FINAL_VAR`, `print`.
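
The "RPC variant + Python wrapper + dispatch handler" seam can be modeled in a few lines. This is a rough sketch only: the real dispatcher is Rust, and the JSON line format, the `type`/`params`/`result`/`error` field names, and the `make_wrapper` / `make_dispatcher` helpers are all assumptions made for illustration, not the actual `bridge.rs` protocol.

```python
import json


def make_wrapper(kind, transport):
    """Build a REPL-side helper that sends one request of the given kind."""
    def helper(**params):
        # Assumed wire format: one JSON object per request, one per reply.
        reply = json.loads(transport(json.dumps({"type": kind, "params": params})))
        if "error" in reply:
            raise RuntimeError(reply["error"])
        return reply["result"]
    return helper


def make_dispatcher(handlers):
    """Build the host-side function that routes a request line to its handler."""
    def dispatch(line):
        request = json.loads(line)
        handler = handlers.get(request["type"])
        if handler is None:
            return json.dumps({"error": f"unknown request type: {request['type']}"})
        return json.dumps({"result": handler(**request["params"])})
    return dispatch


# Hypothetical handler table; the lambda stands in for a real sub-LLM call.
dispatch = make_dispatcher({"Llm": lambda prompt: f"echo: {prompt}"})
llm_query = make_wrapper("Llm", dispatch)
reply = llm_query(prompt="hello")
```

Under these assumptions, a new helper costs one entry in the handler table and one `make_wrapper` call, which mirrors the three-step recipe described above.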

## Proposed change

### Phase A — Persistent within session
- Don't kill the python3 subprocess between turns. Keep `RpcDispatcher` alive for the session's lifetime.
- New always-loaded tool: `workshop_exec(code)` — executes in the persistent REPL. Variables, helpers, loaded data persist across calls.
- New tool: `workshop_inspect()` — returns the current variable namespace and types from outside the REPL.
- The legacy `rlm` tool stays as a one-shot convenience wrapper for sub-agent dispatch.
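
A minimal in-process sketch of what persistence buys, assuming a single shared namespace per session. The `Workshop` class is a hypothetical stand-in for the real subprocess; only the tool names `workshop_exec` and `workshop_inspect` come from this proposal.

```python
import contextlib
import io


class Workshop:
    """Hypothetical in-process stand-in for the persistent REPL."""

    def __init__(self):
        # One namespace shared by every call; this is what makes state persist.
        self.namespace = {"__name__": "__workshop__"}

    def workshop_exec(self, code: str) -> str:
        """Run code in the shared namespace and return whatever it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()

    def workshop_inspect(self) -> dict:
        """Map each user-visible variable name to its type name."""
        return {
            name: type(value).__name__
            for name, value in self.namespace.items()
            if not name.startswith("__")
        }


workshop = Workshop()
# "Turn 1": load data and define a helper.
workshop.workshop_exec("corpus = ['alpha', 'beta']\ndef count(items):\n    return len(items)")
# "Turn 5": both are still there, with no reload.
output = workshop.workshop_exec("print(count(corpus))")
```

In the real design the namespace would live inside the `python3 -u` subprocess rather than in the host process, so `workshop_inspect` would itself be an RPC round trip.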

### Phase B — Standard orchestration helpers
Add RPC variants + Python wrappers for:
- `verify_with(claim, evidence)` — verifier sub-LLM, returns verified/falsified + reasoning.
- `map_reduce(items, mapper, reducer)` — first-class decomposition primitive.
- `fanout_dedup(prompts)` — parallel + dedup-on-equivalent-output.
- `recall(query, scope)` — calls into MemoryBackend (#535/#536); returns objects, not strings.
- `remember(item, type)` — writes to memory from inside the workshop.
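
Two of these helpers are easy to sketch in plain Python. The version below is illustrative and makes several assumptions: `fanout_dedup` takes an explicit `query` callable (the proposed signature only takes `prompts`), "equivalent output" means equality after whitespace and case normalization, and `fake_llm` stands in for `llm_query`.

```python
from concurrent.futures import ThreadPoolExecutor


def map_reduce(items, mapper, reducer):
    """Apply mapper to each item in parallel, then fold the results with reducer."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        mapped = list(pool.map(mapper, items))  # pool.map preserves input order
    return reducer(mapped)


def fanout_dedup(prompts, query, normalize=lambda text: " ".join(text.lower().split())):
    """Run query over prompts in parallel; keep the first output per normalized form."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        outputs = list(pool.map(query, prompts))
    seen, unique = set(), []
    for output in outputs:
        key = normalize(output)
        if key not in seen:
            seen.add(key)
            unique.append(output)
    return unique


def fake_llm(prompt):
    # Deterministic stand-in for a sub-LLM call so the example is reproducible.
    return prompt.strip().upper()


total_chars = map_reduce(["ab", "cde", "f"], len, sum)
answers = fanout_dedup(["yes", " yes ", "no"], fake_llm)
```

A real `fanout_dedup` might instead judge equivalence with a verifier sub-LLM; string normalization is just the cheapest possible stand-in.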

### Phase C — Agent context-control primitives
The agent owns its context. New tools / REPL helpers:
- `promote_to_context(workshop_var | recall_hit | archive_id)` — pull a workshop result, memory hit, or archived turn back into the parent's resident context.
- `evict_from_context(range)` — drop a section that's no longer earning its place. No summarization — just drop.
- `branch_from(turn_id)` — load a prior turn's reasoning trace + state into the workshop, work it from there.
- `pin(item)` — mark resident; never evict on compaction.
- `archive(item)`: move the item to the cycle archive (`cycle_manager.rs`) without losing it.
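
The semantics of four of these primitives can be pinned down with a toy model. Everything here is a sketch under assumptions: `ResidentContext` and `Entry` are hypothetical, eviction is expressed as an inclusive id range, and pinned entries are assumed to survive explicit eviction as well as compaction.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    entry_id: int
    text: str
    pinned: bool = False


@dataclass
class ResidentContext:
    entries: list = field(default_factory=list)
    archived: dict = field(default_factory=dict)
    next_id: int = 0

    def _get(self, entry_id):
        for entry in self.entries:
            if entry.entry_id == entry_id:
                return entry
        raise KeyError(entry_id)

    def promote_to_context(self, text):
        """Add a workshop result (or recall hit) to resident context; return its id."""
        entry = Entry(self.next_id, text)
        self.next_id += 1
        self.entries.append(entry)
        return entry.entry_id

    def pin(self, entry_id):
        self._get(entry_id).pinned = True

    def evict_from_context(self, first_id, last_id):
        """Drop unpinned entries with ids in [first_id, last_id]; no summary is kept."""
        self.entries = [
            e for e in self.entries
            if e.pinned or not (first_id <= e.entry_id <= last_id)
        ]

    def archive(self, entry_id):
        """Move an entry out of resident context but keep it retrievable."""
        entry = self._get(entry_id)
        self.entries.remove(entry)
        self.archived[entry_id] = entry


ctx = ResidentContext()
task = ctx.promote_to_context("task statement")
noise = ctx.promote_to_context("grep output")
summary = ctx.promote_to_context("workshop summary")
ctx.pin(task)
ctx.evict_from_context(task, noise)  # drops only the unpinned "grep output"
ctx.archive(summary)
resident = [e.text for e in ctx.entries]
```

Whether an explicit `evict_from_context` should override a pin is a design decision this issue leaves open; the sketch picks the conservative answer.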

### Phase D — Cross-session persistence
- Extend `SavedSession` schema (`session_manager.rs:102+`) with a `workshop_state` field.
- Serialize REPL globals (the serializable subset; non-serializable values are re-derived on resume from a recorded `load_file` / `load_context` log).
- New tool: `workshop_resume()` — reloads globals + replays the load log on session restoration.
- Workshop state checkpoints alongside conversation state — same `save_checkpoint` mechanism.
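
One way the "serializable subset plus load log" split could work, as a sketch. The use of `pickle`, the log entry shape (`op`, `var`, `arg`), and the `snapshot_globals` / `resume` function names are assumptions for illustration; only the idea of replaying a recorded `load_file` / `load_context` log comes from this proposal.

```python
import pickle


def snapshot_globals(namespace):
    """Split a namespace into pickled values and the names that could not be pickled."""
    saved, skipped = {}, []
    for name, value in namespace.items():
        if name.startswith("__"):
            continue
        try:
            saved[name] = pickle.dumps(value)
        except Exception:
            # Generators, open handles, lambdas, modules, and similar land here.
            skipped.append(name)
    return saved, skipped


def resume(saved, load_log, loaders):
    """Restore pickled values, then re-derive the rest by replaying the load log."""
    namespace = {name: pickle.loads(blob) for name, blob in saved.items()}
    for entry in load_log:
        namespace[entry["var"]] = loaders[entry["op"]](entry["arg"])
    return namespace


# A generator cannot be pickled, so "stream" must be rebuilt from the log.
live = {"threshold": 3, "rows": [1, 2], "stream": (ch for ch in "abc")}
saved, skipped = snapshot_globals(live)

# Hypothetical log entry and loader; a real loader would read a file or context.
load_log = [{"op": "load_context", "var": "stream", "arg": "abc"}]
loaders = {"load_context": lambda arg: list(arg)}
restored = resume(saved, load_log, loaders)
```

Note that `pickle.loads` executes arbitrary code from the data it reads, so checkpoints restored this way must be trusted; a restricted format such as JSON for plain data would shrink the attack surface raised under open questions.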

### Phase E — Pattern capture to memory
- When a workshop program completes successfully, the auto-extraction subagent (#538) classifies the program shape.
- Successful patterns become `procedural` memories (#536): "for ingesting whalescale receipts, this map_reduce shape worked." Recall surfaces them next time a similar task starts.
- Over time, the workshop learns its own moves.

## Cache discipline integration
Workshop work happens *outside* the parent's context; only the synthesis returns. This protects the cache prefix from haystack pollution. Combined with #528-533 (cache-maximal context defaults), the agent has both: rich cached resident context for active work, plus unlimited workshop scratch for everything else.

## V4 specifics
- Sub-LLM calls in the workshop should use V4 sampling (`temperature=1.0, top_p=1.0`). See #540 for the `bridge.rs:108-109` fix, which this issue depends on for correct sub-call behavior.
- Workshop default child model: `deepseek-v4-flash` (matches current `tools/rlm.rs:24`). Cheap, fast, RL'd hard on tool use.

## Open questions / risks
- **Sandbox + persistence:** REPL globals on disk are a persistent attack surface. Need `workshop_reset()` and auto-quarantine on persistent errors.
- **Cost accounting:** a long-lived workshop accumulates LLM calls across turns. Extend usage tracking from per-`rlm`-call to the session-scoped REPL.
- **Discoverability:** how does the model learn to use the workshop instead of doing things in its own head? Engine hints when it loads >100KB inline (#548) + system-prompt guidance + recall surfacing prior workshop patterns.
- **Memory backend dependency:** Phase B `recall`/`remember` are blocked on #535 + #536 landing.
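
For the cost-accounting question, session-scoped tracking could be as small as a per-turn ledger that is summed on demand. This is a sketch only; `SessionUsage` and its fields are hypothetical and do not reflect the existing usage-tracking types.

```python
from collections import defaultdict


class SessionUsage:
    """Hypothetical session-scoped ledger of workshop sub-LLM usage."""

    def __init__(self):
        # One bucket per turn, so per-turn and whole-session views both stay cheap.
        self.by_turn = defaultdict(lambda: {"calls": 0, "tokens": 0})

    def record(self, turn, tokens):
        bucket = self.by_turn[turn]
        bucket["calls"] += 1
        bucket["tokens"] += tokens

    def session_total(self):
        return {
            "calls": sum(b["calls"] for b in self.by_turn.values()),
            "tokens": sum(b["tokens"] for b in self.by_turn.values()),
        }


usage = SessionUsage()
usage.record(turn=1, tokens=1200)
usage.record(turn=1, tokens=300)
usage.record(turn=5, tokens=800)
total = usage.session_total()
```

Keeping the per-turn buckets means the existing per-call view can still be reported alongside the new session total.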

## Acceptance signals
- Workshop loaded with a corpus on turn 1 is queryable on turn 5 without reloading.
- Helper functions defined in turn 1 are callable in turn 5.
- `recall()` from inside the workshop returns memory objects the model can `.filter()` / `.map()` over.
- After a successful extraction pattern, the same task shape in a later session surfaces the prior pattern as a memory recall hit.
- Measurable shift: fewer "load haystack into context" patterns, more workshop usage; cache hit rate stays high because the parent context isn't getting polluted.
- The agent demonstrates context curation: pulls workshop results into context only when needed, evicts when done, branches from prior turns when revisiting.

