Thesis
The agent should own its context end-to-end. Today RLM is paper-spec one-shot (Algorithm 1 from Zhang/Kraska/Khattab arXiv:2512.24601, implemented in crates/tui/src/rlm/mod.rs:5-13): tool invoked, sandboxed Python REPL is born and dies, returns one synthesized string. Per the Mismanaged Geniuses Hypothesis (Zhang 2026), the leverage isn't the model — it's the management substrate around it.
Promote the REPL from a per-call sandbox to a persistent workshop: lives across turns, accumulates state, and exposes context-control primitives that let the agent build its own context piece by piece. Combined with cache discipline (#528-533), offload-by-default (sibling issue), reasoning-capture (#544), and the memory cluster (#534-539), the agent gets total control over what's resident, what's in the workshop, what's archived — and can branch back through any of it as needed.
Current behavior
crates/tui/src/rlm/mod.rs:21 already runs a long-lived python3 -u subprocess via single stdin/stdout pipe — but only within a single tool call. Subprocess dies when rlm returns.
crates/tui/src/rlm/bridge.rs exposes the seam: RpcDispatcher handles Llm, LlmBatch, Rlm, RlmBatch request types. Adding new helpers = add an RPC variant + Python wrapper + dispatch handler.
crates/tui/src/session_manager.rs:174-177 already has save_checkpoint writing full conversation state to ~/.deepseek/sessions/checkpoints/latest.json. Pattern exists; just doesn't include REPL state.
crates/tui/src/cycle_manager.rs:12-16 archives prior cycles to JSONL so future tooling can search. Pattern for "transient → archived" already established.
… crates/tui/src/rlm/prompt.rs:18-26): context/ctx, llm_query, llm_query_batched, rlm_query, rlm_query_batched, SHOW_VARS, FINAL, FINAL_VAR, print.
Proposed change
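Each phase below leans on the seam noted under Current behavior: a new helper is an RPC variant, a Python wrapper, and a dispatch handler. A minimal sketch of that three-part shape, assuming a line-delimited JSON wire format. The format, the Dedup variant, and every function name here are hypothetical, and the host side is a Python stand-in for the Rust RpcDispatcher so the example can run on its own:

```python
import json
from itertools import count

_request_ids = count(1)  # hypothetical: ids let a reply be matched to its request


def encode_request(kind, payload):
    """Wrapper side: build one request line for the host."""
    return json.dumps({"id": next(_request_ids), "kind": kind, "payload": payload})


def dispatch(line, handlers):
    """Host side (stand-in for the dispatch handler): route by request kind."""
    request = json.loads(line)
    handler = handlers.get(request["kind"])
    if handler is None:
        reply = {"id": request["id"], "error": "unknown request kind: " + request["kind"]}
    else:
        reply = {"id": request["id"], "result": handler(request["payload"])}
    return json.dumps(reply)


def decode_response(line):
    """Wrapper side: unwrap the result or surface the host's error."""
    reply = json.loads(line)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


# 1. The new variant ("Dedup") and 3. its handler, registered on the host.
HANDLERS = {"Dedup": lambda payload: list(dict.fromkeys(payload["items"]))}


# 2. The Python wrapper that REPL code would call.
def dedup(items):
    line = encode_request("Dedup", {"items": items})
    return decode_response(dispatch(line, HANDLERS))


unique = dedup(["a", "b", "a", "c", "b"])
print(unique)
```

In the real bridge the request and reply would cross the stdin/stdout pipe instead of a direct function call; the sketch only shows where the three pieces sit relative to each other.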
Phase A — Persistent within session
… RpcDispatcher alive for the session's lifetime.
workshop_exec(code) — executes in the persistent REPL. Variables, helpers, loaded data persist across calls.
workshop_inspect() — returns the current variable namespace and types from outside the REPL.
rlm tool stays as a one-shot convenience wrapper for sub-agent dispatch.
Phase B — Standard orchestration helpers
Add RPC variants + Python wrappers for:
verify_with(claim, evidence) — verifier sub-LLM, returns verified/falsified + reasoning.
map_reduce(items, mapper, reducer) — first-class decomposition primitive.
fanout_dedup(prompts) — parallel + dedup-on-equivalent-output.
recall(query, scope) — calls into MemoryBackend (MEMORY: substrate — MemoryBackend trait + SQLite-backed store (FTS5) #535 / MEMORY: engine — typed memory model + multi-signal recall #536); returns objects, not strings.
remember(item, type) — writes to memory from inside the workshop.
Phase C — Agent context-control primitives
The agent owns its context. New tools / REPL helpers:
promote_to_context(workshop_var | recall_hit | archive_id) — pull a workshop result, memory hit, or archived turn back into the parent's resident context.
evict_from_context(range) — drop a section that's no longer earning its place. No summarization — just drop.
branch_from(turn_id) — load a prior turn's reasoning trace + state into the workshop, work it from there.
pin(item) — mark resident; never evict on compaction.
archive(item) — move to cycle archive (#cycle_manager) without losing it.
Phase D — Cross-session persistence
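Extend SavedSession schema (session_manager.rs:102+) with a workshop_state field.
Serialize REPL globals (the serializable subset; non-serializable values are re-derived on resume from a recorded load_file / load_context log).
New tool: workshop_resume() — reloads globals + replays the load log on session restoration.
Workshop state checkpoints alongside conversation state — same save_checkpoint mechanism.
Phase E — Pattern capture to memory
… procedural memories (MEMORY: engine — typed memory model + multi-signal recall #536): "for ingesting whalescale receipts, this map_reduce shape worked." Recall surfaces them next time a similar task starts.
Workshop literally learns its own moves over time.
Cache discipline integration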
Workshop work happens outside the parent's context — only synthesis returns. This protects the cache prefix from haystack pollution. Combined with #528-533 (cache-maximal context defaults), the agent has both: rich cached resident context for active work, plus unlimited workshop scratch for everything else.
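A minimal sketch of why "only synthesis returns" protects the cache prefix, together with the Phase C pin / evict / promote primitives. The ResidentContext class, its method signatures, and the key-based evict (the proposal takes a range) are all hypothetical simplifications, not the planned API:

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    key: str
    text: str
    pinned: bool = False


@dataclass
class ResidentContext:
    """Hypothetical model of the parent's resident context as an ordered list."""
    entries: list = field(default_factory=list)

    def promote(self, key, text):
        # Append only: everything already resident keeps its position.
        self.entries.append(Entry(key, text))

    def pin(self, key):
        for entry in self.entries:
            if entry.key == key:
                entry.pinned = True

    def evict(self, key):
        # Drop outright, no summary; pinned entries are never removed.
        self.entries = [e for e in self.entries if e.pinned or e.key != key]

    def render(self):
        return "\n".join(e.text for e in self.entries)


ctx = ResidentContext()
ctx.promote("system", "You are the parent agent.")
ctx.pin("system")
ctx.promote("task", "Find every receipt over $500.")
before = ctx.render()

# The haystack lives in the workshop namespace, never in the resident context.
workshop = {"corpus": ["receipt %d: $%d" % (i, i * 10) for i in range(100)]}
big = [line for line in workshop["corpus"] if int(line.split("$")[1]) > 500]
synthesis = "%d receipts exceed $500" % len(big)

ctx.promote("receipts", synthesis)
after = ctx.render()
prefix_intact = after.startswith(before)  # the old context is an exact prefix

ctx.evict("task")
ctx.evict("system")  # ignored because the entry is pinned
remaining = [e.key for e in ctx.entries]
print(prefix_intact, remaining)
```

The 100-line corpus never enters the rendered context; one short synthesis line is appended after the existing entries, so whatever was cached before the workshop call is still a valid prefix afterwards.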
V4 specifics
… temperature=1.0, top_p=1.0) — see V4: sampling defaults + thinking-mode parameter wiring #540 for the bridge.rs:108-109 fix that this issue depends on for correct sub-call behavior.
… deepseek-v4-flash (matches current tools/rlm.rs:24). Cheap, fast, RL'd hard on tool use.
Open questions / risks
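… workshop_reset() and auto-quarantine on persistent errors.
… recall / remember are blocked on MEMORY: substrate — MemoryBackend trait + SQLite-backed store (FTS5) #535 + MEMORY: engine — typed memory model + multi-signal recall #536 landing.
Acceptance signals
Workshop loaded with a corpus on turn 1 is queryable on turn 5 without reloading.
Helper functions defined in turn 1 are callable in turn 5.
recall() from inside the workshop returns memory objects the model can .filter() / .map() over.
After a successful extraction pattern, the same task shape in a later session surfaces the prior pattern as a memory recall hit.
Measurable shift: fewer "load haystack into context" patterns, more workshop usage; cache hit rate stays high because the parent context isn't getting polluted.
The agent demonstrates context curation: pulls workshop results into context only when needed, evicts when done, branches from prior turns when revisiting.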
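The first two signals can be checked mechanically. A minimal sketch, assuming an in-process namespace as a stand-in for the persistent python3 subprocess. The Workshop class and its snapshot method are hypothetical; the method names workshop_exec and workshop_inspect are borrowed from Phase A, but the behavior shown is only one possible reading of them:

```python
import json


class Workshop:
    """Toy stand-in for the persistent REPL: one namespace for the whole session."""

    def __init__(self):
        self.namespace = {}

    def workshop_exec(self, code):
        # Every call runs against the same dict, so names persist across turns.
        exec(code, self.namespace)

    def workshop_inspect(self):
        # Variable names and type names, skipping interpreter internals.
        return {
            name: type(value).__name__
            for name, value in self.namespace.items()
            if not name.startswith("__")
        }

    def snapshot(self):
        # Hypothetical take on the Phase D "serializable subset": keep only
        # values that survive json.dumps; the rest would be re-derived on resume.
        kept = {}
        for name, value in self.namespace.items():
            if name.startswith("__"):
                continue
            try:
                json.dumps(value)
            except (TypeError, ValueError):
                continue
            kept[name] = value
        return kept


ws = Workshop()

# Turn 1: load a corpus and define a helper.
ws.workshop_exec("corpus = ['alpha beta', 'beta gamma', 'gamma delta']")
ws.workshop_exec("def grep(word):\n    return [line for line in corpus if word in line]")

# Turn 5: both are still there, with no reload.
ws.workshop_exec("hits = grep('beta')")

print(ws.namespace["hits"])
print(ws.workshop_inspect())
print(sorted(ws.snapshot()))
```

The grep helper is dropped from the snapshot because a function is not JSON-serializable, which is the case a recorded load log would have to cover on resume.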