RUNTIME: default-route large tool outputs through the workshop (protect parent context) #548

## Thesis
The parent model's context is precious: every token in it should earn its place. Today, large tool outputs (file reads, grep haystacks, web fetches, archive search hits, command output) are truncated to fit and dumped into the parent's context, polluting the cache prefix and spending the most expensive token budget on raw haystacks. Per MGH (Zhang 2026), decomposition is the leverage. The proposal: by default, route any tool output above a size threshold through the workshop (#547). The workshop processes the output with V4-Flash and returns a synthesis, so by default the parent only ever sees the answer.

This generalizes #511 (currently scoped to sub-agent output) to *any* tool output, and closes the loop with #547: the workshop is the substrate, this issue is the policy.

## Current behavior
- `read_file` returns up to a hard cap; large files are dumped raw into the parent context.
- `grep_files` returns the top-N hits and truncates with a marker; the raw text goes into the parent.
- `recall_archive` runs BM25 over JSONL cycles; raw hits go into the parent.
- `web_search` / `fetch_url` results are dumped raw.
- `exec_shell` output is capped, then goes raw into the parent.
- `agent_*` sub-agent outputs are sometimes summarized (#511 in flight), sometimes not.

Result: a single haystack tool call can blow the cache prefix and cost more than the actual reasoning work.

## Proposed change
- **Default policy:** any tool output > N tokens (initial: 4K) routes through the workshop instead of returning raw to the parent.
- **Mechanism:** tool result is loaded as a workshop variable (e.g. `last_tool_result`); a synthesis subagent (V4-Flash, cheap) processes it against the original tool intent; only the synthesis returns to the parent.
- **Override:** the model can request raw output via a tool parameter (`raw=true`) when it knows it needs the full thing in resident context.
- **Promotion:** the workshop variable persists; the model can later call `promote_to_context(last_tool_result)` (per Phase C of the workshop issue) to pull the full thing back if synthesis missed something.
- **Per-tool tunable threshold:** some tools are more haystack-shaped than others, so the threshold is tunable per tool: higher for `read_file`, lower for `grep_files`.
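
The routing policy above can be sketched roughly as follows. This is a minimal illustration, not the runtime's actual code: `Workshop`, `route_tool_output`, and `estimate_tokens` are hypothetical names, the 4-characters-per-token estimate is an assumption standing in for a real tokenizer, and the synthesizer is passed in as a plain callable where the real system would call V4-Flash.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

DEFAULT_THRESHOLD_TOKENS = 4_000  # the issue's initial value for N


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: ~4 characters per token (assumption)."""
    return len(text) // 4


@dataclass
class Workshop:
    """Hypothetical stand-in for the #547 workshop: named variables plus a synthesizer."""
    synthesize: Callable[[str, str], str]  # (intent, raw_output) -> synthesis
    variables: Dict[str, str] = field(default_factory=dict)

    def promote_to_context(self, name: str) -> str:
        """Return the full stored output so the parent can pull it back in."""
        return self.variables[name]


def route_tool_output(
    output: str,
    intent: str,
    workshop: Workshop,
    threshold: int = DEFAULT_THRESHOLD_TOKENS,
    raw: bool = False,
) -> str:
    """Return what the parent should see for one tool result."""
    # Small outputs, and explicit raw=True requests, bypass the workshop.
    if raw or estimate_tokens(output) <= threshold:
        return output
    # Large outputs are parked in the workshop so they can be promoted later.
    workshop.variables["last_tool_result"] = output
    # Only the synthesis, produced against the original intent, goes to the parent.
    return workshop.synthesize(intent, output)
```

In this sketch the stored variable is overwritten on every large result; whether the real workshop keeps a history of results per call is left open here.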

## Cache discipline
This is the cache-protection mechanism. Without it, #541 (cache-maximal context default) is fragile: one bad grep blows the prefix. With it, the parent's cache prefix stays stable and haystack work happens off to the side.

## Open questions / risks
- **Threshold tuning:** 4K is a starting estimate. Too low → workshop overhead on small results. Too high → cache pollution on medium results.
- **Synthesis quality:** Flash-summarized tool output may miss details. Mitigated by promotion path + `raw=true` opt-out, but worth measuring on real tasks.
- **Latency:** workshop synthesis adds an LLM call per large tool output. Cheap (Flash) but real. May want streaming synthesis.
- **Tool authorship:** new tools need to declare a default threshold or inherit a sensible one.
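
One possible shape for the "declare or inherit" rule in the tool-authorship question is a small lookup with a global fallback. The sketch below is only an illustration: `threshold_for` and `TOOL_THRESHOLDS` are hypothetical names, and the per-tool numbers are placeholders chosen to show the direction (higher for `read_file`, lower for `grep_files`), not measured values.

```python
from typing import Dict, Optional

GLOBAL_DEFAULT_TOKENS = 4_000  # the issue's initial threshold

# Placeholder per-tool overrides; the values are illustrative, not tuned.
TOOL_THRESHOLDS: Dict[str, int] = {
    "read_file": 8_000,   # less haystack-shaped, so a higher threshold
    "grep_files": 2_000,  # more haystack-shaped, so a lower threshold
}


def threshold_for(tool_name: str, declared: Optional[int] = None) -> int:
    """Resolve a tool's routing threshold in tokens.

    Precedence: a threshold declared by the tool itself, then a per-tool
    override from the table, then the global default.
    """
    if declared is not None:
        return declared
    return TOOL_THRESHOLDS.get(tool_name, GLOBAL_DEFAULT_TOKENS)
```

With this arrangement a new tool that declares nothing still gets routed at the global default, which keeps the policy on by default for tools written later.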

## Acceptance signals
- A `read_file` on a 100KB file returns a synthesis to the parent, with the full content available in the workshop for promotion.
- `grep_files` returning 200 hits returns a ranked summary, not raw matches.
- Cache hit rate on long V4 sessions stays high even when many large tool calls happen.
- Override via `raw=true` works and bypasses the workshop.
- Token cost per session for haystack-heavy work drops measurably (Flash synthesis cheaper than Pro reading raw).
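
The last signal can be made concrete with a rough break-even condition. This is a simplified single-read model under stated assumptions, not a measured result: all symbols are introduced here for illustration, and it ignores caching discounts, later-turn re-reads of resident context, and promotion. Let $T$ be the raw output size in tokens, $S$ the synthesis size, $p_P$ the parent (Pro) input price per token, $p_F$ the Flash input price, and $q_F$ the Flash output price.

```latex
\begin{aligned}
C_{\text{raw}}    &= T\,p_P \\
C_{\text{routed}} &= T\,p_F + S\,q_F + S\,p_P \\
C_{\text{routed}} < C_{\text{raw}}
  &\iff \frac{S}{T} < \frac{p_P - p_F}{p_P + q_F}
\end{aligned}
```

Under these assumptions, routing pays off whenever the synthesis is a small enough fraction of the raw output; because raw output left in the parent context is re-read on later turns, the real saving is likely larger than this single-read bound suggests.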

