Preflight Checklist
Problem Statement
Tool results consume around 60% of context tokens in agentic Claude Code sessions. I've audited sessions across diverse workloads (coordinator agents, SDK-spawned bots, interactive CLI, subagent swarms) — every session had >49% tool result ratio, worst hit 73.6%.
Bash is the core problem. The same command produces wildly different output sizes depending on runtime state:
Command               Best Case   Worst Case   Ratio
────────────────────  ──────────  ──────────   ─────
git status            5 tok       5,491 tok    1098x
git diff              20 tok      6,211 tok    311x
tail -100 <logfile>   200 tok     6,215 tok    31x
curl <api-endpoint>   50 tok      4,000 tok    80x
npm install           30 tok      3,500 tok    117x
docker logs           10 tok      5,000 tok    500x
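Measurements like those in the table can be approximated locally. The sketch below is not the tooling used for the audit; it assumes the rough 4-characters-per-token heuristic that the example hook later in this issue also uses, and `estimate_tokens` is a hypothetical helper name.

```shell
#!/bin/bash
# Estimate the token count of text on stdin using the rough
# ~4 characters per token heuristic (an approximation, not a tokenizer).
estimate_tokens() {
  wc -c | awk '{ print int($1 / 4) }'
}

# Example usage (results depend entirely on the repository's current state):
#   git status | estimate_tokens
#   git diff   | estimate_tokens
```

Running the same command through the helper in a clean repository and in one with many untracked files should show the kind of spread listed above.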
PreToolUse can't solve this. For Read, the hook checks file size and injects limit/offset - input predicts output. For Bash, the hook sees the command string but is blind to output size. Pattern-matching known-verbose commands (git status → git status -s) becomes a whack-a-mole game that risks losing critical info (pre-commit errors, push rejections) and can never cover the long tail.
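To make the whack-a-mole problem concrete, here is a minimal sketch of what a pattern-matching rewriter looks like. The function name and the two rewrite rules are illustrative assumptions, not taken from any existing tool.

```shell
#!/bin/bash
# Hypothetical PreToolUse-style rewriter: map known-verbose commands to
# terser variants. Anything not listed passes through unchanged.
rewrite_known_verbose() {
  case "$1" in
    "git status") printf '%s\n' "git status -s" ;;
    "git diff")   printf '%s\n' "git diff --stat" ;;
    *)            printf '%s\n' "$1" ;;
  esac
}

# Covered:     rewrite_known_verbose "git status"       prints: git status -s
# Not covered: rewrite_known_verbose "docker logs web"  prints: docker logs web
```

Every command outside the `case` list still reaches context at full size, and each added rule is a new chance to drop output the agent actually needed.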
Claude Code's built-in BASH_MAX_OUTPUT_LENGTH (30K chars) is a blunt head+tail cap - still 10-25x more than semantic compression could achieve, and completely blind to signal vs noise.
The fundamental gap: PreToolUse can gate what goes in but is blind to what comes out. For Bash, input does not predict output. The only clean solution is post-execution result modification.
Proposed Solution
Extend updatedMCPToolOutput to work for all tools (not just MCP), or add a parallel updatedToolOutput field in PostToolUse hook output:
{
  "hookSpecificOutput": {
    "hookEventName": "PostToolUse",
    "updatedToolOutput": "<compressed result string>"
  }
}
When present, Claude Code replaces the tool result in conversation context with this value instead of the original tool_response. The original result is still available to the hook via stdin.
Built-in tool results follow known schemas. A hook using updatedToolOutput is responsible for preserving schema invariants - the same responsibility that updatedMCPToolOutput hooks bear for MCP tools. Tool-side state tracking (Read's file-read history, Write's success status) occurs at execution time, before the result enters context, so replacing the result string does not affect internal bookkeeping.
Implementation scope: The hook input already contains tool_response. The replacement mechanism already exists for MCP tools (updatedMCPToolOutput). The change extends that code path to built-in tools.
Alternative Solutions
I've exhausted every available mechanism:
| Approach | Result |
| --- | --- |
| PreToolUse updatedInput | Works for Read (inject limit/offset). Fails for Bash - can't predict output size from command string. |
| PostToolUse additionalContext | Adds tokens alongside the original result. Makes pollution worse. |
| PostToolUse updatedMCPToolOutput | Exactly the mechanism needed - but only works for MCP tools. |
| CLAUDE.md prompt discipline | Probabilistic - Claude follows ~70% of the time. |
| BASH_MAX_OUTPUT_LENGTH | 30K char blunt cap. Still 10-25x too large, semantically blind. |
| CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS | Global cap. Can't adapt per-call. |
| RTK (third-party PreToolUse rewriter) | Returns permissionDecision: "allow" on every rewritten command, bypassing Claude Code's permission system and safety hooks (rtk-ai/rtk#260). Incompatible with production security guardrails. |
Priority
High - Significant impact on productivity
Feature Category
Configuration and settings
Use Case Example
Scenario: I run a coordinator agent that orchestrates subagents via Claude Code SDK.
- Agent runs git status on a repo with 200+ untracked files → 5,491 tokens dumped into context
- Agent only needed to know "branch is clean, 203 untracked files" → ~50 tokens of actual signal
- A PostToolUse hook would see the full result, compress it to a summary, and return ~200 tokens
- Savings: 94% per call. Across a session with 257 Bash calls, this recovers ~35% of total context budget
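The summary step in this scenario could look roughly like the sketch below. It assumes the hook receives the default long-format `git status` output in English, where each file entry is a tab-indented line under a section header; `summarize_git_status` is a hypothetical helper name.

```shell
#!/bin/bash
# Condense long-format `git status` text (on stdin) into a one-line summary.
# Assumes English section headers and tab-indented file entries.
summarize_git_status() {
  awk '
    BEGIN                             { branch = "(unknown)" }
    /^On branch /                     { branch = $3 }
    /^Changes to be committed:/       { section = "staged" }
    /^Changes not staged for commit:/ { section = "unstaged" }
    /^Untracked files:/               { section = "untracked" }
    /^\t/                             { count[section]++ }
    END {
      printf "branch %s: %d staged, %d unstaged, %d untracked\n",
             branch, count["staged"], count["unstaged"], count["untracked"]
    }'
}

# Example usage:
#   git status | summarize_git_status
```

A real hook would likely fall back to the unmodified result whenever the text does not look like `git status` output (for example a detached HEAD, a merge in progress, or an error message), so that unusual but important states are never summarized away.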
Another scenario:
- Agent runs tail -100 app.log during debugging → 6,215 tokens of log output
- A PostToolUse hook keeps error lines + last 10 lines → ~500 tokens
- Without this feature, I can't even know the output will be large until after execution — PreToolUse is blind to it
This cannot be solved via PreToolUse. The hook doesn't know the output size until the command runs.
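The log scenario's "error lines + last 10 lines" rule can be sketched as a small filter. The keyword list (`error`, `fatal`, `exception`) and the name `compress_log` are assumptions for illustration; a real hook would tune both to its own logs.

```shell
#!/bin/bash
# Keep lines that mention an error keyword, plus the last 10 lines,
# in their original order and without printing any line twice.
compress_log() {
  awk '
    { line[NR] = $0 }
    END {
      for (i = 1; i <= NR; i++)
        if (i > NR - 10 || tolower(line[i]) ~ /error|fatal|exception/)
          print line[i]
    }'
}

# Example usage:
#   tail -100 app.log | compress_log
```

The filter buffers the whole input before printing, which is fine for a tool result that has already been captured but would not suit an unbounded stream.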
Example PostToolUse compression hook
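#!/bin/bash
# PostToolUse hook: replace large Bash results with a head+tail excerpt.
# Relies on the proposed updatedToolOutput field, which does not exist yet.
INPUT=$(cat)
TOOL=$(printf '%s' "$INPUT" | jq -r '.tool_name')
[ "$TOOL" != "Bash" ] && exit 0

RESPONSE=$(printf '%s' "$INPUT" | jq -r '.tool_response // empty')
# Rough token estimate: ~4 characters per token.
TOKEN_EST=$(printf '%s' "$RESPONSE" | wc -c | awk '{print int($1/4)}')
[ "$TOKEN_EST" -lt 1000 ] && exit 0

# With 40 lines or fewer, head and tail would overlap and duplicate output,
# so leave the result unchanged.
TOTAL_LINES=$(printf '%s\n' "$RESPONSE" | wc -l | tr -d ' ')
[ "$TOTAL_LINES" -le 40 ] && exit 0

HEAD=$(printf '%s\n' "$RESPONSE" | head -30)
TAIL=$(printf '%s\n' "$RESPONSE" | tail -10)
COMPRESSED=$(printf '%s\n\n... [%d lines, ~%d tokens, truncated] ...\n\n%s' \
  "$HEAD" "$TOTAL_LINES" "$TOKEN_EST" "$TAIL")

jq -n --arg content "$COMPRESSED" \
  '{hookSpecificOutput: {hookEventName: "PostToolUse", updatedToolOutput: $content}}'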
Additional Context
Prior Art: updatedMCPToolOutput already exists and works for MCP tools (confirmed in docs and via #24788). The replacement mechanism is implemented - this request extends it to built-in tools.
Prior requests (all closed without resolution):
Audit data: 8 sessions, 603 tool calls, 626K total tokens. Tool results averaged ~60% of context. Worst offenders: Bash outliers (5-6K tokens per call), Read full-file dumps (2K avg), Task subagent result blobs (5.9K avg). Estimated 82% of tool result tokens are compactable.
Prompt caching benefit: Deterministic compression hooks produce stable, shorter results - improving prompt cache hit rates across turns.