Description
When a subagent (Task tool) returns a large result (e.g., 50-100k tokens from search results or file reads), the compaction overflow check at finish-step doesn't trigger because it only sees the current step's reported token usage, not the accumulated context that will be sent in the next step.
This causes sessions to hit context limits unexpectedly when the model attempts to process the next step with the large tool output included.
Root Cause Analysis
Location: `packages/opencode/src/session/processor.ts`, line 274
```ts
case "finish-step":
  const usage = Session.getUsage({
    model: input.model,
    usage: value.usage, // ← token count from the AI provider for THIS step only
    metadata: value.providerMetadata,
  })
  // ...
  if (await SessionCompaction.isOverflow({ tokens: usage.tokens, model: input.model })) {
    needsCompaction = true
  }
```
The `usage.tokens` value comes from the AI provider's reported usage for the current step. This does NOT include:
- the token cost of tool outputs that were just generated
- content that will be added to the next step's messages

The `isOverflow` calculation (compaction.ts:35):

```ts
const count = input.tokens.input + input.tokens.cache.read + input.tokens.output
```

This correctly sums the reported tokens, but the input to this function doesn't reflect the true size of the upcoming context.
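To make the gap concrete, here is a self-contained sketch. It is not opencode's actual API: `estimateTokens`, `isOverflow`, and the limit value are illustrative stand-ins, and the numbers are chosen so that the gap crosses the limit. It shows how the provider-reported usage can pass the check while the true upcoming context, with the tool output added, would not:

```typescript
// Illustrative stand-ins; the real Token.estimate and
// SessionCompaction.isOverflow may behave differently.
const USABLE_CONTEXT = 168_000 // assumed usable window after output reservation

// crude estimate: ~4 characters per token
const estimateTokens = (text: string): number => Math.ceil(text.length / 4)

// mirrors compaction.ts:35 — sums only the provider-reported usage
const isOverflow = (tokens: { input: number; cacheRead: number; output: number }): boolean =>
  tokens.input + tokens.cacheRead + tokens.output > USABLE_CONTEXT

// a long conversation whose reported usage still fits...
const reported = { input: 120_000, cacheRead: 0, output: 2_000 }
const passesCheck = !isOverflow(reported) // 122k < 168k → the check passes

// ...but an ~80k-token subagent result is about to join the context
const toolOutput = "x".repeat(320_000)
const upcoming = { ...reported, input: reported.input + estimateTokens(toolOutput) }
const actuallyOverflows = isOverflow(upcoming) // 202k > 168k → the next step overflows
```

The check sees only `reported`, so `needsCompaction` stays false even though the next step is guaranteed to exceed the window.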
Reproduction Scenario
1. Start a session with Claude (200k context)
2. Ask a question that triggers the Task tool with an explore subagent
3. The subagent performs multiple searches/reads and returns ~80k tokens of results
4. The parent session's finish-step shows `usage.tokens.input` of only ~15k (the context before the tool output)
5. Compaction check: 15k < 168k (usable context) → passes → `needsCompaction = false`
6. The next step begins, sending the full conversation plus the 80k tool output

The result is one of the following:
- the model receives truncated context (silent data loss)
- an API error for exceeding the limit
- very slow processing as the model struggles with near-limit context
Expected Behavior
Compaction should trigger before the context actually overflows, ideally:
- after tool outputs are stored but before the next step begins
- by estimating the upcoming context size, including tool outputs
Suggested Fixes (by Opus)
Option A: Estimate context after tool completion
After tool-result processing (line ~193), estimate the new context size:
```ts
case "tool-result": {
  // ... existing code ...
  // After storing the tool output, check if we're approaching overflow
  const outputTokens = Token.estimate(value.output.output)
  const estimatedNext = usage.tokens.input + usage.tokens.output + outputTokens
  if (
    await SessionCompaction.isOverflow({
      tokens: { ...usage.tokens, input: estimatedNext },
      model: input.model,
    })
  ) {
    needsCompaction = true
  }
  break
}
```
Option B: Check overflow before continuing tool loop
In the main loop (around line 337), before continuing to the next step:
```ts
if (needsCompaction) break

// Also check the estimated context for pending tool outputs
const pendingOutputTokens = Object.values(toolcalls)
  .filter((t) => t.state.status === "completed")
  .reduce((sum, t) => sum + Token.estimate(t.state.output || ""), 0)

if (pendingOutputTokens > 0) {
  const estimatedTokens = {
    ...input.assistantMessage.tokens,
    input: input.assistantMessage.tokens.input + pendingOutputTokens,
  }
  if (await SessionCompaction.isOverflow({ tokens: estimatedTokens, model: input.model })) {
    needsCompaction = true
    break
  }
}
```
Option C: Proactive check in prompt.ts
The check in `prompt.ts:502` runs between conversation turns but not during tool loops within a turn. A similar check could be added before each `processor.process()` call.
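A minimal sketch of what such a pre-step guard could look like. All names here are hypothetical (this is not opencode's actual message shape), and `estimateTokens` is a crude stand-in for `Token.estimate`:

```typescript
// Hypothetical guard run before each processor.process() call.
type Part = { type: "text" | "tool-result"; content: string }

// crude estimate: ~4 characters per token
const estimateTokens = (text: string): number => Math.ceil(text.length / 4)

// Estimate the full context the NEXT step will send, including stored tool
// outputs, instead of trusting the previous step's reported usage.
const estimateNextContext = (parts: Part[]): number =>
  parts.reduce((sum, p) => sum + estimateTokens(p.content), 0)

const shouldCompactBeforeStep = (parts: Part[], usableContext: number): boolean =>
  estimateNextContext(parts) > usableContext
```

The key design point is the same as in Options A and B: the decision is based on an estimate of what will actually be sent, not on what the provider reported for the previous step.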
Additional Context
- The existing `prune()` function (compaction.ts:49) helps by removing old tool outputs, but it only runs after compaction is triggered
- The check in `prompt.ts:502` only runs at the start of new conversation turns, not during multi-step tool execution within a single turn
- Models with smaller context windows (Claude Haiku 200k vs Sonnet 200k) may hit this more frequently
- Subagents using the `explore` type are particularly prone to this since they perform many read/search operations
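For reference, an eager prune-style pass could look roughly like the sketch below. This is only an illustration of the idea; the real `prune()` in compaction.ts:49 may use different criteria. It keeps the most recent tool outputs intact and replaces older ones with a placeholder:

```typescript
// Hypothetical message-part shapes, not opencode's actual types.
type ToolPart = { type: "tool-result"; output: string }
type TextPart = { type: "text"; text: string }
type MsgPart = ToolPart | TextPart

// Replace all but the last `keepLast` tool outputs with a short placeholder,
// preserving text parts and message order.
const pruneToolOutputs = (parts: MsgPart[], keepLast = 2): MsgPart[] => {
  const toolIndexes = parts
    .map((p, i) => (p.type === "tool-result" ? i : -1))
    .filter((i) => i !== -1)
  const keep = new Set(toolIndexes.slice(-keepLast))
  return parts.map((p, i) =>
    p.type === "tool-result" && !keep.has(i) ? { type: "tool-result", output: "[pruned]" } : p,
  )
}
```

Running a pass like this eagerly (rather than only after compaction triggers) would shrink exactly the content that this issue identifies as the unaccounted-for context.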
Environment
- OpenCode version: latest main
- Observed with: Opus, Codex
- Typical trigger: Task tool with an explore subagent doing codebase searches
Impact
- High for users with complex multi-step workflows
- Sessions can become unresponsive or lose context unexpectedly
- No warning before overflow occurs
Plugins
No response
OpenCode version
No response
Steps to reproduce
No response
Screenshot and/or share link
No response
Operating System
Ubuntu
Terminal
No response