
Compaction overflow check doesn't account for large tool outputs until the next step, causing context overflow #10634

@maxsonderby

Description

When a subagent (Task tool) returns a large result (e.g., 50-100k tokens from search results or file reads), the compaction overflow check at finish-step doesn't trigger because it only sees the current step's reported token usage, not the accumulated context that will be sent in the next step.

This causes sessions to hit context limits unexpectedly when the model attempts to process the next step with the large tool output included.

Root Cause Analysis

Location: packages/opencode/src/session/processor.ts line 274

case "finish-step":
  const usage = Session.getUsage({
    model: input.model,
    usage: value.usage,         // ← Token count from AI provider for THIS step only
    metadata: value.providerMetadata,
  })
  // ...
  if (await SessionCompaction.isOverflow({ tokens: usage.tokens, model: input.model })) {
    needsCompaction = true
  }

The usage.tokens value comes from the AI provider's reported usage for the current step. It does NOT include:

  • the token cost of tool outputs that were just generated
  • content that will be added to the next step's messages

The isOverflow calculation (compaction.ts:35):

const count = input.tokens.input + input.tokens.cache.read + input.tokens.output

correctly sums the reported tokens, but its input doesn't reflect the true size of the upcoming context.
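To make the gap concrete, here is a small sketch of the arithmetic (the figures come from the reproduction scenario; the types and names are illustrative stand-ins, not the actual opencode API):

```typescript
// Illustrative sketch: why the per-step count passes while the real
// next-step context is far larger. Not the real opencode types.
interface Tokens { input: number; output: number; cacheRead: number }

// Same sum as the isOverflow count in compaction.ts:35.
function countTokens(t: Tokens): number {
  return t.input + t.cacheRead + t.output;
}

const usableContext = 168_000;                  // usable window for a 200k model
const reported: Tokens = { input: 15_000, output: 2_000, cacheRead: 0 };
const pendingToolOutput = 80_000;               // subagent result, not yet counted

const checked = countTokens(reported);          // what isOverflow actually sees
const actualNext = checked + pendingToolOutput; // what the next step will send

const overflowDetected = checked >= usableContext; // false: check passes
```

The check compares 17k against 168k and passes, even though the next request will carry roughly 97k tokens plus whatever else has accumulated in the conversation.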

Reproduction Scenario

  • Start a session with Claude (200k context)
  • Ask a question that triggers the Task tool with an explore subagent
  • The subagent performs multiple searches/reads and returns ~80k tokens of results
  • The parent session's finish-step shows usage.tokens.input of only ~15k (the context before the tool output)
  • Compaction check: 15k < 168k (usable context) → passes → needsCompaction = false
  • The next step begins, sending the full conversation + the ~80k tool output

The result is one of:

  • Model receives truncated context (silent data loss)
  • API error for exceeding limits
  • Very slow processing as model struggles with near-limit context

Expected Behavior

Compaction should trigger before the context actually overflows, ideally:

  • after tool outputs are stored but before the next step begins
  • by estimating the upcoming context size, including tool outputs
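A minimal sketch of such an estimate (the chars/4 heuristic and the function names here are assumptions standing in for opencode's actual Token.estimate, not its implementation):

```typescript
// Sketch: estimate the upcoming context including pending tool outputs.
// chars/4 is a common rough heuristic, used here in place of Token.estimate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Would the next step overflow once pending tool outputs are counted?
function shouldCompact(
  reportedInput: number,
  reportedOutput: number,
  pendingToolOutputs: string[],
  usableContext: number,
): boolean {
  const pending = pendingToolOutputs.reduce(
    (sum, out) => sum + estimateTokens(out), 0);
  return reportedInput + reportedOutput + pending >= usableContext;
}

// An ~80k-token tool output on top of a 100k conversation crosses a 168k
// usable window, while the reported usage alone would not.
const bigOutput = "x".repeat(320_000); // ≈ 80k tokens under the heuristic
const trigger = shouldCompact(100_000, 2_000, [bigOutput], 168_000); // true
const withoutPending = shouldCompact(100_000, 2_000, [], 168_000);   // false
```

The point is only that the estimate must be made from the stored tool outputs before the next request is assembled, not from the provider's usage report for the previous step.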

Suggested Fixes (by Opus)

Option A: Estimate context after tool completion

After tool-result processing (line ~193), estimate the new context size:

case "tool-result": {
  // ... existing code ...

  // After storing the tool output, check whether the next step would overflow.
  // Note: `usage` from the last finish-step would need to be carried into this
  // scope; it is not currently available in the tool-result case.
  const outputTokens = Token.estimate(value.output.output)
  const estimatedNext = usage.tokens.input + usage.tokens.output + outputTokens
  if (await SessionCompaction.isOverflow({
    tokens: { ...usage.tokens, input: estimatedNext },
    model: input.model
  })) {
    needsCompaction = true
  }
  break
}

Option B: Check overflow before continuing tool loop

In the main loop (around line 337), before continuing to the next step:

if (needsCompaction) break

// Also check estimated context for pending tool outputs
const pendingOutputTokens = Object.values(toolcalls)
  .filter(t => t.state.status === "completed")
  .reduce((sum, t) => sum + Token.estimate(t.state.output || ""), 0)

if (pendingOutputTokens > 0) {
  const estimatedTokens = {
    ...input.assistantMessage.tokens,
    input: input.assistantMessage.tokens.input + pendingOutputTokens
  }
  if (await SessionCompaction.isOverflow({ tokens: estimatedTokens, model: input.model })) {
    needsCompaction = true
    break
  }
}

Option C: Proactive check in prompt.ts

The check in prompt.ts:502 runs between conversation turns but not during tool loops within a turn. Could add a similar check before each processor.process() call.

Additional Context

  • The existing prune() function (compaction.ts:49) helps by removing old tool outputs, but it only runs after compaction is triggered
  • The check in prompt.ts:502 only runs at the start of new conversation turns, not during multi-step tool execution within a single turn
  • Models with smaller context windows (Claude Haiku 200k vs Sonnet 200k) may hit this more frequently
  • Subagents using the explore type are particularly prone to this, since they perform many read/search operations

Environment

OpenCode version: Latest main
Observed with: Opus, Codex
Typical trigger: Task tool with explore subagent doing codebase searches

Impact

  • High for users with complex multi-step workflows
  • Sessions can become unresponsive or lose context unexpectedly
  • No warning before overflow occurs

Plugins

No response

OpenCode version

No response

Steps to reproduce

No response

Screenshot and/or share link

No response

Operating System

Ubuntu

Terminal

No response

Labels

bug (Something isn't working) · perf (Indicates a performance issue or need for optimization)
