Environment
- OpenClaw version: 2026.4.15 (also reproduced on 2026.4.11)
- Platform: macOS (Mac Studio, native install)
- Auth mode: Anthropic OAuth (mode: "token")
- Main agent model: anthropic/claude-opus-4-6
- Subagent / heartbeat model: anthropic/claude-sonnet-4-5
- Compaction model: anthropic/claude-sonnet-4-5
- Interface: Discord (channels/threads as sessions)
Relevant config
agents.defaults.contextTokens: 150000 (also tested at 170000, 200000)
compaction.mode: "default"
compaction.model: "anthropic/claude-sonnet-4-5"
compaction.reserveTokens: 50000
compaction.reserveTokensFloor: 30000
compaction.memoryFlush.softThresholdTokens: 15000
contextPruning.mode: "cache-ttl" (aggressive, 30min TTL on tool results)
Summary
The preemptive auto-compaction check (shouldPreemptivelyCompactBeforePrompt) exhibited two issues with main-agent Opus sessions:
- At the default 200K budget, it never triggered preemptive compaction. Opus sessions grew to 167K+ tokens with zero compactions, requiring manual intervention or hitting actual context overflow (timeout-compaction).
- After lowering
contextTokens to 150K, preemptive compaction eventually fired — confirming the mechanism works but the threshold interaction with cache-ttl pruning and large bootstrap loads creates a window where the estimate stays artificially low.
A critical observability gap compounds the issue: shouldPreemptivelyCompactBeforePrompt emits no log when route === "fits", making it impossible to distinguish "precheck ran and decided it fits" from "precheck never ran."
Observed behavior
- Opus main-agent session climbed to 167K+ tokens (budget 200K) with zero compactions. Manual compaction required.
- Log audit of
gateway.log (full history): prior to the fix, every auto-compaction event was for Sonnet. Every Opus compaction event was timeout-compaction (overflow recovery).
- On Apr 13, an Opus session hit context overflow and threw 15 consecutive "prompt too large" errors in rapid succession without recovery.
- After lowering
agents.defaults.contextTokens to 150K and starting a fresh session: auto-compaction eventually fired at ~10:33 EDT on Apr 18, confirmed in gateway.log: auto-compaction succeeded for anthropic/claude-opus-4-6; retrying prompt
- The session
/status initially showed "Compactions: 1" which was a carried-forward artifact from a prior session summary, not a new event. Real compaction count reached 2 after the auto-compaction fired.
Expected behavior
Preemptive compaction check should fire reliably regardless of model, and the decision path should be observable in logs.
Source trace / hypothesis
Traced the call site in pi-embedded-runner:
- Preemptive check invoked at approx line 6654/6655
- Default context engine is
LegacyContextEngine (no ownsCompaction property) → bypass does NOT apply
contextTokenBudget defaults to 2e5 (200K) if not explicitly passed
shouldPreemptivelyCompactBeforePrompt uses estimatePrePromptTokens (character-based: estimatedChars / 4) compared against contextTokenBudget - reserveTokens
Leading hypothesis: contextPruning.mode: "cache-ttl" aggressively removes tool-result content from messages before estimatePrePromptTokens runs. The character-based estimate sees the pruned message set and stays well below the budget threshold, so the preemptive check returns route: "fits" indefinitely at the default 200K budget. At the reduced 150K budget, the margin is tight enough that the estimate eventually catches up.
This explains why Sonnet sessions compact correctly (different token accumulation pattern, more frequent tool calls, hit threshold between prune cycles) while Opus sessions at 200K do not (larger headroom, pruning consistently masks the estimate).
Pruning-disabled test pending to confirm this hypothesis.
Observability gap
The [context-overflow-precheck] log string is emitted ONLY when shouldCompact === true or when truncation routes fire. There is no log when route === "fits" — meaning the decision to NOT compact is completely silent. This made the bug extremely difficult to diagnose.
Request: Add a debug-level log inside shouldPreemptivelyCompactBeforePrompt (or at the call site) showing estimatedPromptTokens, promptBudgetBeforeReserve, overflowTokens, and route on every evaluation. This would make token budget issues trivially diagnosable.
Additional observations
Minimal reproduction
- Main agent model set to
anthropic/claude-opus-4-6
contextPruning.mode: "cache-ttl" enabled (30min TTL)
- Large bootstrap files (AGENTS.md, MEMORY.md etc.) totaling ~70K tokens at session start
contextTokens at default (200K) or high value
- Run a long session with periodic tool use; let context accumulate
- Observe: no
auto-compaction log entry at default budget; session requires manual compaction or hits overflow
Willing to provide
- Config (redacted)
gateway.log excerpts showing Sonnet auto-compactions and Opus timeout-compactions
- Source-trace notes
Environment
Relevant config
Summary
The preemptive auto-compaction check (
shouldPreemptivelyCompactBeforePrompt) exhibited two issues with main-agent Opus sessions:contextTokensto 150K, preemptive compaction eventually fired — confirming the mechanism works but the threshold interaction with cache-ttl pruning and large bootstrap loads creates a window where the estimate stays artificially low.A critical observability gap compounds the issue:
shouldPreemptivelyCompactBeforePromptemits no log whenroute === "fits", making it impossible to distinguish "precheck ran and decided it fits" from "precheck never ran."Observed behavior
gateway.log(full history): prior to the fix, everyauto-compactionevent was for Sonnet. Every Opus compaction event wastimeout-compaction(overflow recovery).agents.defaults.contextTokensto 150K and starting a fresh session: auto-compaction eventually fired at ~10:33 EDT on Apr 18, confirmed in gateway.log:auto-compaction succeeded for anthropic/claude-opus-4-6; retrying prompt/statusinitially showed "Compactions: 1" which was a carried-forward artifact from a prior session summary, not a new event. Real compaction count reached 2 after the auto-compaction fired.Expected behavior
Preemptive compaction check should fire reliably regardless of model, and the decision path should be observable in logs.
Source trace / hypothesis
Traced the call site in
pi-embedded-runner:LegacyContextEngine(noownsCompactionproperty) → bypass does NOT applycontextTokenBudgetdefaults to2e5(200K) if not explicitly passedshouldPreemptivelyCompactBeforePromptusesestimatePrePromptTokens(character-based:estimatedChars / 4) compared againstcontextTokenBudget - reserveTokensLeading hypothesis:
contextPruning.mode: "cache-ttl"aggressively removes tool-result content from messages beforeestimatePrePromptTokensruns. The character-based estimate sees the pruned message set and stays well below the budget threshold, so the preemptive check returnsroute: "fits"indefinitely at the default 200K budget. At the reduced 150K budget, the margin is tight enough that the estimate eventually catches up.This explains why Sonnet sessions compact correctly (different token accumulation pattern, more frequent tool calls, hit threshold between prune cycles) while Opus sessions at 200K do not (larger headroom, pruning consistently masks the estimate).
Pruning-disabled test pending to confirm this hypothesis.
Observability gap
The
[context-overflow-precheck]log string is emitted ONLY whenshouldCompact === trueor when truncation routes fire. There is no log whenroute === "fits"— meaning the decision to NOT compact is completely silent. This made the bug extremely difficult to diagnose.Request: Add a debug-level log inside
shouldPreemptivelyCompactBeforePrompt(or at the call site) showingestimatedPromptTokens,promptBudgetBeforeReserve,overflowTokens, androuteon every evaluation. This would make token budget issues trivially diagnosable.Additional observations
contextTokensmid-session have no effect; a fresh session is required..resetfiles appearing at ~3-4 AM daily for large sessions.Minimal reproduction
anthropic/claude-opus-4-6contextPruning.mode: "cache-ttl"enabled (30min TTL)contextTokensat default (200K) or high valueauto-compactionlog entry at default budget; session requires manual compaction or hits overflowWilling to provide
gateway.logexcerpts showing Sonnet auto-compactions and Opus timeout-compactions