Auto-compaction preemptive check: no debug logging when route='fits'; cache-ttl pruning may mask estimate for large-context models

## Environment

- **OpenClaw version:** 2026.4.15 (also reproduced on 2026.4.11)
- **Platform:** macOS (Mac Studio, native install)
- **Auth mode:** Anthropic OAuth (mode: "token")
- **Main agent model:** anthropic/claude-opus-4-6
- **Subagent / heartbeat model:** anthropic/claude-sonnet-4-5
- **Compaction model:** anthropic/claude-sonnet-4-5
- **Interface:** Discord (channels/threads as sessions)

## Relevant config

```
agents.defaults.contextTokens: 150000   (also tested at 170000, 200000)
compaction.mode: "default"
compaction.model: "anthropic/claude-sonnet-4-5"
compaction.reserveTokens: 50000
compaction.reserveTokensFloor: 30000
compaction.memoryFlush.softThresholdTokens: 15000
contextPruning.mode: "cache-ttl"   (aggressive, 30min TTL on tool results)
```

## Summary

The preemptive auto-compaction check (`shouldPreemptivelyCompactBeforePrompt`) exhibited two issues with main-agent Opus sessions:

1. **At the default 200K budget**, it never triggered preemptive compaction. Opus sessions grew to 167K+ tokens with zero compactions, requiring manual intervention or hitting actual context overflow (timeout-compaction).
2. **After lowering `contextTokens` to 150K**, preemptive compaction eventually fired — confirming the mechanism works but the threshold interaction with cache-ttl pruning and large bootstrap loads creates a window where the estimate stays artificially low.

A critical observability gap compounds the issue: `shouldPreemptivelyCompactBeforePrompt` emits **no log when `route === "fits"`**, making it impossible to distinguish "precheck ran and decided it fits" from "precheck never ran."

## Observed behavior

- Opus main-agent session climbed to 167K+ tokens (budget 200K) with zero compactions. Manual compaction required.
- Log audit of `gateway.log` (full history): prior to the fix, every `auto-compaction` event was for Sonnet. Every Opus compaction event was `timeout-compaction` (overflow recovery).
- On Apr 13, an Opus session hit context overflow and threw 15 consecutive "prompt too large" errors in rapid succession without recovery.
- After lowering `agents.defaults.contextTokens` to 150K and starting a fresh session: auto-compaction eventually fired at ~10:33 EDT on Apr 18, confirmed in gateway.log: `auto-compaction succeeded for anthropic/claude-opus-4-6; retrying prompt`
- The session `/status` initially showed "Compactions: 1" which was a carried-forward artifact from a prior session summary, not a new event. Real compaction count reached 2 after the auto-compaction fired.

## Expected behavior

Preemptive compaction check should fire reliably regardless of model, and the decision path should be observable in logs.

## Source trace / hypothesis

Traced the call site in `pi-embedded-runner`:

- Preemptive check invoked at approx line 6654/6655
- Default context engine is `LegacyContextEngine` (no `ownsCompaction` property) → bypass does NOT apply
- `contextTokenBudget` defaults to `2e5` (200K) if not explicitly passed
- `shouldPreemptivelyCompactBeforePrompt` uses `estimatePrePromptTokens` (character-based: `estimatedChars / 4`) compared against `contextTokenBudget - reserveTokens`

**Leading hypothesis:** `contextPruning.mode: "cache-ttl"` aggressively removes tool-result content from messages before `estimatePrePromptTokens` runs. The character-based estimate sees the pruned message set and stays well below the budget threshold, so the preemptive check returns `route: "fits"` indefinitely at the default 200K budget. At the reduced 150K budget, the margin is tight enough that the estimate eventually catches up.

This explains why Sonnet sessions compact correctly (different token accumulation pattern, more frequent tool calls, hit threshold between prune cycles) while Opus sessions at 200K do not (larger headroom, pruning consistently masks the estimate).

**Pruning-disabled test pending** to confirm this hypothesis.

## Observability gap

The `[context-overflow-precheck]` log string is emitted ONLY when `shouldCompact === true` or when truncation routes fire. **There is no log when `route === "fits"`** — meaning the decision to NOT compact is completely silent. This made the bug extremely difficult to diagnose.

**Request:** Add a debug-level log inside `shouldPreemptivelyCompactBeforePrompt` (or at the call site) showing `estimatedPromptTokens`, `promptBudgetBeforeReserve`, `overflowTokens`, and `route` on every evaluation. This would make token budget issues trivially diagnosable.

## Additional observations

- Session budget is resolved once at session start and cached for the session lifetime. Config changes to `contextTokens` mid-session have no effect; a fresh session is required.
- Possibly related: #56072 (daily session reset archiving without memory flush). Our install exhibits `.reset` files appearing at ~3-4 AM daily for large sessions.

## Minimal reproduction

1. Main agent model set to `anthropic/claude-opus-4-6`
2. `contextPruning.mode: "cache-ttl"` enabled (30min TTL)
3. Large bootstrap files (AGENTS.md, MEMORY.md etc.) totaling ~70K tokens at session start
4. `contextTokens` at default (200K) or high value
5. Run a long session with periodic tool use; let context accumulate
6. Observe: no `auto-compaction` log entry at default budget; session requires manual compaction or hits overflow

## Willing to provide

- Config (redacted)
- `gateway.log` excerpts showing Sonnet auto-compactions and Opus timeout-compactions
- Source-trace notes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Auto-compaction preemptive check: no debug logging when route='fits'; cache-ttl pruning may mask estimate for large-context models #68609

Environment

Relevant config

Summary

Observed behavior

Expected behavior

Source trace / hypothesis

Observability gap

Additional observations

Minimal reproduction

Willing to provide

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Auto-compaction preemptive check: no debug logging when route='fits'; cache-ttl pruning may mask estimate for large-context models #68609

Description

Environment

Relevant config

Summary

Observed behavior

Expected behavior

Source trace / hypothesis

Observability gap

Additional observations

Minimal reproduction

Willing to provide

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions