[Bug]: Context overflow recovery skips tool-result truncation when multiple medium-sized results collectively overflow

## Bug

Context overflow recovery has a blind spot: `sessionLikelyHasOversizedToolResults()` compares each tool result **individually** against a per-result threshold (30% of the context window in tokens × 4 chars per token, capped at 400,000 chars). Several medium-sized results that overflow the context only in aggregate are therefore not detected. Recovery then skips truncation and proceeds directly to a session reset, destroying all conversation history.

## Version

`2026.2.12`

## Steps to Reproduce

1. Use a 200K context model (e.g., Claude Opus)
2. Build up context to ~77% capacity (in our case via `load-topic` loading 142 daily notes)
3. In a single assistant turn, read 3 email threads via parallel tool calls:
   - Thread A: 38,662 chars
   - Thread B: 155,016 chars  
   - Thread C: 136,693 chars
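4. Combined: 330,371 chars (~82K tokens at the 4 chars/token estimate) added in one step
5. Context jumps from 154,535 → 390,607 tokens (nearly 2× the 200K limit). The observed increase of ~236K tokens is well above the ~82K estimate, so the 4 chars/token heuristic appears to undercount this content.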

## Expected Behavior

Recovery should truncate the oversized tool results and retry:
1. ✅ Auto-compaction is attempted and fails (the prefix to summarize is itself 238K tokens > 200K)
2. Tool result truncation detects that the **aggregate** tool results are too large (❌ currently it does not)
3. The 3 large results are truncated and the request is retried
4. The session is reset only as an absolute last resort

## Actual Behavior

1. ✅ Auto-compaction attempted → fails (238K prefix > 200K)
2. ❌ `sessionLikelyHasOversizedToolResults()` returns `false` because no **individual** result exceeds the threshold
3. ❌ Truncation skipped entirely
4. 💥 Session reset — all history destroyed

## Root Cause Analysis

```javascript
// pi-embedded-DxwVpEx9.js
const MAX_TOOL_RESULT_CONTEXT_SHARE = 0.3;
const HARD_MAX_TOOL_RESULT_CHARS = 400000;

function calculateMaxToolResultChars(contextWindowTokens) {
    return Math.min(
        Math.floor(contextWindowTokens * 0.3) * 4,  // = 240,000 for 200K window
        400000
    );
}

function sessionLikelyHasOversizedToolResults({ messages, contextWindowTokens }) {
    const maxChars = calculateMaxToolResultChars(contextWindowTokens);
    for (const msg of messages) {
        if (msg.role !== "toolResult") continue;
        if (getToolResultTextLength(msg) > maxChars) return true;  // ← individual check only
    }
    return false;  // ← returns false even if aggregate far exceeds context
}
```

For a 200K context window, the threshold is **240,000 chars per result**. Our three results (39K, 155K, 137K chars) are each under this threshold, but together they add ~330K chars (~82K tokens by the 4 chars/token estimate) to a context that was already at ~77% of capacity.
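The miss can be reproduced in isolation. The following is a minimal sketch that reuses the two functions from the excerpt above and feeds them synthetic messages with the reported sizes. `getToolResultTextLength` here is a stand-in written for this example, and the `{ role, text }` message shape is an assumption; the real helper and message structure are not shown in this issue.

```javascript
const MAX_TOOL_RESULT_CONTEXT_SHARE = 0.3;
const HARD_MAX_TOOL_RESULT_CHARS = 400000;

function calculateMaxToolResultChars(contextWindowTokens) {
    return Math.min(
        Math.floor(contextWindowTokens * MAX_TOOL_RESULT_CONTEXT_SHARE) * 4,
        HARD_MAX_TOOL_RESULT_CHARS
    );
}

// Stand-in for the real helper (assumption): treats msg.text as the payload.
function getToolResultTextLength(msg) {
    return msg.text.length;
}

function sessionLikelyHasOversizedToolResults({ messages, contextWindowTokens }) {
    const maxChars = calculateMaxToolResultChars(contextWindowTokens);
    for (const msg of messages) {
        if (msg.role !== "toolResult") continue;
        if (getToolResultTextLength(msg) > maxChars) return true;
    }
    return false;
}

// Synthetic tool results with the sizes reported for threads A, B and C.
const threadSizes = [38662, 155016, 136693];
const messages = threadSizes.map((n) => ({ role: "toolResult", text: "x".repeat(n) }));

const maxChars = calculateMaxToolResultChars(200000);
const totalChars = threadSizes.reduce((sum, n) => sum + n, 0);
const flagged = sessionLikelyHasOversizedToolResults({ messages, contextWindowTokens: 200000 });

console.log({ maxChars, totalChars, flagged });
// → { maxChars: 240000, totalChars: 330371, flagged: false }
```

The aggregate (330,371 chars) exceeds the per-result threshold (240,000 chars), yet the function reports nothing oversized because it never sums the results.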

## Gateway Logs (evidence)

```
11:16:48.205Z [context-overflow-diag] 390607 tokens > 200000 max, compactionAttempts=0
11:16:48.207Z context overflow detected (attempt 1/3); attempting auto-compaction
11:16:48.834Z auto-compaction failed: prefix summarization 238273 tokens > 200000 max
11:16:48.861Z Restarting session → new session ID
```

The logs contain no `[context-overflow-recovery] Attempting tool result truncation` line, which confirms that the truncation path was never entered.

## Suggested Fix

`sessionLikelyHasOversizedToolResults()` should also check **aggregate** tool result size:

```javascript
function sessionLikelyHasOversizedToolResults({ messages, contextWindowTokens }) {
    const maxChars = calculateMaxToolResultChars(contextWindowTokens);
    let totalToolResultChars = 0;
    for (const msg of messages) {
        if (msg.role !== "toolResult") continue;
        const len = getToolResultTextLength(msg);
        if (len > maxChars) return true;  // individual check
        totalToolResultChars += len;
    }
    // Also flag if aggregate tool results exceed context budget
    const aggregateMax = contextWindowTokens * 4;  // full context in chars
    return totalToolResultChars > aggregateMax * 0.5;  // >50% of context is tool results
}
```
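One caveat on the 50% figure, worked through as a quick check: for a 200K window the aggregate limit comes out at 400,000 chars, which the three reported threads do not reach on their own. The sketch below only redoes that arithmetic; `aggregateToolResultLimit` and its `aggregateShare` parameter are hypothetical names introduced for illustration.

```javascript
// Limit implied by the suggested fix: tool results may occupy at most
// `aggregateShare` of the context window, estimated at 4 chars per token.
// `aggregateShare` is a hypothetical parameter; the fix hard-codes 0.5.
function aggregateToolResultLimit(contextWindowTokens, aggregateShare = 0.5) {
    return Math.floor(contextWindowTokens * 4 * aggregateShare);
}

const reportedThreadChars = [38662, 155016, 136693];
const reportedTotal = reportedThreadChars.reduce((sum, n) => sum + n, 0);
const limit = aggregateToolResultLimit(200000);

// Chars still missing before the aggregate check would fire.
const shortfall = limit - reportedTotal;

console.log({ reportedTotal, limit, shortfall });
// → { reportedTotal: 330371, limit: 400000, shortfall: 69629 }
```

The check would therefore fire in this scenario only if other tool results in the session add more than 69,629 chars. That may well be the case if the notes loaded by `load-topic` are stored as tool results, but this report does not confirm it, so the share may need to be lower or tied to the observed overflow.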

Additionally, the truncation logic in `truncateOversizedToolResultsInSession` should be able to truncate the **largest** tool results even if none individually cross the threshold, when the aggregate is causing overflow.
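One possible shape for that aggregate truncation, as a sketch only: pick a single per-result cap such that the capped lengths fit a character budget, so only the results above the cap are cut and smaller ones are left intact. `planAggregateTruncation` is a hypothetical helper, not an existing OpenClaw function, and the 200,000-char budget in the usage lines is an arbitrary illustrative value.

```javascript
// Hypothetical helper: returns the target length for each tool result so that
// the targets sum to at most budgetChars. Results at or below the computed
// cap keep their full length; larger results are all cut down to the cap.
function planAggregateTruncation(lengths, budgetChars) {
    const sorted = [...lengths].sort((a, b) => a - b);
    let remaining = budgetChars;
    let cap = Infinity; // stays Infinity when everything already fits

    for (let i = 0; i < sorted.length; i++) {
        // Even split of the remaining budget among results not yet placed.
        const share = Math.floor(remaining / (sorted.length - i));
        if (sorted[i] > share) {
            cap = share; // this result and all larger ones get the cap
            break;
        }
        remaining -= sorted[i];
    }
    return lengths.map((len) => Math.min(len, cap));
}

// Usage with the three reported thread sizes and an illustrative budget.
const plan = planAggregateTruncation([38662, 155016, 136693], 200000);
console.log(plan);
// → [ 38662, 80669, 80669 ]
```

Here the smallest thread is kept whole while the two larger ones are cut to the same size, which matches the intent of truncating the largest results first when no single result crosses the per-result threshold.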

## Related Issues

- #394 — Original request for 413 handling (basic UX fix only, no auto-recovery)
- #9140 — Unbounded tool outputs causing overflow (fixed by #11664, but fix only covers individual results)
- #13097 — Context overflow warning before hitting limit
- #14818 — Umbrella discussion on overflow prevention
- #10099 — Compaction safeguard losing messages on summarization failure

## Environment

- OpenClaw 2026.2.12
- Model: anthropic/claude-opus-4-6 (200K context)
- Compaction mode: safeguard
- Session pruning: cache-ttl (5min)
- PR #11664 (tool-result truncation fix) was included but did not cover this scenario
