# Bug: Sudden Context Overflow from Large Tool Results (no preflight guard)

## Summary

OpenClaw can exceed the model's context window in a single step when a tool returns a very large payload.
After that, requests fail repeatedly with:

- `prompt is too long: ... > 200000`
- `input length and max_tokens exceed context limit: ... + 32000 > 200000`

In several cases, no automatic compaction happened before the next request, so sessions got stuck in a failure loop.

## Environment

- OpenClaw: `2026.1.30`
- Models: `anthropic/claude-sonnet-4-20250514` and `anthropic/claude-opus-4-5`
- Context window: `200000`

## What happened

Single large `toolResult` messages were appended directly to the session transcript and immediately pushed the next prompt over the context limit.

### Concrete examples (from local transcripts)

1. `gateway config.schema` produced a huge payload:
   - transcript: `sessions/d3a0a84a-ab0f-4597-9de4-8353a67acfec.jsonl`
   - tool call line: `1402` (`gateway {"action":"config.schema"}`)
   - tool result line: `1403` (~560k chars JSON line)
   - immediate error: line `1404` (`prompt is too long: 205607 tokens > 200000`)

2. Grepping session logs with `tail ... ~/.openclaw/agents/main/sessions/*.jsonl` produced huge outputs:
   - transcript: `sessions/d3a0a84a-ab0f-4597-9de4-8353a67acfec.jsonl`
   - tool calls: lines `1857`, `1869`
   - tool results: lines `1858` (~400k chars), `1870` (~409k chars)
   - immediate failures: lines `1859`, `1871`, `1874`, `1877`

3. Same pattern in another session:
   - transcript: `sessions/276d664b-06c9-4a1a-80ad-562972d25beb.jsonl`
   - huge tool results on lines `32` and `34` (~416k chars each)
   - repeated context-limit errors after that

4. Repeated failures at high context without successful recovery:
   - transcript: `sessions/7bff962a-ac31-44f3-9c17-300507453484.jsonl`
   - repeated errors on lines `955`, `958`, `961`, `964`, `967`, `970`
   - message: `input length and max_tokens exceed context limit: 173629 + 32000 > 200000`
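The numbers in the error from example 4 show how narrowly the request missed the limit. The short calculation below uses only the values quoted in that message; the variable names are illustrative.

```python
# Values copied from the error message in example 4.
CONTEXT_LIMIT = 200_000
input_tokens = 173_629
requested_max_tokens = 32_000

# Total budget the request asked for, and how far over the limit it was.
total = input_tokens + requested_max_tokens
overshoot = total - CONTEXT_LIMIT

# Largest max_tokens that would still have fit alongside this input.
largest_fitting_max_tokens = CONTEXT_LIMIT - input_tokens

print(total, overshoot, largest_fitting_max_tokens)
# prints: 205629 5629 26371
```

In other words, the input alone still fit in the window; the request failed only because the fixed `max_tokens` of 32000 left no room. Lowering `max_tokens` to 26371 or less would have satisfied the limit stated in the error.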

## Expected behavior

Before every model request, OpenClaw should guarantee:

- `estimated_input_tokens + requested_max_tokens <= model_context_limit`

If this check fails, OpenClaw should automatically do one or more of the following:

1. trigger compaction as a preflight step,
2. reduce `max_tokens` dynamically,
3. truncate or summarize the newest oversized tool results,
4. retry automatically with a safe budget.

The session should not enter a repeated hard-failure loop.
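A minimal sketch of such a preflight decision follows, assuming an input-token estimate is already available. The names `plan_request`, `Plan`, `min_output_tokens`, and `safety_margin` are hypothetical and are not part of OpenClaw; the default values are placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    action: str      # "send", "reduce_max_tokens", or "compact"
    max_tokens: int  # output budget to use if the request is sent


def plan_request(
    estimated_input_tokens: int,
    requested_max_tokens: int,
    context_limit: int,
    min_output_tokens: int = 4_000,  # hypothetical floor for a useful reply
    safety_margin: int = 2_000,      # hypothetical slack for estimator error
) -> Plan:
    """Decide what to do before a model call so the budget invariant holds."""
    # Tokens left for output once the input and the safety margin are reserved.
    headroom = context_limit - safety_margin - estimated_input_tokens

    if requested_max_tokens <= headroom:
        # Invariant already holds: send the request unchanged.
        return Plan("send", requested_max_tokens)
    if headroom >= min_output_tokens:
        # Input fits; shrink the output budget instead of failing.
        return Plan("reduce_max_tokens", headroom)
    # Not enough room even for a small reply: the transcript must shrink first.
    return Plan("compact", requested_max_tokens)


# Example 4 from this report: input fits, so max_tokens is reduced.
print(plan_request(173_629, 32_000, 200_000))
# Example 1 from this report: input alone exceeds the limit, so compact.
print(plan_request(205_607, 32_000, 200_000))
```

The safety margin is there because a local token estimate will rarely match the provider's count exactly. After a `compact` result, the caller would compact or truncate and then run the same check again, rather than sending a request that is known to fail.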

## Suggested fixes

1. **Preflight context budget check** before each LLM call, based on the current transcript plus the planned `max_tokens`.
2. **Hard output cap on each tool result** inserted into the transcript (character and token budget).
3. **Auto-summarize oversized tool outputs** (store the full output externally, e.g. in a log file, and keep only a summary in context).
4. **Guardrails for high-risk commands** that can return giant transcript chunks (e.g. `tail ~/.openclaw/agents/main/sessions/*.jsonl`).
5. **Fallback retry policy**:
   - on an `input + max_tokens` error: reduce `max_tokens` and retry,
   - if still too large: force compaction and retry once.
6. **Optional warning mode** when any single `toolResult` exceeds a configured threshold (e.g. 10k/20k tokens).
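Fixes 2 and 3 could be combined roughly as sketched below: cap what goes into the transcript and keep the full output on disk. This is an illustration only. `cap_tool_result`, the `max_chars` default, and the spill directory are assumptions, not existing OpenClaw behavior, and a character cap is used here as a simple stand-in for a token budget.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


def cap_tool_result(
    text: str,
    max_chars: int = 40_000,           # hypothetical per-result cap
    spill_dir: Optional[Path] = None,  # hypothetical location for full outputs
) -> str:
    """Return text unchanged if small; otherwise keep head and tail only."""
    if max_chars < 2:
        raise ValueError("max_chars must be at least 2")
    if len(text) <= max_chars:
        return text

    # Save the full output outside the transcript so nothing is lost.
    if spill_dir is None:
        spill_dir = Path(tempfile.gettempdir()) / "tool-results"
    spill_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    spill_path = spill_dir / f"{digest}.txt"
    spill_path.write_text(text, encoding="utf-8")

    # Keep the first and last halves of the budget; drop the middle.
    half = max_chars // 2
    head, tail = text[:half], text[-half:]
    omitted = len(text) - len(head) - len(tail)
    notice = (
        f"\n[... {omitted} characters omitted; "
        f"full output saved to {spill_path} ...]\n"
    )
    # The result is max_chars plus the short notice, not exactly max_chars.
    return head + notice + tail


# A ~560k-character payload like the one in example 1 shrinks to ~40k.
capped = cap_tool_result("x" * 560_000)
print(len(capped) < 41_000)
```

Keeping both the head and the tail is a design choice: for logs and command output the end often carries the error, while the start carries the structure. A summarizer could replace the simple truncation without changing the call site.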

## Why this matters

This causes:

- session stalls,
- token usage spikes of 5x or more (from many failed retries),
- broken automation loops until manual intervention.


