# Bug Report

## Summary

Compaction summarization fails with a `max_tokens: 240000 > 128000` error when using Anthropic Claude models (Sonnet 4.6 / Opus 4.6) with 1M context windows enabled via `context1m: true`.
## Environment

- OpenClaw: 2026.3.23-2 (7ffe7e4)
- OS: macOS 15.3 (arm64)
- Model: `anthropic/claude-sonnet-4-6` (also affects `claude-opus-4-6`)
- Context config: `contextTokens: 1000000` with the `context1m: true` API header
## Steps to Reproduce

- Configure an agent with 1M context:

  ```json
  {
    "agents": {
      "defaults": {
        "contextTokens": 1000000,
        "compaction": {
          "model": "anthropic/claude-sonnet-4-6",
          "keepRecentTokens": 500000,
          "reserveTokensFloor": 300000,
          "maxHistoryShare": 0.75,
          "recentTurnsPreserve": 12
        }
      }
    }
  }
  ```

- Use the agent until context reaches ~200K+ tokens
- Trigger compaction via `/compact`
- Compaction fails with:

  ```
  Summarization failed: 400 {"type":"error","error":{"type":"invalid_request_error",
  "message":"max_tokens: 240000 > 128000, which is the maximum allowed number of
  output tokens for claude-sonnet-4-6"}}
  ```
## Root Cause Analysis

The compaction summarizer calculates an output token budget (`max_tokens`) that exceeds the Anthropic API per-request output cap (128K for both Sonnet 4.6 and Opus 4.6).

Key observations:

- The built-in model catalog in `provider-catalog-*.js` correctly registers `maxTokens: 128e3` for Anthropic Vertex models
- The `resolveNormalizedProviderModelMaxTokens()` function in `io-*.js` does `Math.min(rawMaxTokens, contextWindow)`, which should cap correctly
- However, the compaction code in `pi-embedded-*.js` appears to calculate its own output budget independently, requesting 240K tokens, which exceeds the model cap
- The 240K value does not change when adjusting `keepRecentTokens` (tested 500K → 200K; same 240K error)
- Both `claude-sonnet-4-6` and `claude-opus-4-6` have the same 128K per-request output limit, so switching `compaction.model` between them does not help
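The two code paths above can be sketched numerically. This is an illustrative reconstruction, not OpenClaw source: `resolveMaxTokens` mirrors the observed registry clamp, while `naiveOutputBudget` is a hypothetical stand-in for the independent budget calculation; the 24% share is purely an assumption chosen to reproduce the observed 240K figure.

```js
// Hypothetical sketch (not OpenClaw code): the registry-based path clamps
// correctly, while a budget derived from the context window alone overflows.

const modelEntry = { maxTokens: 128_000, contextWindow: 1_000_000 };

// Mirrors the observed Math.min(rawMaxTokens, contextWindow) resolution.
function resolveMaxTokens(entry) {
  return Math.min(entry.maxTokens, entry.contextWindow);
}

// Stand-in for an independent budget: a fixed share of the context window
// with no model output cap applied. 24% of 1M reproduces the 240K error.
function naiveOutputBudget(contextTokens, share = 0.24) {
  return Math.floor(contextTokens * share);
}

console.log(resolveMaxTokens(modelEntry)); // 128000 (accepted by the API)
console.log(naiveOutputBudget(1_000_000)); // 240000 (rejected: > 128000)
```

Whatever the real formula is, the symptom is consistent with the second path: a budget scaled to the 1M context window that never consults the model's output cap.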
## Expected Behavior

The compaction summarizer should cap `max_tokens` to the model's actual output limit (128K for Anthropic). If the summary would exceed this, it should either:

- Clamp `max_tokens` to the model's output ceiling, or
- Chunk the summarization into multiple passes that each fit within the output limit, or
- Use the model registry's `maxTokens` value when building the summarization API request
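The chunking option can be illustrated with a short sketch. This is a hypothetical planner, not OpenClaw code: it only shows how a desired output budget could be split into per-pass `max_tokens` values that each fit under the model cap.

```js
// Hypothetical sketch: split a desired output budget into passes that each
// fit within the model's per-request output cap.
function planSummaryPasses(desiredOutputTokens, modelOutputCap) {
  const passCount = Math.ceil(desiredOutputTokens / modelOutputCap);
  return Array.from({ length: passCount }, (_, i) =>
    Math.min(modelOutputCap, desiredOutputTokens - i * modelOutputCap)
  );
}

// A 240K budget against the 128K cap becomes two passes.
console.log(planSummaryPasses(240_000, 128_000)); // [ 128000, 112000 ]
```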
## Proposed Fix

In the compaction summarization path (`pi-embedded-*.js`), add a clamp:

```js
// Before sending the summarization request, cap to the model's output limit
const effectiveMaxTokens = Math.min(
  calculatedOutputBudget,
  modelEntry.maxTokens ?? 128_000 // fall back to a safe default
);
```

This is effectively a one-line fix that prevents the API rejection while preserving the existing summarization logic for models with higher output limits.
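For completeness, here is a self-contained version of the clamp. The names (`clampOutputBudget`, `modelEntry`) are placeholders for whatever the compaction path actually uses, not OpenClaw identifiers.

```js
// Hypothetical wrapper around the proposed clamp; not OpenClaw source.
function clampOutputBudget(calculatedOutputBudget, modelEntry) {
  // Fall back to 128K, the known Anthropic cap, when no registry entry exists.
  return Math.min(calculatedOutputBudget, modelEntry?.maxTokens ?? 128_000);
}

console.log(clampOutputBudget(240_000, { maxTokens: 128_000 })); // 128000
console.log(clampOutputBudget(60_000, { maxTokens: 128_000 }));  // 60000 (small budgets pass through)
console.log(clampOutputBudget(240_000, undefined));              // 128000 via the fallback
```

The optional chaining plus `??` fallback means the clamp stays safe even if the model registry lookup returns nothing.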
## Impact

This blocks compaction for all users running Anthropic Claude models with 1M context windows, a configuration introduced in OpenClaw 2026.3.22. The only workaround is to use `/reset` instead of `/compact`, but this loses session continuity.
## Workaround

None fully effective. `/reset` starts a fresh session. Adjusting `keepRecentTokens` does not change the 240K output request.
## Labels

bug, compaction, anthropic, context-window