
Compaction fails with 1M context: max_tokens 240000 > 128000 for Anthropic models #54383

@adzendo

Description


Bug Report

Summary

Compaction summarization fails with a max_tokens: 240000 > 128000 error when using Anthropic Claude models (Sonnet 4.6 / Opus 4.6) with 1M context windows enabled via context1m: true.

Environment

  • OpenClaw: 2026.3.23-2 (7ffe7e4)
  • OS: macOS 15.3 (arm64)
  • Model: anthropic/claude-sonnet-4-6 (also affects claude-opus-4-6)
  • Context config: contextTokens: 1000000 with context1m: true API header

Steps to Reproduce

  1. Configure an agent with 1M context:
    {
      "agents": {
        "defaults": {
          "contextTokens": 1000000,
          "compaction": {
            "model": "anthropic/claude-sonnet-4-6",
            "keepRecentTokens": 500000,
            "reserveTokensFloor": 300000,
            "maxHistoryShare": 0.75,
            "recentTurnsPreserve": 12
          }
        }
      }
    }
  2. Use the agent until context reaches ~200K+ tokens
  3. Trigger compaction via /compact
  4. Compaction fails with:
    Summarization failed: 400 {"type":"error","error":{"type":"invalid_request_error",
    "message":"max_tokens: 240000 > 128000, which is the maximum allowed number of 
    output tokens for claude-sonnet-4-6"}}
    

Root Cause Analysis

The compaction summarizer calculates an output token budget (max_tokens) that exceeds the Anthropic API per-request output cap (128K for both Sonnet 4.6 and Opus 4.6).

Key observations:

  • The built-in model catalog in provider-catalog-*.js correctly registers maxTokens: 128e3 for Anthropic Vertex models
  • The resolveNormalizedProviderModelMaxTokens() function in io-*.js does Math.min(rawMaxTokens, contextWindow) which should cap correctly
  • However, the compaction code in pi-embedded-*.js appears to calculate its own output budget independently, requesting 240K tokens which exceeds the model cap
  • The 240K value does not change when adjusting keepRecentTokens (tested 500K → 200K, same 240K error)
  • Both claude-sonnet-4-6 and claude-opus-4-6 have the same 128K per-request output limit, so switching compaction.model between them does not help
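
The mismatch described above can be illustrated with a small sketch. All names here are assumptions, and the 0.24 factor is chosen only to reproduce the observed 240K value; the actual budget formula in pi-embedded-*.js has not been confirmed:

```javascript
// Hypothetical reconstruction of the failure mode: a compaction output
// budget derived from the configured context window, sent as max_tokens
// without consulting the provider's per-request output cap.
const contextTokens = 1_000_000;  // context1m window from the config above
const modelOutputCap = 128_000;   // Anthropic per-request output limit

// Assumed budget rule; 0.24 is picked only to reproduce the observed 240K.
const calculatedOutputBudget = Math.floor(contextTokens * 0.24);

// The API rejects the request because the budget exceeds the model cap.
const exceedsCap = calculatedOutputBudget > modelOutputCap;
```

This also explains why tuning keepRecentTokens has no effect: if the budget is derived from contextTokens alone, the history-retention settings never enter the calculation.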

Expected Behavior

The compaction summarizer should cap max_tokens to the model's actual output limit (128K for these Anthropic models). If the calculated budget would exceed this limit, it should do one of the following:

  1. Clamp max_tokens to the model's output ceiling, or
  2. Chunk the summarization into multiple passes that each fit within the output limit, or
  3. Use the model registry's maxTokens value when building the summarization API request
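
For option 2, a minimal sketch of the chunking step (splitByTokenBudget and the message shape are hypothetical illustrations, not existing OpenClaw APIs):

```javascript
// Greedily split a message history into chunks whose token totals each fit
// within the model's output limit, so each summarization pass can stay
// under the per-request cap. Messages are assumed to carry a token count.
function splitByTokenBudget(messages, budget) {
  const chunks = [];
  let current = [];
  let used = 0;
  for (const msg of messages) {
    if (used + msg.tokens > budget && current.length > 0) {
      chunks.push(current);
      current = [];
      used = 0;
    }
    // A single message larger than the budget still forms its own chunk.
    current.push(msg);
    used += msg.tokens;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each chunk would then be summarized in its own request, with the partial summaries merged in a final pass.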

Proposed Fix

In the compaction summarization path (pi-embedded-*.js), add a clamp:

// Before sending the summarization request, cap to model's output limit
const effectiveMaxTokens = Math.min(
  calculatedOutputBudget,
  modelEntry.maxTokens ?? 128_000  // fallback to safe default
);

This is a one-line fix that prevents the API rejection while preserving the existing summarization logic for models with higher output limits.
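
As a self-contained version of that clamp (function and constant names are illustrative, not OpenClaw's actual identifiers; 128_000 is the cap reported in the error message):

```javascript
// Anthropic's per-request output ceiling for the affected models, used as a
// safe fallback when the model registry has no maxTokens entry.
const ANTHROPIC_OUTPUT_CAP = 128_000;

// Clamp the compaction output budget to the model's real output limit.
function clampOutputBudget(calculatedOutputBudget, modelMaxTokens) {
  return Math.min(calculatedOutputBudget, modelMaxTokens ?? ANTHROPIC_OUTPUT_CAP);
}
```

With this in place, the failing request's 240K budget would be sent as 128K, and models with higher output limits would keep their registry value.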

Impact

This blocks compaction for all users running Anthropic Claude models with 1M context windows — a configuration that was introduced in OpenClaw 2026.3.22. Workaround is to use /reset instead of /compact, but this loses session continuity.

Workaround

None fully effective. /reset starts a fresh session. Adjusting keepRecentTokens does not change the 240K output request.

Labels

bug, compaction, anthropic, context-window
