fix: clamp compaction max_tokens to model output limit #54392
jalehman merged 6 commits into openclaw:main
Conversation
Greptile Summary
This PR fixes a bug where compaction fails with a `max_tokens: 240000 > 128000` provider error. Changes: clamp `reserveTokens` in `src/agents/compaction.ts` to `Math.floor(modelMaxTokens / 0.8)` before calling `generateSummary()`, so the downstream `max_tokens` calculation never exceeds the model's output limit.
Confidence Score: 4/5
src/agents/compaction.ts (lines 240-241):

```ts
const modelMaxTokens = params.model.maxTokens ?? 128_000;
const clampedReserveTokens = Math.min(params.reserveTokens, Math.floor(modelMaxTokens / 0.8));
```
Missing test for the new clamping behaviour
The clamping logic is the sole change in this PR and is the direct fix for the reported bug, but no test validates that it works. Both `compaction.identifier-preservation.test.ts` and `compaction.retry.test.ts` already mock `generateSummary` and call into `summarizeInStages` via `summarizeWithFallback`, so the test infrastructure is already in place.
Per `CLAUDE.md`, narrowly-scoped changes should include narrowly-scoped tests that directly validate the touched behaviour.
A good place for this is `compaction.identifier-preservation.test.ts` — pass a model with `maxTokens: 128_000` and `reserveTokens: 300_000`, then assert that the third argument passed to the `mockGenerateSummary` spy is ≤ `Math.floor(128_000 / 0.8)` (160 000). This directly guards the invariant described in the PR and catches future regressions if the `0.8` multiplier inside `generateSummary` ever changes.
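A minimal sketch of such a test, assuming the existing suite exposes a `mockGenerateSummary` spy and passes the reserve-token budget as the third argument — the import path, entry point, and option shape below are illustrative, not the repo's exact API (the real tests reach `summarizeInStages` via `summarizeWithFallback`):

```ts
import { describe, expect, it, vi } from "vitest";

// Hypothetical mock wiring — the real test file already provides an equivalent spy.
const { mockGenerateSummary } = vi.hoisted(() => ({
  mockGenerateSummary: vi.fn().mockResolvedValue("summary"),
}));
vi.mock("@mariozechner/pi-coding-agent", () => ({
  generateSummary: mockGenerateSummary,
}));

import { summarizeInStages } from "../compaction"; // path is illustrative

describe("compaction reserveTokens clamping", () => {
  it("clamps reserveTokens so max_tokens stays within the model output cap", async () => {
    // Model with a 128K output cap but a reserve floor sized for a 1M context window.
    await summarizeInStages({
      model: { maxTokens: 128_000 },
      reserveTokens: 300_000,
      messages: [],
    } as any); // argument shape simplified for the sketch

    // Third argument to generateSummary is the (possibly clamped) reserve-token budget.
    const reserveArg = mockGenerateSummary.mock.calls[0][2];
    expect(reserveArg).toBeLessThanOrEqual(Math.floor(128_000 / 0.8)); // 160_000
  });
});
```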
Added a dedicated test file. Four tests covering:
- clamping when reserveTokens (300K) exceeds the model output cap (128K)
- pass-through when reserveTokens is already within bounds
- fallback to the 128K default when the model has no maxTokens field
- consistent clamping across all chunks in staged summarization

All 4 new tests + 21 existing compaction tests pass, as do the pre-commit hooks. |
Point the stale contract-test mock at the current Codex runtime module and instantiate the provider directly for the one mock-sensitive fallback case. This keeps the fallback covered without changing production refresh behavior or the separate registry contract coverage.

Regeneration-Prompt: Fix MAR-231, the unrelated prep blocker on PR #54392. The failing surface was src/plugins/contracts/runtime.contract.test.ts for the OpenAI Codex refresh fallback case. Keep production behavior unchanged. The contract loader uses Jiti for bundled providers, which bypasses the Vitest mock of the runtime wrapper and caused the test to hit the real OAuth refresh path. Update the stale runtime mock path and make only that fallback assertion use the direct provider module, while leaving registry coverage in src/plugins/contracts/registry.contract.test.ts intact. Validate with the focused Codex runtime contract test and the registry contract test.
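A rough illustration of the pattern described above — the module paths, class name, and method are placeholders, since the actual Codex runtime layout isn't shown in this PR:

```ts
import { describe, expect, it, vi } from "vitest";

// Point the mock at the module the provider actually imports today; the stale
// path no longer matched, so the Jiti-loaded provider hit the real OAuth refresh.
vi.mock("./codex-runtime", () => ({ // placeholder path
  refreshAccessToken: vi.fn().mockRejectedValue(new Error("refresh unavailable")),
}));

// For the one mock-sensitive case, import the provider module directly instead of
// going through the contract loader (which uses Jiti and bypasses Vitest mocks).
import { CodexProvider } from "./codex-provider"; // placeholder module and class

describe("codex refresh fallback (mock-sensitive case)", () => {
  it("falls back cleanly when the token refresh fails", async () => {
    const provider = new CodexProvider(/* minimal test config */);
    await expect(provider.ensureAuth()).resolves.toBeDefined(); // illustrative assertion
  });
});
```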
With 1M context windows, reserveTokensFloor can be 300K+. The generateSummary() function in pi-coding-agent calculates max_tokens as Math.floor(0.8 * reserveTokens), producing 240K — which exceeds Anthropic's per-request output cap of 128K for Sonnet/Opus 4.6. This fix clamps reserveTokens before passing to generateSummary(), ensuring the resulting max_tokens never exceeds the model's maxTokens. The clamp uses model.maxTokens from the provider registry (falls back to 128K if unset). This is forward-compatible — if future models raise their output cap, no code change is needed. Fixes openclaw#54383
Validates that summarizeChunks clamps reserveTokens to Math.floor(model.maxTokens / 0.8) to prevent max_tokens from exceeding the model's output limit. Covers:
- Clamping when reserveTokens (300K) exceeds model output cap (128K)
- Pass-through when reserveTokens is already within bounds
- Fallback to 128K default when model has no maxTokens field
- Consistent clamping across all chunks in staged summarization
Merged via squash.
Thanks @adzendo!
Summary
Fixes #54383 — compaction fails with `max_tokens: 240000 > 128000` when using Anthropic models with 1M context windows.

Root Cause

In `@mariozechner/pi-coding-agent`, `generateSummary()` derives the request's `max_tokens` from the caller-supplied reserve-token budget.
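Per the commit message and the linked issue, that calculation is effectively the following (the surrounding Pi code is paraphrased; only the `Math.floor(0.8 * reserveTokens)` formula is confirmed by this PR):

```ts
// Inside pi-coding-agent's generateSummary() — paraphrased sketch.
// reserveTokens is the budget OpenClaw's compaction passes in.
const maxTokens = Math.floor(0.8 * reserveTokens);
// With reserveTokens = 300_000 this yields 240_000, above Anthropic's 128K output cap.
```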
With `reserveTokensFloor: 300000` (appropriate for 1M context), this produces `max_tokens = 240000` — exceeding Anthropic's per-request output cap of 128K for both Sonnet 4.6 and Opus 4.6.

Fix

Clamp `reserveTokens` in `src/agents/compaction.ts` before passing it to `generateSummary()`.
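The clamp, as it appears in the two changed lines of the diff:

```ts
const modelMaxTokens = params.model.maxTokens ?? 128_000;
const clampedReserveTokens = Math.min(params.reserveTokens, Math.floor(modelMaxTokens / 0.8));
```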
This ensures the downstream `max_tokens` calculation (`0.8 * reserveTokens`) never exceeds the model's actual output limit. The fix uses `model.maxTokens` from the provider registry, so it's forward-compatible — if future models raise their output cap, no code change is needed.

Impact

Only configurations with a large reserve budget are affected (`reserveTokensFloor` > 160K, the point at which `0.8 * reserveTokens` crosses the 128K cap); smaller reserves pass through unchanged.

Real behavior proof
- Without the clamp, compaction requests `max_tokens: 240000`, exceeding the `128000` output-token limit.
- Harness setup: `anthropic/claude-sonnet-4-6`, `1000000` context tokens, `300000` configured reserve tokens, and a local Anthropic-compatible capture server that records real provider request bodies and rejects `max_tokens > 128000`.
- Proof commands:

```sh
pnpm tsx scripts/e2e/compaction-max-tokens-proof.mjs --repo ../.. --expect fail --label main-local --mode summarize --timeout-ms 45000 --output .artifacts/pr-54392/compaction-proof-main-local.json
pnpm tsx scripts/e2e/compaction-max-tokens-proof.mjs --repo . --expect pass --label pr-local --mode summarize --timeout-ms 45000 --output .artifacts/pr-54392/compaction-proof-pr-local.json
```

- This branch captured `max_tokens` as `[128000]` at `0aeb92a3a7c7`; the same harness against current main `2d97dcebb5fc` captured `[240000, 240000, 240000]` and the fake Anthropic endpoint rejected `240000 > 128000`.
- The real wrapper path (`src/agents/compaction.ts` -> `summarizeInStages` -> Pi `generateSummary` -> provider request) sent `max_tokens: 128000`, which stays within the model output limit. The clamp feeds Pi a `160000` reserve-token value so its downstream `Math.floor(0.8 * reserveTokens)` calculation cannot exceed `128000`.

Testing
Real behavior proof was run twice from fresh temporary OpenClaw state directories with a local Anthropic-compatible capture server. The harness drives the real OpenClaw wrapper path (`src/agents/compaction.ts` -> `summarizeInStages` -> Pi `generateSummary` -> provider request) using `anthropic/claude-sonnet-4-6`, `1000000` context tokens, `300000` configured reserve tokens, and a `128000` model output limit.

Temporary proof commands used locally:

```sh
pnpm tsx scripts/e2e/compaction-max-tokens-proof.mjs --repo ../.. --expect fail --label main-local --mode summarize --timeout-ms 45000 --output .artifacts/pr-54392/compaction-proof-main-local.json
pnpm tsx scripts/e2e/compaction-max-tokens-proof.mjs --repo . --expect pass --label pr-local --mode summarize --timeout-ms 45000 --output .artifacts/pr-54392/compaction-proof-pr-local.json
```

| Run | Captured `max_tokens` | Result |
| --- | --- | --- |
| main-local | `[240000, 240000, 240000]` | rejected: `240000 > 128000` |
| pr-local | `[128000]` | within the `128000` output limit |

This proves the regression at the provider request boundary: without the clamp, OpenClaw asks Anthropic for `240000` output tokens; with this PR, the downstream Pi calculation receives a clamped reserve-token value of `160000` and emits `128000` output tokens.