Skip to content

fix: cap effective session context window by known model limit#39859

Closed
xdanger wants to merge 11 commits into
openclaw:mainfrom
xdanger:fix/status-context-window-cap
Closed

fix: cap effective session context window by known model limit#39859
xdanger wants to merge 11 commits into
openclaw:mainfrom
xdanger:fix/status-context-window-cap

Conversation

@xdanger

@xdanger xdanger commented Mar 8, 2026

Copy link
Copy Markdown
Contributor

Fixes #39857

Summary

  • Problem: resolveContextTokensForModel() could return an override value even when OpenClaw already knew the selected model's smaller context window.
  • Why it matters: that inflated value could be persisted into session metadata and surfaced in /status, so OpenClaw would over-report the effective context window for the active model.
  • What changed: when both values are known, the effective window is now min(override, modelWindow); provider aliases are also normalized before reading configured model windows.
  • What did NOT change: if the model window is unknown, override behavior stays the same; this does not change provider-side request limits or widen any model window.

Problem

This was a correctness bug in the effective context-window resolver, not just a display nit.

Before this PR, the resolver could treat agents.defaults.contextTokens (or another context override) as authoritative even when the selected model's real window was already known and smaller.

That meant OpenClaw could persist and report a context window that the active model does not actually have.

Affected paths include places that read the resolved contextTokens, such as:

  • session store metadata
  • /status
  • other callers that rely on persisted/effective session context

Example

If a user sets:

  • agents.defaults.contextTokens: 1048000

and then uses a model whose known context window is smaller, OpenClaw could previously keep reporting 1048000 instead of the model's real effective limit.

The correct result is the smaller effective window.

Fix

The functional change is intentionally narrow:

  • if an override exists and the selected model window is also known, return the smaller value
  • if the model window is unknown, preserve the previous override behavior
  • when reading configured model windows from cfg.models.providers, normalize provider names first so aliases like bedrock still match configured entries like amazon-bedrock

Why this is safe

This is fail-safe and correctness-preserving:

  • it only removes overstatement
  • it never increases an effective context window
  • it does not change behavior when OpenClaw does not know the model's real window

Validation

  • bunx vitest run src/agents/context.test.ts src/agents/models-config.merge.test.ts
  • bunx vitest run src/plugins/loader.test.ts
  • pnpm check

Scope note

A few later commits in this PR are CI-only follow-ups caused by main moving during review. The product fix itself remains limited to effective context-window resolution and regression coverage for that bug.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Mar 8, 2026
@greptile-apps

greptile-apps Bot commented Mar 8, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes resolveContextTokensForModel so that a user-configured contextTokensOverride is capped to the model's actual known context window when that window is smaller, preventing /status and session metadata from over-reporting available context. The implementation resolves modelTokens first (from the context1m Anthropic flag, the inline config lookup, or the shared MODEL_CACHE), and then returns Math.min(override, modelTokens) when both values are available.

Key observations:

  • The production logic in context.ts is correct — the capping, fallback chain, and existing edge-cases (no provider info, override absent, Anthropic 1M flag) all behave as expected.
  • The second new test ("prefers lower override when it is already below the model context window") does not exercise the described scenario. No cfg is supplied, so modelTokens is always undefined and the Math.min branch is never reached; the test passes trivially. A cfg with the model's window set to a value larger than the override is needed to actually validate the uncapped path.

Confidence Score: 4/5

  • Safe to merge — the production logic is sound; only the second regression test needs a fix to actually prove what it claims.
  • The core implementation change is correct and well-reasoned. The single meaningful gap is that one of the two new regression tests doesn't truly exercise the "override already below model window" branch because no model window is configured, so the Math.min comparison is never evaluated. This leaves that code path without a genuine automated check, reducing confidence slightly.
  • src/agents/context.test.ts — the second new test case needs a cfg with the model's context window configured to actually validate the uncapped behaviour.

Last reviewed commit: cb318fa

Comment thread src/agents/context.test.ts
@xdanger xdanger changed the title fix: cap context override by model window fix: cap effective context window by known model limit Mar 8, 2026
@openclaw-barnacle openclaw-barnacle Bot added the app: macos App: macos label Mar 8, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 46f30c0d52

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/context.ts Outdated
@xdanger

xdanger commented Mar 8, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the current review feedback in f89ecbb82.

What changed:

  • updated src/agents/context.test.ts so the lower-override case now exercises the configured model-window branch instead of passing trivially
  • normalized provider matching in src/agents/context.ts before reading cfg.models.providers, so aliases like bedrock still cap against configured context windows
  • added a regression test covering that provider-alias path

Local validation run:

  • ./node_modules/.bin/vitest run src/agents/context.test.ts
  • ./node_modules/.bin/oxfmt --check src/agents/context.ts src/agents/context.test.ts

I also reran a local pnpm tsgo-equivalent check for these files; the only remaining local type errors are the pre-existing extensions/tlon missing @tloncorp/api issue, unrelated to this PR.

@openclaw-barnacle openclaw-barnacle Bot removed the app: macos App: macos label Mar 8, 2026
@xdanger xdanger force-pushed the fix/status-context-window-cap branch from e4c6ce8 to 2ddb4dd Compare March 8, 2026 19:48
@xdanger

xdanger commented Mar 8, 2026

Copy link
Copy Markdown
Contributor Author

Rebased onto current main (92726d986) and resolved the fixture conflicts from the rebase.

Addressed both review comments:

  • updated the lower-override test so it exercises the known-model-window cap path
  • normalized provider lookup before reading configured model windows, with alias coverage

Local verification passed:

  • bunx vitest run src/agents/context.test.ts src/agents/models-config.merge.test.ts
  • pnpm check

All review threads are resolved. Ready for /prepare-pr once the refreshed CI run is green.

@xdanger

xdanger commented Mar 8, 2026

Copy link
Copy Markdown
Contributor Author

CI is green again.

Follow-ups since the last note:

  • increased the timeout for the plugin loader legacy root-import smoke test, which was timing out under the full matrix while passing reliably in isolation
  • refreshed .secrets.baseline so the secrets job stops failing on line-number drift

Ready for /prepare-pr.

@xdanger

xdanger commented Mar 8, 2026

Copy link
Copy Markdown
Contributor Author

Synced with latest main, resolved the resulting workflow/baseline conflicts, and the refreshed CI matrix is now fully green on head 4f5c70385.

Ready for /prepare-pr.

@xdanger

xdanger commented Mar 8, 2026

Copy link
Copy Markdown
Contributor Author

/prepare-pr

@xdanger xdanger changed the title fix: cap effective context window by known model limit fix: cap effective session context window by known model limit Mar 9, 2026
@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@clawsweeper

clawsweeper Bot commented Apr 26, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR changes the agent context-token resolver and tests to cap a configured override by a known model window, and it extends one legacy plugin loader test timeout.

Reproducibility: yes. source-level. Current main still has override-first shared resolver, session-store, and status-summary paths, so a configured override above a known model window can still be persisted or reported before the smaller model limit is considered.

PR rating
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Summary: The bug signal is useful, but missing real behavior proof plus stale, incomplete, merge-conflicting code keeps the PR below merge-ready quality.

Rank-up moves:

  • Add redacted real behavior proof showing after-fix /status or session metadata no longer over-reports the context window.
  • Rebase or replace the branch against current main while preserving the current configured-provider lookup contract.
  • Cover the shared resolver, session persistence, and status summary paths in focused regression tests.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Needs real behavior proof before merge: The PR body and comments list tests/CI only; it still needs redacted after-fix proof such as terminal output, logs, a screenshot, or a recording showing /status or session metadata no longer over-reports the context window. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • The branch is merge-conflicting; a mechanical rebase could drop current safeguards for provider-level context caps, exact-before-alias lookup, slash-containing model IDs, and cache-loading behavior.
  • The claimed /status and session metadata fix is incomplete unless the separate session-store and status-summary override-first paths are capped or deduplicated.
  • Contributor real behavior proof is still absent; tests and CI do not show after-fix /status or session persistence in a real setup.

Maintainer options:

  1. Rebase Around Current Resolver Contracts (recommended)
    Refresh the branch so the clamp wraps current provider-aware lookup helpers and adds coverage for shared resolver, session persistence, and status summary behavior.
  2. Replace With A Fresh Narrow Fix
    If the stale branch is too costly to refresh, open a new narrow fix linked to Status/session context window can over-report the selected model's actual window #39857 and leave broader precedence work to Use minimum of agent config, model config, and model discovery for contextTokens/maxTokens #71136.
  3. Pause Resolver-Only Scope
    Maintainers could intentionally land only the shared resolver clamp, but that should be treated as partial and should not close the claimed session/status bug.

Next step before merge
This needs maintainer/contributor handling because the external proof gate is unmet and the conflicting branch needs a current-main rebase or replacement, which automation cannot prove from the contributor's setup.

Security
Cleared: No concrete security or supply-chain issue was found in the live PR file list, which is limited to TypeScript source/tests and a test timeout change.

Review findings

  • [P2] Reuse current provider-aware context lookup before clamping — src/agents/context.ts:272-284
  • [P2] Clamp the session and status paths too — src/agents/context.ts:288-289
Review details

Best possible solution:

Refresh or replace this with a current-main fix that reuses the existing provider-aware lookup, clamps only when a known model window exists, updates session/status duplicate paths, and includes focused tests plus redacted real behavior proof.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level. Current main still has override-first shared resolver, session-store, and status-summary paths, so a configured override above a known model window can still be persisted or reported before the smaller model limit is considered.

Is this the best way to solve the issue?

No. The min-cap direction is right, but this stale diff is not the best current solution until it is rebased around the current provider-aware resolver contracts and covers the session/status bypasses it claims to fix.

Label justifications:

  • P2: This is a normal-priority correctness fix for context-window reporting and session metadata with limited blast radius.
  • merge-risk: 🚨 compatibility: The stale diff can conflict with current provider/model lookup contracts and alter configured-model behavior during upgrade.
  • merge-risk: 🚨 session-state: The PR directly affects effective context-window state stored in session metadata and surfaced by status paths.
  • rating: 🧂 unranked krab: Current PR rating is 🧂 unranked krab because proof is 🧂 unranked krab, patch quality is 🧂 unranked krab, and The bug signal is useful, but missing real behavior proof plus stale, incomplete, merge-conflicting code keeps the PR below merge-ready quality.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body and comments list tests/CI only; it still needs redacted after-fix proof such as terminal output, logs, a screenshot, or a recording showing /status or session metadata no longer over-reports the context window. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Full review comments:

  • [P2] Reuse current provider-aware context lookup before clamping — src/agents/context.ts:272-284
    The patch reconstructs modelTokens with a local provider-model scan and direct cache lookups. Current main now routes this through resolveConfiguredProviderContextTokens, which handles contextTokens, provider-level caps, exact-before-alias matching, explicit-provider guards, and cache-load options; rebasing this hunk as-is would drop those contracts and can report the wrong window for slash-containing or provider-specific model IDs. Build the clamp around the current helper/lookup order instead.
    Confidence: 0.88
  • [P2] Clamp the session and status paths too — src/agents/context.ts:288-289
    The return here only clamps the shared resolver result. Current main still has updateSessionStoreAfterAgentRun and statusSummaryRuntime.resolveContextTokensForModel choosing overrides before configured model windows, so the PR's claimed session metadata and /status fix can still over-report unless those bypasses are removed or capped too.
    Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.86

What I checked:

  • live-pr-state: Live GitHub metadata shows the PR is open, non-draft, merge-conflicting, and currently failing the Real behavior proof check. (90d96154d84c)
  • live-pr-file-list: The live PR file list is limited to TypeScript source/tests; the net diff no longer includes the workflow or secrets-baseline commits visible in the patch-series history. (90d96154d84c)
  • pr-head-local-resolver-scan: The PR head computes modelTokens with a local configured-window scan and cache lookups instead of the richer current-main provider-aware helper. (src/agents/context.ts:272, 90d96154d84c)
  • current-main-provider-aware-contract: Current main has a dedicated configured provider context-token helper that honors per-model contextTokens/contextWindow, provider-level caps, exact-before-alias lookup, and explicit-provider safeguards. (src/agents/context.ts:244, 7f4bd454febf)
  • current-main-shared-resolver-override-first: Current main still returns a positive contextTokensOverride before resolving configured or discovered model windows, so the reported bug remains source-reproducible. (src/agents/context.ts:364, 7f4bd454febf)
  • current-main-session-store-bypass: Session persistence still chooses params.contextTokensOverride before calling the shared resolver when runtime metadata did not already provide context tokens. (src/agents/command/session-store.ts:98, 7f4bd454febf)

Likely related people:

  • steipete: Peter Steinberger authored the runtime model contextTokens cap work in the shared resolver area that a rebase must preserve. (role: recent context-token contributor; confidence: high; commits: 58d2b9dd46cc; files: src/agents/context.ts)
  • Takhoffman: Tak Hoffman added the separate status summary provider context lookup and tests that remain an override-first path for this bug. (role: adjacent status-summary contributor; confidence: medium; commits: 2877a7d8b2f6; files: src/commands/status.summary.runtime.ts, src/commands/status.summary.runtime.test.ts)
  • neeravmakwana: Neerav Makwana added provider-aware context-window lookup in model selection, where current main already clamps agent context tokens to the model window. (role: adjacent model-selection contributor; confidence: medium; commits: 68d854cb9c65; files: src/auto-reply/reply/model-selection.ts, src/auto-reply/reply/model-selection.test.ts)
  • Vignesh Natarajan: Vignesh Natarajan added configured contextWindow override support in the shared resolver/tests, which is central to the cap being discussed. (role: configured context-window contributor; confidence: medium; commits: 150c5815eb81; files: src/agents/context.ts, src/agents/context.test.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 7f4bd454febf.

@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label Apr 26, 2026
@clawsweeper clawsweeper Bot added the P2 Normal backlog priority with limited blast radius. label May 16, 2026
@openclaw-barnacle openclaw-barnacle Bot added the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 16, 2026
@clawsweeper clawsweeper Bot added impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. and removed impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. labels May 17, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale Marked as stale due to inactivity label Jun 5, 2026
@clawsweeper

clawsweeper Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper 🐠 reef update

Thanks for the contribution. The source branch was not safely writable by ClawSweeper, so it opened a replacement PR and kept the credit trail visible.

Why replacement: ClawSweeper could not update the source PR branch directly; GitHub did not grant sufficient push rights to the bot for that branch.
Replacement PR: #90889
Why close: this run explicitly closes the superseded source PR after the credited replacement PR is open, so review continues in one place.
This closeout is intentional for this run: the replacement PR is now the active review lane.
The original contribution stays credited in the replacement PR context.
Co-author credit kept:

fish notes: model gpt-5.5, reasoning high; reviewed against 0ef2869.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S stale Marked as stale due to inactivity status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Status/session context window can over-report the selected model's actual window

1 participant