fix(codex): scale context engine projection#80761

Merged
jalehman merged 9 commits into
openclaw:mainfrom
electricsheephq:fix/codex-context-engine-projection-budget
May 11, 2026

Conversation

@100yenadmin 100yenadmin commented May 11, 2026

Contributor

Fixes #80760.

Native Codex already asks the OpenClaw context engine to assemble the runtime context, but the app-server projection layer then rendered that assembled result through a fixed 24k-character cap. That meant large LCM/frontier assemblies could be correctly selected upstream and still arrive at Codex as a much smaller quoted context block, which makes the runtime status look like context disappeared after restart or harness changes.

This PR makes the projection cap budget-aware for the active context-engine path:

  • keeps the old 24k-character cap when no runtime token budget is available
  • derives a larger rendered-context cap from the active contextTokenBudget
  • uses OpenClaw's normal 4 chars/token text estimate and the existing agents.defaults.compaction.reserveTokens / agents.defaults.compaction.reserveTokensFloor reserve surface, including the default 20k-token floor and small-context cap behavior
  • bounds the projection at 1m rendered characters so very large model windows do not blindly dump unbounded text
  • scales the per-text-part cap with the rendered-context cap so a large summary or recovered turn is not still clipped at 6k chars under a large runtime window
  • adds tests for fallback behavior, scaled large-window behavior, configured reserve mapping, configured reserve behavior at the active runCodexAppServerAttempt route, the hard upper bound, and the shared reserve-token cap behavior
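
The sizing rules listed above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in this PR: the function and constant names are hypothetical, the lower bound at the legacy cap and the proportional per-part scaling (keeping the old 6k:24k ratio) are assumptions, and the small-context reserve cap is not modeled. The arithmetic does agree with the numbers reported in the proof section below (258000-token budget with a 20k reserve gives 952000; 80000-token budget with a 12000 reserve gives 272000).

```typescript
// Hypothetical constants; values come from the PR description where stated.
const FALLBACK_MAX_CHARS = 24_000; // legacy fixed cap, used when no budget is known
const HARD_MAX_CHARS = 1_000_000; // upper bound on rendered context
const CHARS_PER_TOKEN = 4; // text estimate named in the PR description
const DEFAULT_RESERVE_TOKENS = 20_000; // default reserve floor named in the PR description
const LEGACY_PART_CHARS = 6_000; // legacy per-text-part cap

interface CapInput {
  contextTokenBudget?: number;
  reserveTokens?: number;
}

// Sketch: derive the rendered-context cap from the runtime token budget.
function sketchProjectionMaxChars(input: CapInput): number {
  const { contextTokenBudget, reserveTokens = DEFAULT_RESERVE_TOKENS } = input;
  if (
    contextTokenBudget === undefined ||
    !Number.isFinite(contextTokenBudget) ||
    contextTokenBudget <= 0
  ) {
    return FALLBACK_MAX_CHARS; // no usable budget: keep the old behavior
  }
  const usableTokens = Math.max(0, contextTokenBudget - reserveTokens);
  const scaledChars = usableTokens * CHARS_PER_TOKEN;
  // Assumption: never go below the legacy cap; always honor the hard bound.
  return Math.min(HARD_MAX_CHARS, Math.max(FALLBACK_MAX_CHARS, scaledChars));
}

// Sketch (assumption): scale the per-part cap by the legacy 6k:24k ratio.
function sketchPartMaxChars(maxRenderedChars: number): number {
  return Math.floor(maxRenderedChars * (LEGACY_PART_CHARS / FALLBACK_MAX_CHARS));
}

console.log(sketchProjectionMaxChars({ contextTokenBudget: 258_000 }));
console.log(sketchPartMaxChars(sketchProjectionMaxChars({ contextTokenBudget: 258_000 })));
```

Under these assumptions a 258k-token window leaves room for the whole 71k-character test assembly, which is why the proof output reports no truncation.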

The intent is deliberately narrow: let native Codex actually receive a proportionate slice of the context the engine already assembled, while honoring the same reserve knobs users already configure in openclaw.json. It does not add new runtime status telemetry or a visible LCM frontier-vs-provider-usage diagnostic; those are still useful follow-ups and are tracked upstream in Martian-Engineering/lossless-claw#658, but this patch fixes the send-side cap that caused the observed mismatch.

Validation:

node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/context-engine-projection.test.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts
pnpm tsgo:extensions:test

Real behavior proof

  • Behavior or issue addressed: Native Codex could receive a much smaller prompt than the LCM/context-engine frontier selected because the Codex app-server projection always clipped rendered context at 24k chars.
  • Real environment tested: Local OpenClaw checkout at /Volumes/LEXAR/repos/openclaw-codex-context-projection-pr, based on openclaw/openclaw main, with the incident reproduced on installed OpenClaw 2026.5.10-beta.5 against the live agent:main:main LCM session.
  • Exact steps or command run after this patch: Ran the actual Codex app-server projection module from this PR checkout using the observed native Codex context budget from the incident (258000 tokens), plus a configured-reserve case that exercises agents.defaults.compaction.reserveTokens / reserveTokensFloor.
  • Evidence after fix:
$ pnpm exec tsx -e '
import {
  projectContextEngineAssemblyForCodex,
  resolveCodexContextEngineProjectionMaxChars,
  resolveCodexContextEngineProjectionReserveTokens,
} from "./extensions/codex/src/app-server/context-engine-projection.ts";

const contextTokenBudget = 258000;
const cap = resolveCodexContextEngineProjectionMaxChars({ contextTokenBudget });

const assembledMessages = Array.from({ length: 12 }, (_, i) => ({
  role: "assistant",
  content: `${i}:${"x".repeat(5900)}`,
}));
const projected = projectContextEngineAssemblyForCodex({
  assembledMessages,
  originalHistoryMessages: [],
  prompt: "next",
  maxRenderedContextChars: cap,
});

const config = {
  agents: { defaults: { compaction: { reserveTokens: 12000, reserveTokensFloor: 0 } } },
};
const reserveTokens = resolveCodexContextEngineProjectionReserveTokens({ config });

console.log(JSON.stringify({
  contextTokenBudget,
  cap,
  promptChars: projected.promptText.length,
  truncated: projected.promptText.includes("[truncated "),
  configReserveTokens: reserveTokens,
  configuredReserveCap: resolveCodexContextEngineProjectionMaxChars({
    contextTokenBudget: 80000,
    reserveTokens,
  }),
}, null, 2));
'
{
  "contextTokenBudget": 258000,
  "cap": 952000,
  "promptChars": 71198,
  "truncated": false,
  "configReserveTokens": 12000,
  "configuredReserveCap": 272000
}
  • Observed result after fix: The same projection layer that previously defaulted to a 24k rendered-context cap now resolves a 952000-character cap for the observed 258k-token native Codex context window, and the projected prompt is not truncated for a 71k-character assembled context block. When the compaction reserve is explicitly configured, the projection cap changes accordingly.
  • What was not tested: I did not install this PR build into the live Telegram/Eva agent process; the live reproduction evidence remains from the installed beta before this patch, and this proof exercises the patched app-server projection path directly.

@openclaw-barnacle openclaw-barnacle Bot added extensions: codex size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 11, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 11, 2026

clawsweeper Bot commented May 11, 2026

Contributor

Codex review: needs changes before merge.

Summary
The PR scales Codex app-server context-engine projection caps from the active runtime context budget and existing compaction reserve settings, with focused projection and run-attempt tests.

Reproducibility: yes, at source level. Current main passes the runtime token budget into context-engine assembly, then renders the assembled Codex prompt through a fixed 24k-character projection cap.

Real behavior proof
Sufficient (terminal): The PR body includes terminal proof from the patched projection module showing the larger cap and no truncation for the reproduced large-budget case.

Next step before merge
A focused automated repair can keep the PR's budget-aware projection while correcting reserveTokensFloor-only handling and adding the missing regression coverage.

Security
Cleared: The final diff only changes Codex TypeScript projection sizing and tests; no security or supply-chain sensitive surface is touched.

Review findings

  • [P2] Preserve reserve headroom for floor-only config — extensions/codex/src/app-server/context-engine-projection.ts:97-99
Review details

Best possible solution:

Land a narrowed Codex projection fix that scales from contextTokenBudget while preserving reserveTokensFloor as a floor-only guard and leaving exact accounting telemetry to #80765.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level. Current main passes the runtime token budget into context-engine assembly, then renders the assembled Codex prompt through a fixed 24k-character projection cap.

Is this the best way to solve the issue?

No, not as written. Budget-aware projection is the right boundary, but the patch should keep the default reserve unless reserveTokens is explicitly configured so floor-only settings do not remove headroom.

Full review comments:

  • [P2] Preserve reserve headroom for floor-only config — extensions/codex/src/app-server/context-engine-projection.ts:97-99
    If reserveTokensFloor is set without reserveTokens, including 0 to disable only the floor guard, this returns that value as the entire projection reserve. That lets large Codex prompts consume headroom that reserveTokens is supposed to preserve; keep the default reserve unless reserveTokens is explicitly configured and add the floor-only regression test.
    Confidence: 0.9

Overall correctness: patch is incorrect
Overall confidence: 0.87
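
The floor-only behavior requested in the finding above could look something like the sketch below. It is a hedged illustration only: the function name, the config type, and the assumed 20k default for `reserveTokens` are hypothetical (the PR description states only the default 20k-token floor), and the real resolver in `context-engine-projection.ts` may be structured differently.

```typescript
// Hypothetical shape of agents.defaults.compaction for this sketch.
interface CompactionReserveConfig {
  reserveTokens?: number;
  reserveTokensFloor?: number;
}

const ASSUMED_DEFAULT_RESERVE_TOKENS = 20_000; // assumption, not confirmed by the PR
const ASSUMED_DEFAULT_RESERVE_FLOOR = 20_000; // default floor named in the PR description

// Sketch: the floor only raises the reserve; it never replaces it.
function sketchResolveReserveTokens(compaction: CompactionReserveConfig = {}): number {
  // Keep the default reserve unless reserveTokens is explicitly configured.
  const reserve = compaction.reserveTokens ?? ASSUMED_DEFAULT_RESERVE_TOKENS;
  // A floor of 0 disables only the floor guard, not the reserve itself.
  const floor = compaction.reserveTokensFloor ?? ASSUMED_DEFAULT_RESERVE_FLOOR;
  return Math.max(reserve, floor);
}

console.log(sketchResolveReserveTokens({ reserveTokensFloor: 0 }));
console.log(sketchResolveReserveTokens({ reserveTokens: 12_000, reserveTokensFloor: 0 }));
```

With this shape, a config that sets only `reserveTokensFloor: 0` still keeps the default headroom, while the explicit `reserveTokens: 12000` / `reserveTokensFloor: 0` case from the PR proof resolves to 12000.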

Acceptance criteria:

  • node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/context-engine-projection.test.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts
  • pnpm tsgo:extensions:test

What I checked:

Likely related people:

  • @steipete: git blame shows the current Codex app-server projection file and active run-attempt context-engine route were introduced in the current main history by commit 07e3fd5. (role: introduced behavior and recent area contributor; confidence: high; commits: 07e3fd5c9cfc; files: extensions/codex/src/app-server/context-engine-projection.ts, extensions/codex/src/app-server/run-attempt.ts, extensions/codex/src/app-server/run-attempt.context-engine.test.ts)
  • @jalehman: The live PR metadata lists @jalehman as assignee, and the PR branch includes multiple commits by Josh Lehman that isolate and align the Codex projection patch. (role: assigned reviewer and PR branch contributor; confidence: medium; commits: 33fa4665ffdd, dec5a37035b2, af605c8e6bbd; files: extensions/codex/src/app-server/context-engine-projection.ts, extensions/codex/src/app-server/run-attempt.ts)

Remaining risk / open question:

  • Focused validation still needs to run after the reserve-floor repair because this review was read-only.

Codex review notes: model gpt-5.5, reasoning high; reviewed against fe1f30bfd629.

@100yenadmin

Contributor Author

@steipete @jalehman from Eva 🖤

@100yenadmin

Contributor Author

@pashpashpash tagging also since you're over codex harness.

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@100yenadmin 100yenadmin force-pushed the fix/codex-context-engine-projection-budget branch from 0e55d18 to 62f1e3f Compare May 11, 2026 20:14
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@100yenadmin 100yenadmin force-pushed the fix/codex-context-engine-projection-budget branch from 62f1e3f to 61c9550 Compare May 11, 2026 20:20
@jalehman jalehman self-assigned this May 11, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@jalehman jalehman requested a review from a team as a code owner May 11, 2026 21:19
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@openclaw-barnacle openclaw-barnacle Bot added the channel: discord Channel integration: discord label May 11, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@openclaw-barnacle openclaw-barnacle Bot removed channel: discord Channel integration: discord channel: imessage Channel integration: imessage channel: matrix Channel integration: matrix channel: slack Channel integration: slack channel: telegram Channel integration: telegram channel: voice-call Channel integration: voice-call channel: whatsapp-web Channel integration: whatsapp-web channel: zalouser Channel integration: zalouser app: web-ui App: web-ui gateway Gateway runtime extensions: memory-core Extension: memory-core extensions: memory-lancedb Extension: memory-lancedb cli CLI command changes commands Command implementations docker Docker and sandbox tooling agents Agent runtime and tooling channel: feishu Channel integration: feishu extensions: anthropic extensions: openai channel: qa-channel Channel integration: qa-channel labels May 11, 2026
@jalehman

Contributor

Merged via squash.

Thanks @100yenadmin!

Labels

  • extensions: codex
  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • proof: supplied (External PR includes structured after-fix real behavior proof.)
  • size: M

Development

Successfully merging this pull request may close these issues.

Codex context-engine projection caps LCM output to 24k chars, hiding full-fit context

2 participants