feat(codex): surface pre-turn projection accounting (#80765) #80778
aiZKP wants to merge 1 commit into
Adds a `stats` block to the Codex context-engine projection so callers can distinguish LCM/frontier sizing from the rendered Codex prompt and from post-turn provider-observed usage. The block carries `projectedPromptChars`, `promptTokens`, an `accounting: "estimated" | "exact"` marker, the active `capChars`, and (when routed through) the configured compaction `reserveTokens` knob. The projection accepts an optional `tokenize` callback so a provider/runtime tokenizer can flip stats to `exact` when available; without one the existing 4-chars/token heuristic is used and accounting is explicitly marked `estimated`. The Codex app-server run-attempt now resolves `agents.defaults.compaction.reserveTokens` (falling back to `reserveTokensFloor`) and emits a `codex_app_server.context_projection` telemetry event alongside the existing post-turn usage signals. Closes openclaw#80765
Codex review: needs real behavior proof before merge.
Reviewed May 27, 2026, 12:59 AM ET / 04:59 UTC.
Summary
PR surface: Source +138, Tests +78. Total +216 across 3 files.
Reproducibility: yes, from source, but not from a live run: current main returns no projection stats and has no
Review metrics: 1 noteworthy metric.
Merge readiness: overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Review details
Best possible solution: Rebase onto current main, layer stats and the telemetry event onto the existing budget-aware projection/reserve helpers, and add redacted real Codex app-server output showing the emitted event values.
Do we have a high-confidence way to reproduce the issue? Yes from source, but not from a live run: current main returns no projection stats and has no
Is this the best way to solve the issue? No, as currently submitted. The useful approach is to add the accounting seam, but it must be rebased onto the current budget-aware projection path and report the actual active cap/reserve values.
Full review comments:
Overall correctness: patch is incorrect
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 44c1cc8285c8.
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
This pull request has been automatically marked as stale due to inactivity. |
@aiZKP thanks for the PR. ClawSweeper is still waiting on real behavior proof before this can move forward. Useful proof can be a screenshot, short video, terminal output, copied live output, linked artifact, or redacted logs that show the changed behavior after the fix. Please redact private tokens, phone numbers, private endpoints, customer data, and anything else sensitive. Once proof is added to the PR body or a comment, ClawSweeper or a maintainer can re-check it. |
Summary
Closes #80765.
Codex's context-engine projection previously sized the rendered prompt with the
generic `4 chars/token` heuristic and exposed nothing about that estimate
downstream. Status/LCM diagnostics could not separate frontier tokens
selected by the context engine, rendered Codex projection chars/tokens
before send, and provider-observed usage after the turn.
This PR adds a small pre-turn accounting snapshot to the projection and routes
it into agent telemetry:
- `projectContextEngineAssemblyForCodex` now returns a `stats` block:
  - `projectedPromptChars`: length of the rendered Codex prompt
  - `promptTokens`: tokenizer-backed when supplied, heuristic otherwise
  - `accounting: "estimated" | "exact"`: explicit marker
  - `capChars`: active rendered-context cap (currently `24_000`)
  - `reserveTokens`: surfaced when the caller routes the configured `agents.defaults.compaction.reserveTokens` / `reserveTokensFloor` through
- An optional `tokenize?: (text: string) => number | undefined` parameter lets a future Codex app-server / provider tokenizer flip the marker to `exact` without changing call sites. Throwing or non-finite returns fall back to the heuristic.
- `run-attempt.ts` resolves `agents.defaults.compaction.reserveTokens` (falling back to `reserveTokensFloor`) and emits a new `codex_app_server.context_projection` agent event before `turn/start` on both the context-engine and mirrored-history projection paths.
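The accounting rules in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the field names come from the description, while `computeProjectionStats`, its options object, and the rounding choice (`Math.ceil`) are assumptions made for the example.

```typescript
// Hypothetical sketch; only the stats field names are taken from the PR description.
type Accounting = "estimated" | "exact";

interface ProjectionStats {
  projectedPromptChars: number;
  promptTokens: number;
  accounting: Accounting;
  capChars: number;
  reserveTokens?: number;
}

// The generic 4 chars/token heuristic mentioned in the summary.
const CHARS_PER_TOKEN = 4;

function computeProjectionStats(
  prompt: string,
  capChars: number,
  options: {
    tokenize?: (text: string) => number | undefined;
    reserveTokens?: number;
  } = {},
): ProjectionStats {
  let exactTokens: number | undefined;
  if (options.tokenize) {
    try {
      const counted = options.tokenize(prompt);
      // Accept only finite numeric counts; anything else uses the heuristic.
      if (typeof counted === "number" && Number.isFinite(counted)) {
        exactTokens = counted;
      }
    } catch {
      // A throwing tokenizer falls back to the heuristic.
    }
  }

  const stats: ProjectionStats = {
    projectedPromptChars: prompt.length,
    // Rounding up is an assumption of this sketch.
    promptTokens: exactTokens ?? Math.ceil(prompt.length / CHARS_PER_TOKEN),
    accounting: exactTokens === undefined ? "estimated" : "exact",
    capChars,
  };
  if (options.reserveTokens !== undefined) {
    stats.reserveTokens = options.reserveTokens;
  }
  return stats;
}
```

Under these assumptions, calling `computeProjectionStats(prompt, 24_000)` without a tokenizer always yields `accounting: "estimated"`, which is the current behavior the PR describes for existing callers.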
Existing behavior (24k char cap, prompt rendering, duplicate trailing-prompt
trim, developer-instruction addition, `prePromptMessageCount`) is unchanged.
Acceptance criteria
- is supplied; otherwise marks accounting as `estimated`.
- (`frontierTokens` on the emitted event, equal to `contextTokenBudget`)
- (`projectedPromptChars` / `promptTokens` / `accounting`)
- (existing `afterTurn` `runtimeContext.lastCallUsage` / `promptCache`)
- fields surface through projection stats.
Files touched
- `extensions/codex/src/app-server/context-engine-projection.ts`
- `extensions/codex/src/app-server/run-attempt.ts`
- `extensions/codex/src/app-server/context-engine-projection.test.ts`

No SDK contract, no public manifest, no docs/changelog surface changed.
Test plan
- `pnpm test extensions/codex/src/app-server/context-engine-projection.test.ts`: 10 passed (5 new + 5 existing)
- `pnpm test extensions/codex/src/app-server/run-attempt.context-engine.test.ts`: 6 passed
- `pnpm check:changed` (extension prod + extension test lanes): typecheck, oxlint, format, runtime sidecar guard, import-cycle check all green
- `run-attempt.test.ts > does not expose OpenClaw Tool Search controls through Codex dynamic tools`) times out; verified to fail identically on `main` without these changes, so it is pre-existing and unrelated to this PR.
Notes for reviewers
- `MAX_RENDERED_CONTEXT_CHARS = 24_000`) is intentionally unchanged here. Making it budget-aware via `contextTokenBudget` / `reserveTokens` is tracked by fix(codex): scale context engine projection #80761; this PR is the accounting follow-up only.
- `tokenize` parameter is a no-op until a Codex/provider tokenizer is wired in. The acceptance criterion ("exact when the runtime/tokenizer surface supports it") is satisfied by the seam plus the explicit `estimated` marker; no behavior change for current callers.
- `Record<string, unknown>`: consumers that already subscribe to `onAgentEvent` see a new `stream` value but the existing envelope shape is preserved.
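For reviewers who want a concrete picture of the emitted event, here is a rough sketch of the reserve-token resolution and payload assembly described above. The stream name, the `reserveTokens` / `reserveTokensFloor` fallback, and `frontierTokens` equal to `contextTokenBudget` come from the PR text; the `AgentEvent` envelope shape and both helper names are hypothetical and may not match `run-attempt.ts`.

```typescript
// Hypothetical types and helpers; not the actual run-attempt.ts code.
interface CompactionConfig {
  reserveTokens?: number;
  reserveTokensFloor?: number;
}

interface ProjectionSnapshot {
  projectedPromptChars: number;
  promptTokens: number;
  accounting: "estimated" | "exact";
  capChars: number;
}

// Assumed envelope: a stream name plus a loosely typed payload.
interface AgentEvent {
  stream: string;
  data: Record<string, unknown>;
}

// Prefer the configured reserveTokens, falling back to reserveTokensFloor.
function resolveReserveTokens(compaction?: CompactionConfig): number | undefined {
  return compaction?.reserveTokens ?? compaction?.reserveTokensFloor;
}

function buildContextProjectionEvent(
  snapshot: ProjectionSnapshot,
  contextTokenBudget: number,
  compaction?: CompactionConfig,
): AgentEvent {
  const reserveTokens = resolveReserveTokens(compaction);
  return {
    stream: "codex_app_server.context_projection",
    data: {
      // Frontier sizing selected by the context engine.
      frontierTokens: contextTokenBudget,
      // Rendered-prompt accounting taken before the turn is sent.
      ...snapshot,
      // Included only when a reserve value is configured.
      ...(reserveTokens === undefined ? {} : { reserveTokens }),
    },
  };
}
```

Keeping the payload as `Record<string, unknown>` means a consumer that ignores unknown stream values would be unaffected by the new event, assuming the envelope looks like the one sketched here.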
Refs: #80765