fix: [codex-harness] avoid treating cumulative app-server usage as current context#64669
Conversation
9cdfd51 to
8cf38e3
Compare
Greptile SummaryThis PR fixes a bug in the Codex app-server event projector where Confidence Score: 5/5Safe to merge — targeted bug fix with clear test coverage and no regressions to existing behavior. All three changed files are correct. The fix is precisely scoped to the Codex app-server projection boundary, does not touch generic session accounting, and is backed by three new test cases that directly prove the fix works. No P0 or P1 findings. No files require special attention. Reviews (1): Last reviewed commit: "fix: avoid cumulative codex usage as con..." | Re-trigger Greptile |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8cf38e36b7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return normalizeUsage({ | ||
| input: readNumberAlias(record, ["inputTokens", "input_tokens", "input", "promptTokens"]), | ||
| output: readNumberAlias(record, ["outputTokens", "output_tokens", "output"]), | ||
| cacheRead: readNumberAlias(record, [ |
There was a problem hiding this comment.
Preserve cache-aware normalization for current token usage
normalizeCodexTokenUsage pre-extracts fields into { input, output, cacheRead, ... } before calling normalizeUsage, which removes raw keys that the normalizer uses to correct OpenAI-style prompt totals (for example cached-token metadata). In thread/tokenUsage/updated payloads that report input_tokens plus cached-token counts, this can double-count cache reads in prompt/context math, so attemptUsage and downstream status percentages can still be inflated even though this change is meant to fix that class of issue.
Useful? React with 👍 / 👎.
|
Landed via temp rebase onto
Thanks @cyrusaf! |
Summary
tokenUsage.totalas assistant/attempt usage.totalTokensand make/statusshow values like999%.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
tokenUsage.totalas if it were the current turn/current call usage snapshot.tokenUsage.totalfixture and did not assert that cumulative totals must not populate assistant/attempt usage.usage,attemptUsage,lastCallUsage, andpromptTokensas current context-window snapshots, not cumulative thread/billing totals.Regression Test Plan (if applicable)
extensions/codex/src/app-server/event-projector.test.tsUser-visible / Behavior Changes
Codex-backed sessions should no longer persist cumulative app-server token totals as fresh context-window usage. This prevents inflated session context percentages in status output when only cumulative usage is available.
Diagram (if applicable)
Security Impact (required)
No)No)No)No)No)Yes, explain risk + mitigation: N/ARepro + Verification
Environment
Steps
thread/tokenUsage/updatednotification containing a large cumulativetokenUsage.totaland a smaller current/last usage object.attemptUsageand the final assistant messageusage.Expected
Actual
tokenUsage.totalwas projected into assistant/attempt usage.Evidence
Human Verification (required)
What I personally verified:
pnpm test extensions/codex/src/app-server/event-projector.test.tspnpm test extensions/codexpnpm checkReview Conversations
Compatibility / Migration
Yes)No)No)Risks and Mitigations