fix(codex): classify app-server auth refresh failures#81638
Conversation
|
Codex review: needs maintainer review before merge. Summary Reproducibility: Do we have a high-confidence way to reproduce the issue? Partially: the PR adds source-level regression cases for the known Codex app-server JSON-RPC error shapes, but no live expired/logged-out Codex production account reproduction was performed. Real behavior proof Next step before merge Security Review detailsBest possible solution: Let maintainers finish normal review for the draft PR and land it only if they accept the Codex app-server auth-refresh classification behavior. Do we have a high-confidence way to reproduce the issue? Do we have a high-confidence way to reproduce the issue? Partially: the PR adds source-level regression cases for the known Codex app-server JSON-RPC error shapes, but no live expired/logged-out Codex production account reproduction was performed. Is this the best way to solve the issue? Is this the best way to solve the issue? Yes, based on the provided diff: it reuses the existing OpenClaw auth-refresh classification and only extracts actionable Codex relogin detail from the app-server error payload instead of adding a new user-facing mode. Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 256377c029f6. Re-review progress:
|
0d70c0b to
b78fd75
Compare
|
Verification before merge: Behavior addressed: Codex native/app-server token-refresh logout and relogin failures now classify as authentication refresh failures instead of surfacing as raw runtime errors. |
Summary
error.data.detailon JSON-RPC errors before the genericfailed to load configurationtext hides itRoot Cause
OpenAI Codex exposes this failure in more than one shape. Native managed auth can surface the raw
Your access token could not be refreshed...message, while app-server startup/config paths can put the actionable relogin text in JSON-RPCerror.data.detail. OpenClaw only classified the wrapped OAuth refresh path and mostly formatted app-server RPC errors fromerror.message, so users could see a generic runtime/config failure instead of re-auth guidance.Verification
Behavior addressed: Codex app-server auth refresh/logout/relogin failures now classify as
auth_refreshand format as re-authentication guidance.Real environment tested: Blacksmith Testbox
tbx_01krj782ckry3qkyd747va5n1x(quick-krill) plus local targeted Vitest throughscripts/test-projects.mjs.Exact steps or command run after this patch:
node scripts/test-projects.mjs extensions/codex/src/app-server/client.test.ts src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts -- --reporter=dot;git diff --check origin/main...HEAD;node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox --blacksmith-org openclaw --blacksmith-workflow .github/workflows/ci-check-testbox.yml --blacksmith-job check --blacksmith-ref main --idle-timeout 90m --ttl 240m --timing-json --shell -- "env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 OPENCLAW_TESTBOX=1 OPENCLAW_TESTBOX_REMOTE_RUN=1 pnpm check:changed".Evidence after fix: targeted tests passed,
git diff --checkpassed, and Testboxpnpm check:changedcompleted with exit0in3m25.921s; backing run: https://github.com/openclaw/openclaw/actions/runs/25839278713.Observed result after fix: exact Codex logout/account-switch and app-server relogin-detail payloads produce the existing
Authentication refresh failed. Re-authenticate this provider and try again.user-facing path instead of falling through as raw runtime/config errors.What was not tested: a live expired/logged-out Codex account against OpenAI production, because reproducing that safely would require mutating a real auth session. The regression is based on current OpenAI Codex source behavior and app-server protocol/error shapes.