Skip to content

fix(auth): harden codex auth probes#75272

Closed
ecochran76 wants to merge 3 commits into
openclaw:mainfrom
ecochran76:fix/codex-auth-probes
Closed

fix(auth): harden codex auth probes#75272
ecochran76 wants to merge 3 commits into
openclaw:mainfrom
ecochran76:fix/codex-auth-probes

Conversation

@ecochran76

Copy link
Copy Markdown

Summary

  • Problem: models status --probe can report false auth/format failures for OpenAI Codex OAuth profiles even when the selected profile is valid.
  • Why it matters: the probe path is the operator-facing way to confirm whether a profile needs re-auth; false failures can send users into unnecessary login loops.
  • What changed: run auth probes through the PI harness in raw model-run mode, forward resolved profile credentials through the embedded agent API-key seam, keep Codex Responses on the boundary-aware transport even when a session custom stream exists, and provide required fallback instructions for native Codex Responses requests.
  • What did NOT change (scope boundary): no auth storage format changes, no profile-order changes, no external CLI credential discovery changes, and no new provider/network surface.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: the auth probe path used a non-raw embedded run shape that could retain session/custom transport behavior and did not reliably expose the resolved auth profile credential through the PI agent API-key seam.
  • Missing detection / guardrail: existing tests covered status/probe mapping and default boundary-aware fallbacks, but not the selected-profile probe call shape or Codex Responses with custom session streams.
  • Contributing context (if known): native ChatGPT Codex Responses requests require a non-empty instructions field, and probes can have an intentionally empty system prompt.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/commands/models/list.probe.test.ts, src/agents/pi-embedded-runner/stream-resolution.test.ts, src/agents/openai-transport-stream.test.ts
  • Scenario the test should lock in: selected-profile OpenAI Codex probes use the PI harness and raw model-run mode, Codex Responses keeps the boundary-aware stream over custom session streams, and native Codex Responses probes always send required instructions.
  • Why this is the smallest reliable guardrail: it exercises the call-shape and transport seams without a live provider dependency.
  • Existing test that already covers this (if any): N/A
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

openclaw models status --probe --probe-provider openai-codex --probe-profile <profile> should be less likely to return false auth/format failures for valid OpenAI Codex OAuth profiles.

Diagram (if applicable)

Before:
models status --probe -> embedded run with session/custom transport -> missing or malformed Codex request -> false probe failure

After:
models status --probe -> PI raw model run -> resolved profile credential + boundary-aware Codex transport -> accurate probe result

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) Yes
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: resolved credentials are forwarded only to the matching run provider via the existing API-key seam; no credential material is logged or exposed in tests.

Repro + Verification

Environment

  • OS: Linux / WSL2
  • Runtime/container: Node 24, pnpm workspace
  • Model/provider: openai-codex/gpt-5.5
  • Integration/channel (if any): CLI auth probe
  • Relevant config (redacted): user-scoped auth profile store with an OpenAI Codex OAuth profile

Steps

  1. Configure an OpenAI Codex OAuth profile.
  2. Run openclaw models status --agent graphiti-agent --probe --probe-provider openai-codex --probe-profile openai-codex:soylei --probe-timeout 90000 --probe-max-tokens 16 --json.
  3. Inspect the probe result.

Expected

  • A valid profile reports status: ok.

Actual

  • Before this fix, the same valid profile could report auth/format failure because the probe run did not consistently reach the Codex Responses transport with the resolved profile credential and native-compatible payload shape.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: targeted unit/seam tests for Codex transport params, stream-resolution routing, and probe run call shape.
  • Edge cases checked: native ChatGPT Codex backend vs custom Codex-compatible proxy behavior; custom session stream preservation for non-Codex providers.
  • What you did not verify: live provider probe on this exact upstream-main worktree, because local user config contains downstream-only keys rejected by upstream main; the same fix was live-verified on the downstream runtime before extracting this minimal upstream patch.

Commands run:

  • pnpm docs:list
  • pnpm install
  • pnpm exec oxfmt --check --threads=1 src/agents/openai-transport-stream.ts src/agents/openai-transport-stream.test.ts src/agents/pi-embedded-runner/run/attempt.ts src/agents/pi-embedded-runner/stream-resolution.ts src/agents/pi-embedded-runner/stream-resolution.test.ts src/commands/models/list.probe.ts src/commands/models/list.probe.test.ts
  • pnpm test src/agents/openai-transport-stream.test.ts src/agents/pi-embedded-runner/stream-resolution.test.ts src/commands/models/list.probe.test.ts
  • pnpm check:changed --staged

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: forcing Codex Responses over a custom session stream could surprise a caller that expected a generic custom stream for Codex.
    • Mitigation: the override is limited to openai-codex-responses, which needs provider-specific credential/payload handling; non-Codex custom session streams keep existing behavior.

@openclaw-barnacle openclaw-barnacle Bot added commands Command implementations agents Agent runtime and tooling size: S labels Apr 30, 2026
@clawsweeper

clawsweeper Bot commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.

This PR should close because its central Codex auth-probe fix is superseded by a narrower merged implementation on current main, while this branch remains dirty and carries broader auth/session changes without upstream real behavior proof.

Canonical path: Keep the merged current-main implementation from #87559 as the canonical fix, and open a fresh narrow follow-up only if maintainers want a separately approved change to Codex custom stream ownership.

So I’m closing this here and keeping the remaining discussion on #87559.

Review details

Best possible solution:

Keep the merged current-main implementation from #87559 as the canonical fix, and open a fresh narrow follow-up only if maintainers want a separately approved change to Codex custom stream ownership.

Do we have a high-confidence way to reproduce the issue?

No, not as a current-main failure: current main already contains the merged auth-probe fix. The original bug is source-reproducible from the PR and replacement evidence, but I did not run a live OAuth probe in this read-only review.

Is this the best way to solve the issue?

No. The merged replacement is the better solution because it fixes the selected-profile raw probe and native instructions fallback without this branch's broader custom stream override and session auth hook mutation.

Security review:

Security review needs attention: The branch changes token handling and introduces a concrete stale-credential/session-auth risk that the merged replacement avoids.

  • [medium] Resolved token can persist on the session — src/agents/pi-embedded-runner/run/attempt.ts:1801
    The PR mutates activeSession.agent.getApiKey with a selected resolved token and does not show a restore path, so a later run may reuse credential material outside the intended attempt scope.
    Confidence: 0.77

AGENTS.md: found and applied where relevant.

What I checked:

  • Current main raw Codex probe path: Current main pins OpenAI Codex auth probes to the built-in OpenClaw runtime and model-run mode by passing agentHarnessId, agentHarnessRuntimeOverride, authProfileId, and modelRun into runEmbeddedAgent. (src/commands/models/list.probe.ts:525, b352cb2d8e7f)
  • Current main probe regression coverage: The current test suite asserts the selected openai-codex profile is forwarded with the OpenClaw runtime override and modelRun=true. (src/commands/models/list.probe.test.ts:94, b352cb2d8e7f)
  • Current main native Codex instructions fallback: Current main resolves empty native Codex Responses instructions to the default instruction while leaving non-native/custom Codex-compatible backends able to omit the fallback. (src/agents/openai-transport-stream.ts:1997, b352cb2d8e7f)
  • Merged replacement provenance: The related replacement PR fix(auth): harden Codex auth probes #87559 merged as commit 37c5003 and changed the same probe/transport test surface without the extra branch churn. (37c5003ed966)
  • Current-main-only release status: The latest release commit v2026.5.27 predates the merged replacement, so the superseding fix is on current main and not proven shipped in that release. (27ae826f6525)
  • Superseded branch risk: The submitted branch also mutates activeSession.agent.getApiKey for resolved credentials without a visible restore, which is broader than the merged replacement path and not needed for the current-main fix. (src/agents/pi-embedded-runner/run/attempt.ts:1801, 25dc2a3fc7e1)

Likely related people:

  • nxmxbbd: Auth-probe PR fix(auth): harden Codex auth probes #87559 merged the narrower current-main implementation and supplied real CLI probe proof for the same central problem. (role: merged replacement author; confidence: high; commits: 37c5003ed966; files: src/commands/models/list.probe.ts, src/commands/models/list.probe.test.ts, src/agents/openai-transport-stream.ts)
  • vincentkoc: Current-main blame for the relevant auth probe, Codex instructions, and embedded stream-resolution lines points to the recent deb48a9 refactor carrying this surface in the checked-out tree. (role: recent area contributor; confidence: medium; commits: deb48a96fb8d; files: src/commands/models/list.probe.ts, src/agents/openai-transport-stream.ts, src/agents/embedded-agent-runner/stream-resolution.ts)
  • steipete: The related ClawSweeper repair context for fix: CLI auth status/probe path regression #75223 ties the CLI auth status/probe regression cluster to commit 581fbea. (role: adjacent regression provenance; confidence: low; commits: 581fbea1d653; files: src/commands/models/list.probe.ts, src/agents/auth-profiles/external-cli-sync.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against b352cb2d8e7f; fix evidence: commit 37c5003ed966, main fix timestamp 2026-05-29T01:58:48+01:00.

@barnacle-openclaw

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@barnacle-openclaw barnacle-openclaw Bot added the stale Marked as stale due to inactivity label May 30, 2026
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. labels May 30, 2026
@clawsweeper

clawsweeper Bot commented May 30, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper applied the proposed close for this PR.

@clawsweeper clawsweeper Bot closed this May 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling commands Command implementations merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S stale Marked as stale due to inactivity status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant