fix(agents): inject resolved OAuth bearer into boundary-aware embedded streams #73588
Greptile Summary
This PR fixes a 401 Unauthorized error that occurs when using
Confidence Score: 5/5
Safe to merge: minimal, targeted bug fix with thorough test coverage and no interface changes.
The change is small and surgical: a new private helper consolidates duplicated injection logic, and both branches are exercised by both old and new tests. Auth precedence (resolvedApiKey → authStorage → existing options.apiKey) is preserved exactly. The boundary-aware fast path (no auth) now also forwards the run abort signal, which is a safe improvement. No public API, exports, types, or transport internals were changed.
No files require special attention.
Reviews (1): Last reviewed commit: "fix(agents): inject resolved OAuth beare..."
Codex review: keeping this open for maintainer follow-up; there is still a little grit to resolve.
Keep open. This PR is member-authored and has the protected
Best possible solution: Keep this PR open for explicit maintainer review. The likely best path is to land a narrow
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against ab5c8025c9d0.
73a3773 to 5367533
7f3121c to 323917b
323917b to b3659c6
Landed as
…d streams (openclaw#73588) Fixes openclaw#73559. Extracts a shared wrapEmbeddedAgentStreamFn helper and applies it to both provider-owned and boundary-aware fallback paths in resolveEmbeddedAgentStreamFn, forwarding the resolved OAuth bearer (resolvedApiKey → authStorage → options.apiKey) and run abort signal so models routing through openai-codex-responses and other boundary-aware transports stop failing with 401 Missing bearer auth header.
Summary
`openai-codex/gpt-5.5` causes the embedded run to fail before reply with `unexpected status 401 Unauthorized: Missing bearer or basic authentication in header, url: https://api.openai.com/v1/responses`. Logs show three matching errors at the lane (`lane=main`), session-agent (`lane=session:agent:main:main`), and embedded-agent layers. The resolved Codex OAuth profile is present and validated (`openclaw models status` reports it as healthy), but the OpenAI Responses HTTP request goes out with no bearer header.

The defect is in `src/agents/pi-embedded-runner/stream-resolution.ts:124` (pre-fix), inside `resolveEmbeddedAgentStreamFn`. That function injects the resolved runtime `apiKey` (the OAuth bearer that the embedded run layer already resolved at `src/agents/pi-embedded-runner/run.ts:867` and forwarded as `params.resolvedApiKey` from `src/agents/pi-embedded-runner/run/attempt.ts:1611`) only on the provider-owned branch. On the boundary-aware fallback branch, taken whenever a model resolves to a transport-aware API like `openai-codex-responses` without a registered provider stream, the function simply executes `return boundaryAwareStreamFn;` and the resolved key is dropped. `createOpenAIResponsesTransportStreamFn` at `src/agents/openai-transport-stream.ts:738` then constructs the OpenAI client from `options?.apiKey || getEnvApiKey(model.provider) || ""`.
For OAuth-only providers like `openai-codex` there is no env var and `options.apiKey` was never injected, so the SDK is created with `apiKey: ""` and OpenAI rejects the request with the 401 above.

`gpt-5.5` is the first widely used Codex model that takes this exact `boundary-aware:openai-codex-responses` lane (the `openai-codex` plugin does not register a provider-owned stream for the Codex Responses API; see `extensions/openai/openai-codex-provider.ts:175`), which is why the regression surfaces specifically on the `openai-codex/gpt-5.5` upgrade and stays present after re-running OpenAI Codex OAuth login.

Extract the `resolvedApiKey` + run-signal injection logic from the provider-owned branch into a small private helper, `wrapEmbeddedAgentStreamFn`, and apply the same wrapper to the boundary-aware fallback. The provider-owned branch keeps its `stripSystemPromptCacheBoundary` context normalization (passed in as the optional `transformContext` callback). The boundary-aware branch deliberately omits that callback because boundary-aware transports (`createOpenAIResponsesTransportStreamFn`, `createAnthropicMessagesTransportStreamFn`, etc.) already strip the cache boundary internally (see `src/agents/openai-transport-stream.ts:254`, `:870`, `:1755`), so re-stripping would be a behavior change. Auth precedence inside the helper is preserved exactly: `resolvedApiKey?.trim()` wins, then `authStorage.getApiKey(provider)`, then any pre-existing `options.apiKey`.
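The precedence described above can be illustrated with a minimal sketch. This is not the code from `stream-resolution.ts`: the names `wrapStreamFnSketch`, `fakeTransport`, `Signal`, and `Sent` are hypothetical, the types are heavily simplified, and `getApiKey` is assumed to be synchronous here for brevity. It only shows how a wrapper of this shape could fill in `apiKey` and the run signal before a transport falls back to an empty key.

```typescript
// Hypothetical, simplified shapes; the real types in the repository differ.
type Signal = { aborted: boolean }; // stand-in for AbortSignal
type Model = { provider: string };
type StreamOptions = { apiKey?: string; signal?: Signal };
type Sent = { apiKey: string; signal?: Signal; context: string };
type StreamFn = (model: Model, context: string, options?: StreamOptions) => Sent;

interface WrapParams {
  resolvedApiKey?: string;
  // Assumption: synchronous lookup, purely for illustration.
  authStorage?: { getApiKey(provider: string): string | undefined };
  runSignal?: Signal;
  transformContext?: (context: string) => string;
}

// Sketch of a shared wrapper: resolvedApiKey, then authStorage, then options.apiKey.
function wrapStreamFnSketch(inner: StreamFn, params: WrapParams): StreamFn {
  return (model, context, options) => {
    const apiKey =
      params.resolvedApiKey?.trim() ||
      params.authStorage?.getApiKey(model.provider) ||
      options?.apiKey;
    // An explicit caller signal wins over the run-level signal.
    const signal = options?.signal ?? params.runSignal;
    const nextContext = params.transformContext
      ? params.transformContext(context)
      : context;
    return inner(model, nextContext, {
      ...options,
      ...(apiKey ? { apiKey } : {}),
      ...(signal ? { signal } : {}),
    });
  };
}

// Fake transport mimicking the `options?.apiKey || env || ""` chain.
// The env lookup is omitted: OAuth-only providers have no env var.
const fakeTransport: StreamFn = (_model, context, options) => ({
  apiKey: options?.apiKey || "",
  signal: options?.signal,
  context,
});

const codexModel: Model = { provider: "openai-codex" };
// Unwrapped (pre-fix fallback shape): the transport ends up with an empty key.
const unwrappedResult = fakeTransport(codexModel, "prompt");
// Wrapped: the resolved bearer and run signal reach the transport.
const wrappedResult = wrapStreamFnSketch(fakeTransport, {
  resolvedApiKey: " oauth-token ",
  runSignal: { aborted: false },
})(codexModel, "prompt");
console.log(unwrappedResult.apiKey === "", wrappedResult.apiKey);
```

Leaving `transformContext` optional is what lets one wrapper serve both branches in this sketch: a caller that needs context normalization passes the callback, and a caller whose transport already handles it passes nothing.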
This fix is the minimal change that closes the credential gap on every boundary-aware transport (Codex Responses, OpenAI Responses, OpenAI Completions, Azure OpenAI Responses, Anthropic Messages, Google Generative AI) without touching provider plugins, transport internals, OAuth refresh, or the Responses URL.

- `src/agents/pi-embedded-runner/stream-resolution.ts`: refactored `resolveEmbeddedAgentStreamFn` to delegate to a new private `wrapEmbeddedAgentStreamFn` helper; the boundary-aware fallback now wraps the resolved transport with the same auth/signal injection that the provider-owned branch already had.
- `src/agents/pi-embedded-runner/stream-resolution.test.ts`: added four focused regression tests covering resolved-key injection, `authStorage` fallback, run-signal forwarding, and cache-boundary preservation on the boundary-aware fallback path. Existing tests (provider-owned auth/signal, fallback shape labels) are unchanged.

Not changed:

- `extensions/openai/openai-codex-provider.ts`, the OpenAI Codex OAuth login flow, the model registry, and `auth-profiles.json` storage.
- `createOpenAIResponsesTransportStreamFn` and the other boundary-aware transports; the `apiKey || getEnvApiKey(model.provider) || ""` chain remains exactly as it was.
- No `any`, no widened types; `wrapEmbeddedAgentStreamFn` is a file-local helper.

Reproduction
1. main.
2. `openclaw models set openai-codex/gpt-5.5`
3. `openclaw models status`: confirms the OAuth profile resolves cleanly.
4. `openclaw gateway --port 18789`
5. main):

The new unit test `injects the resolved run api key into the boundary-aware Codex Responses fallback` reproduces the credential drop deterministically without any network or live OAuth.

Risk / Mitigation
- `openai-responses`, `openai-codex-responses`, `openai-completions`, `azure-openai-responses`, `anthropic-messages`, `google-generative-ai`). For non-OAuth providers that previously relied on `getEnvApiKey(model.provider)` inside the transport, an empty `params.resolvedApiKey` and missing `params.authStorage` would now still produce no `options.apiKey`, exactly as before. For env-keyed providers that do have an `authStorage` entry, `authStorage.getApiKey(provider)` returns the same key the transport would have read from env, so behavior is preserved. The auth precedence (`resolvedApiKey` → `authStorage` → existing `options.apiKey`) is bit-identical to the provider-owned branch that has been in production since `d12987d72`.
- `d12987d72`. Cancellation reaches the OpenAI SDK `AbortSignal` slot the same way; the existing test `does not overwrite an explicit provider-owned stream signal` proves explicit caller signals still win, and the new boundary-aware sibling test asserts the same precedence.
- `stream-resolution.test.ts` lock the contract: resolved-key injection, `authStorage` fallback, run-signal forwarding, and cache-boundary preservation on the boundary-aware path. The cache-boundary test directly proves the fix did not silently start re-stripping `<<openclaw-cache-boundary>>` markers (which would corrupt prompt-cache hit rates on Codex Responses). Existing provider-owned coverage is untouched and continues to assert the prior behavior end to end.

Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Fixes #73559