Commit 9fdd56d

fix(openai): require api-key auth for realtime voice (#91567)
* fix(openai): require api-key auth for realtime voice
* test(plugin-sdk): avoid auth profile store shadowing
1 parent 4c55dd8 commit 9fdd56d

6 files changed

Lines changed: 256 additions & 272 deletions

docs/providers/openai.md

Lines changed: 16 additions & 18 deletions
@@ -101,21 +101,19 @@ explicit runtime config.
 Control UI Talk with `talk.realtime.provider: "openai"`) goes through the
 public **OpenAI Platform Realtime API**, which is billed against OpenAI
 Platform credits rather than Codex/ChatGPT subscription quota. An account
-with healthy OpenAI OAuth that runs Codex-backed chat models without
-issue can still hit `insufficient_quota` / "You exceeded your current
-quota" on the first Realtime turn if the same OpenAI organization has no
-Platform billing set up.
+with healthy OpenAI OAuth that runs Codex-backed chat models without issue
+still needs an OpenAI API-key auth profile or a Platform API key with funded
+Platform billing for Realtime voice.
 
 Fix: top up Platform credits at
 [platform.openai.com/account/billing](https://platform.openai.com/account/billing)
-for the organization backing your realtime credentials. Realtime accepts
-either a Platform `OPENAI_API_KEY` (configured via `talk.realtime.providers.openai.apiKey`
-for Control UI Talk, or `plugins.entries.voice-call.config.realtime.providers.openai.apiKey`
-for Voice Call) or an `openai` OAuth profile whose underlying
-organization has Platform billing — both routes mint Realtime client secrets
-through the Platform API, so either way the org needs funded Platform
-credits. For chat turns you can still use Codex-backed `openai/*` models against the same
-OpenClaw install; Realtime is the one route that needs Platform billing.
+for the organization backing your realtime credentials. Realtime voice accepts
+the `openai` API-key auth profile created by `openclaw onboard --auth-choice openai-api-key`,
+a Platform `OPENAI_API_KEY` configured via `talk.realtime.providers.openai.apiKey`
+for Control UI Talk, `plugins.entries.voice-call.config.realtime.providers.openai.apiKey`
+for Voice Call, or the `OPENAI_API_KEY` environment variable. OpenAI OAuth
+profiles can still run Codex-backed `openai/*` chat models in the same
+OpenClaw install, but they do not configure Realtime voice.
 </Note>
 
 ## Memory embeddings
@@ -646,7 +644,7 @@ Legacy `plugins.entries.openai.config.personality` is still read as a compatibil
 ```
 
 <Note>
-Set `OPENAI_TTS_BASE_URL` to override the TTS base URL without affecting the chat API endpoint. OpenAI TTS is still configured through an API key; for OAuth-only live talk-back, use the Realtime voice path instead of agent-mode STT -> TTS speech.
+Set `OPENAI_TTS_BASE_URL` to override the TTS base URL without affecting the chat API endpoint. OpenAI TTS and Realtime voice are both configured through an OpenAI Platform API key; OAuth-only installs can still use Codex-backed chat models, but not OpenAI live talk-back.
 </Note>
 
 </Accordion>
@@ -717,7 +715,7 @@ Legacy `plugins.entries.openai.config.personality` is still read as a compatibil
 | Silence duration | `...openai.silenceDurationMs` | `500` |
 | Prefix padding | `...openai.prefixPaddingMs` | `300` |
 | Reasoning effort | `...openai.reasoningEffort` | (unset) |
-| Auth | `...openai.apiKey`, `OPENAI_API_KEY`, or `openai` OAuth | Browser Talk and non-Azure backend bridges can use OpenAI OAuth |
+| Auth | `openai` API-key auth profile, `...openai.apiKey`, or `OPENAI_API_KEY` | OpenAI Platform API key required; OpenAI OAuth does not configure Realtime voice |
 
 Available built-in Realtime voices for `gpt-realtime-2`: `alloy`, `ash`,
 `ballad`, `coral`, `echo`, `sage`, `shimmer`, `verse`, `marin`, `cedar`.
@@ -739,10 +737,10 @@ Legacy `plugins.entries.openai.config.personality` is still read as a compatibil
 <Note>
 Control UI Talk uses OpenAI browser realtime sessions with a Gateway-minted
 ephemeral client secret and a direct browser WebRTC SDP exchange against the
-OpenAI Realtime API. When no direct OpenAI API key is configured, the
-Gateway can mint that client secret with the selected `openai` OAuth
-profile. Gateway relay and Voice Call backend realtime WebSocket bridges use
-the same OAuth fallback for native OpenAI endpoints. Maintainer live
+OpenAI Realtime API. The Gateway mints that client secret with the selected
+`openai` API-key auth profile or configured OpenAI Platform API key. Gateway
+relay and Voice Call backend realtime WebSocket bridges use the same
+API-key-only auth path for native OpenAI endpoints. Maintainer live
 verification is available with
 `OPENAI_API_KEY=... GEMINI_API_KEY=... node --import tsx scripts/dev/realtime-talk-live-smoke.ts`;
 the OpenAI legs verify both the backend WebSocket bridge and the browser

docs/web/control-ui.md

Lines changed: 1 addition & 1 deletion
@@ -201,7 +201,7 @@ Activity entries keep only sanitized summaries and redacted, truncated output pr
 
 </Accordion>
 <Accordion title="Talk mode (browser realtime)">
-Talk mode uses a registered realtime voice provider. Configure OpenAI with `talk.realtime.provider: "openai"` plus either `talk.realtime.providers.openai.apiKey`, `OPENAI_API_KEY`, or an `openai` OAuth profile; configure Google with `talk.realtime.provider: "google"` plus `talk.realtime.providers.google.apiKey`. For hosted GPT realtime models, OpenClaw prefers the `openai` OAuth profile before `OPENAI_API_KEY`; an explicit OpenAI realtime `apiKey` remains the advanced override. The browser never receives a standard provider API key. OpenAI receives an ephemeral Realtime client secret for WebRTC. Google Live receives a one-use constrained Live API auth token for a browser WebSocket session, with instructions and tool declarations locked into the token by the Gateway. Providers that only expose a backend realtime bridge run through the Gateway relay transport, so credentials and vendor sockets stay server-side while browser audio moves through authenticated Gateway RPCs. The Realtime session prompt is assembled by the Gateway; `talk.client.create` does not accept caller-provided instruction overrides.
+Talk mode uses a registered realtime voice provider. Configure OpenAI with `talk.realtime.provider: "openai"` plus an `openai` API-key auth profile, `talk.realtime.providers.openai.apiKey`, or `OPENAI_API_KEY`; OpenAI OAuth profiles do not configure Realtime voice. Configure Google with `talk.realtime.provider: "google"` plus `talk.realtime.providers.google.apiKey`. The browser never receives a standard provider API key. OpenAI receives an ephemeral Realtime client secret for WebRTC. Google Live receives a one-use constrained Live API auth token for a browser WebSocket session, with instructions and tool declarations locked into the token by the Gateway. Providers that only expose a backend realtime bridge run through the Gateway relay transport, so credentials and vendor sockets stay server-side while browser audio moves through authenticated Gateway RPCs. The Realtime session prompt is assembled by the Gateway; `talk.client.create` does not accept caller-provided instruction overrides.
 
 The Chat composer includes a Talk options button next to the Talk start/stop button. The options apply to the next Talk session and can override provider, transport, model, voice, reasoning effort, VAD threshold, silence duration, and prefix padding. When an option is blank, the Gateway uses configured defaults where available or the provider default. Selecting Gateway relay forces the backend relay path; selecting WebRTC keeps the session client-owned and fails instead of silently falling back to relay if the provider cannot create a browser session.
 
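The API-key-only rule described in the changed docs can be illustrated with a minimal sketch. This is not the code from this commit: the type names, the `resolveRealtimeApiKey` function, and the profile shape are all hypothetical, and the lookup order (explicit config key, then API-key auth profile, then `OPENAI_API_KEY`) is an assumption, since the updated docs list the accepted sources without stating a precedence. The one behavior taken from the docs is that an OpenAI OAuth profile alone never yields Realtime voice credentials.

```typescript
// Hypothetical shapes for illustration only; none of these names are taken
// from the OpenClaw codebase.
type AuthProfile = {
  provider: string;
  kind: "api-key" | "oauth";
  apiKey?: string;
};

interface RealtimeAuthInputs {
  // e.g. the value of talk.realtime.providers.openai.apiKey
  configApiKey?: string;
  // stored auth profiles, such as one created during onboarding
  profiles: AuthProfile[];
  // process environment (or a stand-in for tests)
  env: Record<string, string | undefined>;
}

type RealtimeAuthResult =
  | { ok: true; source: "config" | "profile" | "env"; apiKey: string }
  | { ok: false; reason: string };

// Assumed precedence: explicit config key, then API-key profile, then env var.
function resolveRealtimeApiKey(inputs: RealtimeAuthInputs): RealtimeAuthResult {
  const configKey = inputs.configApiKey?.trim();
  if (configKey) {
    return { ok: true, source: "config", apiKey: configKey };
  }

  const profile = inputs.profiles.find(
    (p) => p.provider === "openai" && p.kind === "api-key" && Boolean(p.apiKey?.trim()),
  );
  if (profile?.apiKey) {
    return { ok: true, source: "profile", apiKey: profile.apiKey.trim() };
  }

  const envKey = inputs.env.OPENAI_API_KEY?.trim();
  if (envKey) {
    return { ok: true, source: "env", apiKey: envKey };
  }

  // OAuth profiles are deliberately never used as a fallback here.
  const hasOAuth = inputs.profiles.some(
    (p) => p.provider === "openai" && p.kind === "oauth",
  );
  return {
    ok: false,
    reason: hasOAuth
      ? "OpenAI OAuth profiles do not configure Realtime voice; add a Platform API key."
      : "No OpenAI Platform API key is configured for Realtime voice.",
  };
}
```

Under these assumptions, calling `resolveRealtimeApiKey({ profiles: [{ provider: "openai", kind: "oauth" }], env: {} })` returns a failure result with an explanatory reason instead of silently falling back to OAuth, which mirrors the behavior change the docs describe.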