You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
fix(voice-call): stabilize Twilio STT startup (#75257)
Fix Twilio voice-call startup so accepted media streams register immediately, realtime transcription readiness gates only the initial greeting, and early inbound media is preserved while STT connects.
Fixes#75197.
Thanks @PfanP and @donkeykong91.
Copy file name to clipboardExpand all lines: CHANGELOG.md
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -13,6 +13,7 @@ Docs: https://docs.openclaw.ai
13
13
14
14
### Fixes
15
15
16
+
- Voice Call/Twilio: register accepted media streams immediately but wait for realtime transcription readiness before speaking the initial greeting, so reconnect grace handling stays live while OpenAI STT startup is no longer starved by TTS. Fixes #75197. (#75257) Thanks @donkeykong91 and @PfanP.
16
17
- Agents/pi-embedded-runner: extract the `abortable` provider-call wrapper from `runEmbeddedAttempt` to module scope so its promise handlers no longer close over the run lexical context, releasing transcripts, tool buffers, and subscription callbacks when a provider call hangs past abort. (#74182) Thanks @cjboy007.
17
18
- Docker: restore `python3` in the gateway runtime image after the slim-runtime switch. Fixes #75041.
18
19
- CLI/Voice Call: scope `voicecall` command activation to the Voice Call plugin so setup and smoke checks no longer broad-load unrelated plugin runtimes or hang after printing JSON. Thanks @vincentkoc.
Copy file name to clipboardExpand all lines: docs/plugins/voice-call.md
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -297,6 +297,7 @@ Current runtime behavior:
297
297
-`streaming.provider` is optional. If unset, Voice Call uses the first registered realtime transcription provider.
298
298
- Bundled realtime transcription providers: Deepgram (`deepgram`), ElevenLabs (`elevenlabs`), Mistral (`mistral`), OpenAI (`openai`), and xAI (`xai`), registered by their provider plugins.
299
299
- Provider-owned raw config lives under `streaming.providers.<providerId>`.
300
+
- After Twilio sends an accepted stream `start` message, Voice Call registers the stream immediately, queues inbound media through the transcription provider while the provider connects, and starts the initial greeting only after realtime transcription is ready.
300
301
- If `streaming.provider` points at an unregistered provider, or none is registered, Voice Call logs a warning and skips media streaming instead of failing the whole plugin.
0 commit comments