Improve Gemini realtime voice parity for Twilio Meet joins #77064
steipete merged 6 commits into openclaw:main from
Codex review: needs changes before merge.
Reproducibility: no. No high-confidence live reproduction was established. Source inspection does show that current main sends Twilio realtime media/clear/mark messages directly and lacks the Gemini resumption/compression defaults that the PR addresses.
Review details
Best possible solution: Add a focused changelog entry, keep the runtime changes within the voice-call and Google provider plugins, then land after maintainer review and preferably a live Twilio Google Meet smoke test.
Do we have a high-confidence way to reproduce the issue? No high-confidence live reproduction was established. Source inspection does show that current main sends Twilio realtime media/clear/mark messages directly and lacks the Gemini resumption/compression defaults that the PR addresses.
Is this the best way to solve the issue? Yes, the proposed direction is the narrow, maintainable split: Twilio transport pacing stays in the voice-call plugin, and Gemini Live controls stay in the Google provider. It is not merge-ready until the changelog entry is added and maintainer/live validation accepts the behavior.
Full review comments:
Overall correctness: patch is correct.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 2949171fcc15.
0b95bd3 to d339439
Landed via rebase onto
Thanks @scoootscooob!
What bug / behavior this fixes
Google Meet joins over Twilio were noticeably laggier than the Paradigm-style Gemini Live path. The main gaps were outbound model audio being dumped into Twilio faster than telephony playback, barge-in waiting on provider interruption instead of clearing local queued audio immediately, and Gemini Live sessions not using the newer resumption/compression controls by default.
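The first two gaps can be illustrated with a minimal sketch. This is not the PR's implementation: `AudioPacer`, `enqueue`, `tick`, `clear`, and `queuedMs` are hypothetical names chosen for illustration. It assumes the standard Twilio Media Streams format (8 kHz mu-law, so 20 ms of audio is 160 bytes) and the outbound `media` and `clear` messages. Model audio is queued as 20 ms frames and released one frame per tick, and barge-in drops the local queue immediately rather than waiting for the provider to report an interruption.

```typescript
// Twilio Media Streams audio is 8 kHz mu-law (1 byte per sample),
// so 20 ms of audio is 160 bytes.
const FRAME_BYTES = 160;
const FRAME_MS = 20;

type TwilioOutbound =
  | { event: "media"; streamSid: string; media: { payload: string } }
  | { event: "clear"; streamSid: string };

function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Hypothetical pacer for illustration; names and shape are assumptions.
class AudioPacer {
  private streamSid: string;
  private send: (msg: TwilioOutbound) => void;
  private frames: Uint8Array[] = [];
  private remainder: Uint8Array = new Uint8Array(0);

  constructor(streamSid: string, send: (msg: TwilioOutbound) => void) {
    this.streamSid = streamSid;
    this.send = send;
  }

  // Buffer model audio as whole 20 ms frames instead of forwarding it at once.
  enqueue(chunk: Uint8Array): void {
    const data = new Uint8Array(this.remainder.length + chunk.length);
    data.set(this.remainder, 0);
    data.set(chunk, this.remainder.length);
    let offset = 0;
    while (data.length - offset >= FRAME_BYTES) {
      this.frames.push(data.slice(offset, offset + FRAME_BYTES));
      offset += FRAME_BYTES;
    }
    this.remainder = data.slice(offset);
  }

  // Send at most one frame; meant to be called once every FRAME_MS.
  tick(): boolean {
    const frame = this.frames.shift();
    if (!frame) return false;
    this.send({
      event: "media",
      streamSid: this.streamSid,
      media: { payload: toBase64(frame) },
    });
    return true;
  }

  // Barge-in: drop locally queued audio right away and ask Twilio to
  // flush whatever it has already buffered for playback.
  clear(): void {
    this.frames = [];
    this.remainder = new Uint8Array(0);
    this.send({ event: "clear", streamSid: this.streamSid });
  }

  // Milliseconds of whole frames still waiting to be sent.
  get queuedMs(): number {
    return this.frames.length * FRAME_MS;
  }

  // Drive tick() at roughly real-time rate; returns a stop function.
  // setInterval can drift, so a production pacer would likely correct
  // against a monotonic clock.
  start(): () => void {
    const timer = setInterval(() => this.tick(), FRAME_MS);
    return () => clearInterval(timer);
  }
}
```

In this sketch the caller would pass a `send` function that serializes each message onto the Twilio WebSocket, call `enqueue` for every audio chunk from the model, and call `clear` as soon as caller speech is detected. Keeping `tick` separate from the timer makes the pacing logic testable without real time passing.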
What changed
voicecall.start conversation mode for GMeet dial-in flows.
Evidence
pnpm test extensions/google/realtime-voice-provider.test.ts
pnpm test extensions/voice-call/src/webhook/realtime-audio-pacer.test.ts extensions/voice-call/src/webhook/realtime-handler.test.ts extensions/voice-call/index.test.ts
pnpm check:changed reached extension typecheck and extension test typecheck successfully, then failed in lint:extensions on an unrelated existing lint finding in extensions/qa-lab/src/mantis/slack-desktop-smoke.runtime.test.ts:105; that file is not in this branch diff versus origin/main.
Notes