feat(speech): native audio output via Vertex ADC route (tulgey #247) by matin · Pull Request #17 · matin/openclaw

matin · 2026-06-09T02:32:30Z

What

Makes native audio output the primary voice-note path on the keyless-Vertex deployment, implementing the membrane row of imperfect-co/tulgey#247 / ADR 0024.

Why

The voice-note output relay is broken in prod (tulgey#233, "Media failed") and the Google speech provider had no Vertex ADC path — it only knew the AI-Studio-key route and threw "Google API key missing" on a keyless deployment (tulgey#10).

The useful discovery: extensions/google/speech-provider.ts already emits native generateContent AUDIO (gemini-3.1-flash-tts-preview, responseModalities:['AUDIO'] + speechConfig, Charon in the voice list) and already transcodes PCM → opus-in-ogg for target: "voice-note". The only missing piece was auth. So this is a small, additive route — not a rewrite of the synthesis path.
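For reviewers unfamiliar with the request shape, here is a minimal sketch of a generateContent body that asks for audio output. It is not the PR's `buildGoogleSpeechGenerateContentBody`; the function name `buildSpeechBody` and the interface are hypothetical, and the field layout follows the public Gemini generateContent REST schema as I understand it.

```typescript
// Hypothetical sketch, not the code in extensions/google/speech-provider.ts.
// Field names follow the public generateContent REST schema (assumption).
interface SpeechRequestBody {
  contents: Array<{ role: "user"; parts: Array<{ text: string }> }>;
  generationConfig: {
    responseModalities: ["AUDIO"];
    speechConfig: {
      voiceConfig: { prebuiltVoiceConfig: { voiceName: string } };
    };
  };
}

// Builds one body that either route (API key or ADC bearer) could POST unchanged.
function buildSpeechBody(text: string, voiceName = "Charon"): SpeechRequestBody {
  return {
    contents: [{ role: "user", parts: [{ text }] }],
    generationConfig: {
      responseModalities: ["AUDIO"],
      speechConfig: {
        voiceConfig: { prebuiltVoiceConfig: { voiceName } },
      },
    },
  };
}
```

Because the body carries no credentials, only the URL and headers need to differ between the two routes.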

What changed

Vertex ADC route (synthesizeGoogleVertexTtsPcm): POSTs to aiplatform.googleapis.com/v1/projects/{project}/locations/{global}/publishers/google/models/{model}:generateContent using resolveGoogleVertexAuthorizedUserHeaders (the same ADC bearer the Google chat / Veo paths already use). Request body, PCM extraction, WAV-wrap, and the opus transcode are shared verbatim with the existing AI-Studio route via a new buildGoogleSpeechGenerateContentBody helper.
Route selection (resolveGoogleTtsPcm): the AI-Studio key route stays primary; it falls back to the Vertex ADC route when there is no key but ADC is present; it throws when neither is available, so the speech provider-order fallback (Cloud TTS → text) triggers on a detected failure rather than degrading silently (ADR 0024 §2).
isConfigured is now ADC-aware, so the provider is selected on a keyless deployment.
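The selection order described above can be sketched as a small pure function. This is an illustration only: `selectTtsRoute`, `TtsRoute`, and `RouteInputs` are hypothetical names, and the sketch leaves out the Vertex-marker key check that the real `resolveGoogleTtsPcm` reportedly performs.

```typescript
// Hypothetical sketch of the route order; not the PR's resolveGoogleTtsPcm.
type TtsRoute = "ai-studio-key" | "vertex-adc";

interface RouteInputs {
  apiKey?: string;       // AI-Studio key, if one is configured
  adcAvailable: boolean; // whether Application Default Credentials were detected
}

function selectTtsRoute({ apiKey, adcAvailable }: RouteInputs): TtsRoute {
  // 1. A usable key keeps the existing AI-Studio route primary.
  if (apiKey !== undefined && apiKey.trim() !== "") {
    return "ai-studio-key";
  }
  // 2. No key, but ADC is present: use the Vertex route.
  if (adcAvailable) {
    return "vertex-adc";
  }
  // 3. Neither: fail loudly so a provider-order fallback can take over.
  throw new Error("Google TTS: no API key and no Vertex ADC available");
}
```

Throwing in the last branch, instead of returning an empty result, is what lets a caller distinguish "this provider cannot run" from "this provider produced no audio".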

Verification

vitest run extensions/google/speech-provider.test.ts → 19/19 pass: the 17 existing tests (the AI-Studio path is byte-for-byte unchanged for a real key) plus 2 new tests asserting the Vertex generateContent URL shape for global and regional locations.
speech-provider.ts is type-clean. An isolated tsc -b extensions/google reports 11 errors, all from pre-existing sdk version skew in sibling files (vertex-adc.ts, transport-stream.ts, realtime-voice-provider.ts, etc.) and none in this diff's files; CI's full build-all builds the sdk from source and is authoritative.
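To make the "URL shape, global + regional" assertion concrete, here is a sketch of how such a URL could be built. `buildVertexTtsUrl` is a hypothetical stand-in for the PR's `buildGoogleVertexTtsUrl`, whose signature is not shown here. The regional host form (`{location}-aiplatform.googleapis.com`) is an assumption taken from Vertex AI's usual endpoint convention, not from this diff.

```typescript
// Hypothetical sketch; the real helper's signature may differ.
function buildVertexTtsUrl(project: string, location: string, model: string): string {
  // Assumption: "global" uses the bare host, regions use a location-prefixed host.
  const host =
    location === "global"
      ? "aiplatform.googleapis.com"
      : `${location}-aiplatform.googleapis.com`;
  const path = [
    "v1",
    "projects", encodeURIComponent(project),
    "locations", encodeURIComponent(location),
    "publishers", "google",
    "models", `${encodeURIComponent(model)}:generateContent`,
  ].join("/");
  return `https://${host}/${path}`;
}
```

A test along the lines of the two new cases would compare the returned string against the expected global and regional URLs.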

Deploy (follow-up, not in this PR)

Cut /opt/openclaw to this fork build, following the consolidation pattern from #5 ("Consolidate membrane VM hot-patches onto fork-main (Veo + companions)") / tulgey#218, which survives openclaw update.
Live-verify a WhatsApp voice turn produces native opus audio.
tulgey#234 (native ingestion not engaging) is the symmetric input-side fix needed for the full audio-in → audio-out loop.

Refs imperfect-co/tulgey#247, tulgey#10, tulgey#233.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Summary by CodeRabbit

Release Notes

New Features
- Added support for Google Vertex Application Default Credentials as an alternative authentication method for text-to-speech.
- Improved authentication handling with automatic fallback between API key and credential-based authentication methods.
Tests
- Expanded test coverage for Google Vertex TTS URL routing.

…aw#247)

The Google speech provider already emits native generateContent AUDIO (gemini-3.1-flash-tts-preview, responseModalities:['AUDIO'] + speechConfig) and already transcodes to opus-in-ogg for voice-note delivery. The only gap was auth: it knew the AI-Studio key route only and threw "Google API key missing" on a keyless Vertex deployment (tulgey #10). This adds the Vertex ADC route so native output is the primary path on the deployment.

- Add a Vertex ADC synthesis route (synthesizeGoogleVertexTtsPcm) that rides resolveGoogleVertexAuthorizedUserHeaders (the same ADC bearer the Google chat/Veo paths use), POSTing to aiplatform.googleapis.com/v1/projects/{P}/locations/{global}/publishers/google/models/{model}:generateContent. Body, PCM extraction, WAV-wrap, and opus transcode are shared verbatim with the AI-Studio route.
- Route selection (resolveGoogleTtsPcm): AI-Studio key route stays primary; fall to the Vertex ADC route when no key but ADC is present; throw with neither so the speech provider-order fallback (Cloud TTS -> text) trips on a detected failure, never a silent degrade (ADR 0024 clause 2).
- isConfigured is now ADC-aware so the provider is selected keyless.
- Extract buildGoogleSpeechGenerateContentBody (shared by both routes).
- Test: Vertex generateContent URL shape (global + regional).

Implements the membrane row of tulgey#247 / ADR 0024. Existing AI-Studio tests unaffected (real keys take the unchanged route).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
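The commit message mentions a shared WAV-wrap step. As a rough illustration of what wrapping raw PCM involves, the sketch below prepends a canonical 44-byte RIFF/WAVE header. `wrapPcmAsWav` is a hypothetical name, and the defaults (24 kHz, mono, 16-bit little-endian) are assumptions about the model's PCM output, not values confirmed by this PR.

```typescript
// Hypothetical sketch; sample format defaults are assumptions, not taken from the diff.
function wrapPcmAsWav(
  pcm: Uint8Array,
  sampleRate = 24000,
  channels = 1,
  bitsPerSample = 16,
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const writeAscii = (offset: number, s: string): void => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // RIFF chunk size: file size minus 8
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt chunk size for plain PCM
  view.setUint16(20, 1, true);              // audio format 1 = linear PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true);     // data chunk size in bytes
  out.set(pcm, 44);                         // samples follow the header unchanged
  return out;
}
```

The header adds no resampling or re-encoding; a later opus-in-ogg transcode would still be needed for voice-note delivery.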

coderabbitai · 2026-06-09T02:32:57Z


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 41ba5306-7fd6-4535-a83f-9a8e7443bcc4

📥 Commits

Reviewing files that changed from the base of the PR and between 4910740 and b5a447e.

📒 Files selected for processing (3)

extensions/google/speech-provider.ts
extensions/google/video-generation-provider.ts
scripts/check-no-raw-channel-fetch.mjs

📝 Walkthrough

The PR adds keyless Google Vertex ADC text-to-speech as a fallback alternative to the existing AI-Studio API key route. It introduces a shared request-body builder for both backends, implements Vertex-specific URL construction and ADC-authenticated synthesis with timeout and retry handling, adds a route selector function, and updates configuration detection and handler wiring to use the new routing logic.

Changes

Google Vertex ADC TTS Support

Layer / File(s)	Summary
Shared generateContent request builder `extensions/google/speech-provider.ts`	Introduces `buildGoogleSpeechGenerateContentBody` to centralize request payload construction for text composition and audio config, used by both AI-Studio and Vertex routes.
Vertex ADC synthesis with route selection `extensions/google/speech-provider.ts`	Adds Vertex ADC detection imports, builds the regional/global `generateContent` endpoint URL, implements ADC-authenticated POST synthesis with abort/timeout and retry classification, and introduces `resolveGoogleTtsPcm` to select between AI-Studio-key and Vertex-ADC routes.
Handler and configuration updates `extensions/google/speech-provider.ts`	Updates `isConfigured` to report configured when either a valid AI-Studio key (non-Vertex-marker) or detectable Vertex ADC is available. Refactors `synthesize` and `synthesizeTelephony` to call `resolveGoogleTtsPcm` instead of checking keys directly.
Testing exports and test suite `extensions/google/speech-provider.ts`, `extensions/google/speech-provider.test.ts`	Extends `__testing` export with `buildGoogleVertexTtsUrl` and `googleVertexTtsAdcAvailable` helpers. Adds test cases verifying URL construction for global and regional locations.

Sequence Diagram

sequenceDiagram
  participant App
  participant GoogleProvider
  participant RouteResolver
  participant AIStudioRoute
  participant VertexADCRoute
  participant GoogleAPIs
  
  App->>GoogleProvider: synthesize(text, config)
  GoogleProvider->>RouteResolver: resolveGoogleTtsPcm(text, config)
  
  alt API Key Present and Not Vertex Marker
    RouteResolver->>AIStudioRoute: use AI-Studio key route
    AIStudioRoute->>GoogleAPIs: POST generateContent with API key
    GoogleAPIs-->>AIStudioRoute: PCM audio
    AIStudioRoute-->>RouteResolver: PCM bytes
  else Vertex ADC Available
    RouteResolver->>VertexADCRoute: use Vertex ADC route
    VertexADCRoute->>GoogleAPIs: POST generateContent with ADC headers
    GoogleAPIs-->>VertexADCRoute: PCM audio
    VertexADCRoute-->>RouteResolver: PCM bytes
  else Neither Available
    RouteResolver-->>GoogleProvider: throw error for fallback
  end
  
  RouteResolver-->>GoogleProvider: PCM audio
  GoogleProvider-->>App: synthesized audio

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

A rabbit hops through Vertex gates,
No API keys to hesitate,
ADC credentials light the way,
Two routes now serve the TTS day! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 30.77% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: adding native audio output via Vertex ADC route for voice-note synthesis on keyless deployments, with a direct reference to the GitHub issue.
Description check	✅ Passed	The PR description covers most required sections including summary, context, changes, and verification, but lacks formal completion of the template sections like risk checklist and current review state.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


lint:extensions:bundled lints the whole extensions/google package, so these errors (introduced with the Veo REST fallback in #5, never linted since no later PR touched the package) block any PR that touches the extension. Surfaced by the native-audio-output change.

- resolveVertexOAuthToken: brace the metadata-token if, type res.json() as { access_token?: string } (drops the unnecessary `as any`), and omit the unused catch binding.
- brace the "Force rest fallback for Vertex" guard.

No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
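The three lint fixes named for resolveVertexOAuthToken (braced `if`, a narrow cast in place of `as any`, and a catch without a binding) look roughly like the sketch below. This is not the function from video-generation-provider.ts: `fetchMetadataToken` and `FetchLike` are hypothetical, the fetch is injected so the sketch has no network dependency, and the token path is the standard GCE metadata-server endpoint as I know it.

```typescript
// Hypothetical sketch of the lint-clean shape; not the repository's code.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Standard GCE metadata token endpoint (assumption; the commit only names the host).
const METADATA_TOKEN_URL =
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token";

async function fetchMetadataToken(fetchImpl: FetchLike): Promise<string | undefined> {
  try {
    const res = await fetchImpl(METADATA_TOKEN_URL, {
      headers: { "Metadata-Flavor": "Google" },
    });
    if (!res.ok) {
      return undefined;
    }
    // Narrow structural cast instead of `as any`.
    const data = (await res.json()) as { access_token?: string };
    if (data.access_token) {
      return data.access_token;
    }
  } catch {
    // No binding: the error value is unused, so lint has nothing to flag.
  }
  return undefined;
}
```

Passing the fetch implementation in also makes the success, non-OK, and network-failure branches easy to exercise in a unit test.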

The new Vertex ADC route used a raw fetch(), which trips the no-raw-channel-fetch boundary guard. Route it through postJsonRequest (the same guarded helper the AI-Studio route uses) so SSRF/dispatcher policy and timeout handling apply uniformly; drop the manual AbortController.

Also allowlist the pre-existing Veo metadata-server fetch (video-generation-provider.ts:44, http://metadata.google.internal, link-local, must be raw; the SSRF guard intentionally blocks it). It predates this work and was surfaced when the PR first touched the package.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
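For context on what a boundary guard with an allowlist might do, here is a minimal line-based sketch. It is not scripts/check-no-raw-channel-fetch.mjs, whose actual rules are not shown in this PR; `findRawFetchCalls`, `RawFetchHit`, and the `file:line` allowlist key format are all hypothetical.

```typescript
// Hypothetical sketch of a raw-fetch scanner; the real guard script may work differently.
interface RawFetchHit {
  file: string;
  line: number; // 1-based line number of the offending call
}

function findRawFetchCalls(
  file: string,
  source: string,
  allowlist: ReadonlySet<string>,
): RawFetchHit[] {
  // Matches a bare `fetch(` but not `client.fetch(` or `prefetch(`.
  const rawFetch = /(^|[^.\w$])fetch\s*\(/;
  const hits: RawFetchHit[] = [];
  source.split("\n").forEach((text, index) => {
    const line = index + 1;
    if (rawFetch.test(text) && !allowlist.has(`${file}:${line}`)) {
      hits.push({ file, line });
    }
  });
  return hits;
}
```

A text scan like this is deliberately crude (it would also flag `fetch(` inside comments or strings); an AST-based check would be more precise at the cost of more code.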

matin and others added 2 commits June 8, 2026 19:43

matin merged commit 48a00b0 into main Jun 9, 2026
128 of 138 checks passed

matin deleted the native-audio-output branch June 9, 2026 03:17

matin mentioned this pull request Jun 9, 2026

fix(speech): set speechConfig.languageCode so Spanish TTS isn't gringo-accented (tulgey #251) #18

Merged

matin merged 3 commits into main from native-audio-output
