
Consolidate membrane VM hot-patches onto fork-main (Veo + companions)#5

Merged: matin merged 1 commit into main from issue-218-vm-patch-consolidate on Jun 2, 2026

Conversation

matin (Owner) commented Jun 2, 2026

Transplants the still-needed /opt/openclaw local hot-patches onto fork-main so the membrane VM can cut over to the fork and ship native audio (#4). Veo (REST-bypass + 1080p), generation-provider-metadata, register.send — all upstream-untouched in 99d96c1→94db48d0, transplanted verbatim. session-lock openclaw#195 dropped (upstream no-op'd it). dispatch.ts (ADR 0015 inbound-sequencing) + dotenv deferred as fast-follow. Refs imperfect-co/tulgey#218.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added --as-voice flag to CLI send command for sending audio as voice notes.
  • Updates

    • Updated Google video generation models to the latest veo-3.1 series.
    • Enhanced video generation with support for additional input formats and improved polling reliability.
    • Clarified --force-document flag documentation for Telegram-specific media handling.

Consolidates the still-needed local /opt/openclaw hot-patches onto fork-main
(= upstream 94db48d + native audio #4) so the membrane VM can cut over to the
fork and ship native audio ingestion. These three files were untouched by
upstream in the 99d96c1→94db48d0 window, so they transplant verbatim:

- extensions/google/video-generation-provider.ts — the openclaw#172 Vertex REST-bearer
  bypass (load-bearing per tulgey#194; SDK auth path is the openclaw#175 bug) + the #3
  default-1080p resolution.
- extensions/google/generation-provider-metadata.ts — Veo companion.
- src/cli/program/message/register.send.ts — companion.

Dropped: the session-lock patch (openclaw#195) — upstream made waitForSessionEventQueue
a no-op by 94db48d, so it is obsolete.

Deferred (fast-follow, refs tulgey#218): src/auto-reply/dispatch.ts (the ADR
0015 inbound-message-sequencing coalescing rewrite) and src/infra/dotenv.ts —
both conflict structurally with fork-main and need a careful port + review.

Refs imperfect-co/tulgey#218.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Jun 2, 2026

Caution: Review failed. The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 120403aa-f416-43ef-b15d-6800671a2f84

📥 Commits

Reviewing files that changed from the base of the PR and between f837a17 and ee9d10e.

📒 Files selected for processing (3)
  • extensions/google/generation-provider-metadata.ts
  • extensions/google/video-generation-provider.ts
  • src/cli/program/message/register.send.ts

Walkthrough

This PR upgrades Google video generation models from preview to release versions, adds Vertex AI OAuth authentication with model aliasing, extends REST-based generation with configurable polling and richer request payloads (including image/video bytes), and introduces a CLI flag for voice note formatting in message sends.
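The model aliasing mentioned in the walkthrough can be pictured with a small sketch. The PR page does not show the real contents of GOOGLE_VIDEO_MODEL_ALIASES or which prefixes are stripped, so the alias entries and the `google/` and `models/` prefixes below are assumptions for illustration only; the default model name is the one the review reports.

```typescript
// Hypothetical alias entries: the actual GOOGLE_VIDEO_MODEL_ALIASES map is not shown in this PR.
const GOOGLE_VIDEO_MODEL_ALIASES: Record<string, string> = {
  "veo-3.1-generate-preview": "veo-3.1-generate-001",
  "veo-3.1-fast-generate-preview": "veo-3.1-fast-generate-001",
};

// Hypothetical prefixes: the review only says "known prefixes".
const KNOWN_MODEL_PREFIXES = ["google/", "models/"];

// Default reported by the review for generation-provider-metadata.ts.
const DEFAULT_GOOGLE_VIDEO_MODEL = "veo-3.1-generate-001";

function resolveGoogleVideoModel(model?: string): string {
  let name = (model ?? "").trim();
  if (name === "") {
    return DEFAULT_GOOGLE_VIDEO_MODEL;
  }
  // Strip prefixes repeatedly so stacked forms such as "google/models/x" also normalize.
  let stripped = true;
  while (stripped) {
    stripped = false;
    for (const prefix of KNOWN_MODEL_PREFIXES) {
      if (name.startsWith(prefix)) {
        name = name.slice(prefix.length);
        stripped = true;
      }
    }
  }
  // Fall back to the stripped name when no alias is registered.
  return GOOGLE_VIDEO_MODEL_ALIASES[name] ?? name;
}
```

Stripping before the alias lookup means a caller can pass either a bare or a prefixed identifier and still land on the same release model.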

Changes

Google Video Generation Provider Enhancements

| Layer | File(s) | Summary |
| --- | --- | --- |
| Model Metadata Update | extensions/google/generation-provider-metadata.ts | DEFAULT_GOOGLE_VIDEO_MODEL updated from veo-3.1-fast-generate-preview to veo-3.1-generate-001; supported models list replaced preview identifiers with veo-3.1-fast-generate-001 and veo-3.1-lite-generate-001 while retaining veo-3.0-* and veo-2.0-* entries. |
| Authentication and Model Resolution | extensions/google/video-generation-provider.ts | Introduces resolveVertexOAuthToken() for Vertex service-account OAuth fetching, adds GOOGLE_VIDEO_MODEL_ALIASES map for model string normalization, and updates resolveGoogleVideoModel() to strip known prefixes and apply alias resolution for incoming model identifiers. |
| REST Request and Response Handling | extensions/google/video-generation-provider.ts | Makes apiKey optional in download URL resolution, adds response.videos array parsing path, extends REST request payload to include base64-encoded image and video bytes with MIME types, and implements configurable pollStyle with :fetchPredictOperation POST endpoint and project-scoped URL rewriting. |
| Generation Flow Refactoring | extensions/google/video-generation-provider.ts | Refactors generateVideo() to compute Vertex context and branch initialization between Vertex and SDK configurations, updates error handling with Vertex-specific gates, passes image/video/pollStyle to REST fallback calls, uses dedicated SDK client reference for polling, and enhances inline video handling to accept bytesBase64Encoded, uri, and gcsUri with stricter final validation. |
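The :fetchPredictOperation polling path described above can be sketched roughly as follows. This is not the PR's code: the `PostJson` and `Sleep` types, the function names, and the interval and attempt defaults are hypothetical, and the HTTP call and delay are injected so the loop stays self-contained. The request body shape (`operationName`) and the `done`/`error`/`response` fields follow the usual Vertex long-running-operation conventions.

```typescript
// Shape of a long-running operation as returned by a poll call.
interface Operation {
  done?: boolean;
  error?: { message?: string };
  response?: unknown;
}

// Hypothetical injected helpers so the sketch does not depend on a real HTTP client or timer.
type PostJson = (url: string, body: unknown) => Promise<Operation>;
type Sleep = (ms: number) => Promise<void>;

// modelUrl is the full publisher-model URL, e.g. ".../publishers/google/models/veo-3.1-generate-001".
function buildFetchPredictOperationRequest(modelUrl: string, operationName: string) {
  return { url: `${modelUrl}:fetchPredictOperation`, body: { operationName } };
}

async function pollOperation(
  post: PostJson,
  sleep: Sleep,
  modelUrl: string,
  operationName: string,
  intervalMs = 10000, // assumed default, not taken from the PR
  maxAttempts = 60, // assumed default, not taken from the PR
): Promise<unknown> {
  const { url, body } = buildFetchPredictOperationRequest(modelUrl, operationName);
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const op = await post(url, body);
    if (op.error) {
      throw new Error(op.error.message ?? "video generation failed");
    }
    if (op.done) {
      return op.response;
    }
    await sleep(intervalMs);
  }
  throw new Error(`operation ${operationName} did not finish after ${maxAttempts} polls`);
}
```

Bounding the loop with `maxAttempts` turns a stuck operation into a detectable error rather than an indefinite wait.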

CLI Voice Note Support

| Layer | File(s) | Summary |
| --- | --- | --- |
| Voice Note Flag Registration | src/cli/program/message/register.send.ts | Adds --as-voice boolean flag (default false) to send command for audio formatting, and updates --force-document help text to clarify Telegram-only compression avoidance. |

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

A rabbit hops with joy
Models upgraded, voices arise
REST and Vertex unite
Each video finds its path
Onwards to generation's height! 🐰✨

matin merged commit 8989095 into main on Jun 2, 2026
19 of 30 checks passed
matin added a commit that referenced this pull request Jun 3, 2026
Consolidates the still-needed local /opt/openclaw hot-patches onto fork-main
(= upstream 94db48d + native audio #4) so the membrane VM can cut over to the
fork and ship native audio ingestion. These three files were untouched by
upstream in the 99d96c1→94db48d0 window, so they transplant verbatim:

- extensions/google/video-generation-provider.ts — the openclaw#172 Vertex REST-bearer
  bypass (load-bearing per tulgey#194; SDK auth path is the openclaw#175 bug) + the #3
  default-1080p resolution.
- extensions/google/generation-provider-metadata.ts — Veo companion.
- src/cli/program/message/register.send.ts — companion.

Dropped: the session-lock patch (openclaw#195) — upstream made waitForSessionEventQueue
a no-op by 94db48d, so it is obsolete.

Deferred (fast-follow, refs tulgey#218): src/auto-reply/dispatch.ts (the ADR
0015 inbound-message-sequencing coalescing rewrite) and src/infra/dotenv.ts —
both conflict structurally with fork-main and need a careful port + review.

Refs imperfect-co/tulgey#218.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
matin added a commit that referenced this pull request Jun 9, 2026
lint:extensions:bundled lints the whole extensions/google package, so
these errors (introduced with the Veo REST fallback in #5, never linted
since no later PR touched the package) block any PR that touches the
extension. Surfaced by the native-audio-output change.

- resolveVertexOAuthToken: brace the metadata-token if, type res.json()
  as { access_token?: string } (drops the unnecessary `as any`), and
  omit the unused catch binding.
- brace the "Force rest fallback for Vertex" guard.

No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
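For illustration, the three lint fixes listed above (braced `if`, a typed `res.json()` result in place of `as any`, and an omitted catch binding) look roughly like this in a metadata-token fetch. This is a sketch under assumptions, not the provider's actual resolveVertexOAuthToken: the function name, the `FetchLike` type, and the injected fetch are hypothetical, while the URL and `Metadata-Flavor` header are the standard GCE metadata-server conventions.

```typescript
// Narrow response type used instead of `as any`.
type TokenResponse = { access_token?: string };

// Hypothetical minimal fetch signature, injected so the sketch needs no global fetch.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

const METADATA_TOKEN_URL =
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token";

async function resolveMetadataToken(fetchImpl: FetchLike): Promise<string | undefined> {
  try {
    const res = await fetchImpl(METADATA_TOKEN_URL, {
      headers: { "Metadata-Flavor": "Google" },
    });
    // Braced guard, as the lint rule requires.
    if (!res.ok) {
      return undefined;
    }
    const body = (await res.json()) as TokenResponse;
    return body.access_token;
  } catch {
    // Optional catch binding: the error value is unused, so it is omitted.
    return undefined;
  }
}
```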
matin added a commit that referenced this pull request Jun 9, 2026
…aw#247) (#17)

* feat(speech): native audio output via Vertex ADC route (tulgey openclaw#247)

The Google speech provider already emits native generateContent AUDIO
(gemini-3.1-flash-tts-preview, responseModalities:['AUDIO'] + speechConfig)
and already transcodes to opus-in-ogg for voice-note delivery. The only
gap was auth: it knew the AI-Studio key route only and threw "Google API
key missing" on a keyless Vertex deployment (tulgey #10). This adds the
Vertex ADC route so native output is the primary path on the deployment.

- Add a Vertex ADC synthesis route (synthesizeGoogleVertexTtsPcm) that
  rides resolveGoogleVertexAuthorizedUserHeaders (the same ADC bearer the
  Google chat/Veo paths use), POSTing to
  aiplatform.googleapis.com/v1/projects/{P}/locations/{global}/publishers/
  google/models/{model}:generateContent. Body, PCM extraction, WAV-wrap,
  and opus transcode are shared verbatim with the AI-Studio route.
- Route selection (resolveGoogleTtsPcm): AI-Studio key route stays primary;
  fall to the Vertex ADC route when no key but ADC is present; throw with
  neither so the speech provider-order fallback (Cloud TTS -> text) trips
  on a detected failure, never a silent degrade (ADR 0024 clause 2).
- isConfigured is now ADC-aware so the provider is selected keyless.
- Extract buildGoogleSpeechGenerateContentBody (shared by both routes).
- Test: Vertex generateContent URL shape (global + regional).
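The route-selection rule in the list above (key route first, then Vertex ADC, otherwise throw) can be sketched as follows. The `TtsAuthState` shape, the route labels, and the function names are hypothetical stand-ins; the real resolveGoogleTtsPcm performs the synthesis itself, while this sketch only shows the selection order.

```typescript
type TtsRoute = "ai-studio-key" | "vertex-adc";

// Hypothetical summary of the credentials available to the speech provider.
interface TtsAuthState {
  apiKey?: string;
  hasAdc: boolean;
}

function hasUsableKey(auth: TtsAuthState): boolean {
  return typeof auth.apiKey === "string" && auth.apiKey.trim() !== "";
}

// ADC-aware configuration check: a keyless deployment with ADC still counts as configured.
function isConfigured(auth: TtsAuthState): boolean {
  return hasUsableKey(auth) || auth.hasAdc;
}

function selectGoogleTtsRoute(auth: TtsAuthState): TtsRoute {
  // The AI-Studio key route stays primary when a key exists.
  if (hasUsableKey(auth)) {
    return "ai-studio-key";
  }
  // Keyless deployments fall through to the Vertex ADC route.
  if (auth.hasAdc) {
    return "vertex-adc";
  }
  // With neither credential, throw so the provider-order fallback sees a detected failure.
  throw new Error("Google TTS: neither an API key nor ADC credentials are available");
}
```

Throwing in the last branch, instead of returning a default, is what lets a caller fall back on a detected failure instead of degrading silently.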

Implements the membrane row of tulgey#247 / ADR 0024. Existing AI-Studio
tests unaffected (real keys take the unchanged route).
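As a rough sketch of the URL shape that test covers: Vertex uses the bare aiplatform.googleapis.com host for the global location and a location-prefixed host for regional ones. The function name and parameters below are hypothetical; only the path layout comes from the commit message.

```typescript
// Hypothetical helper: builds the Vertex generateContent URL for a publisher model.
function buildVertexGenerateContentUrl(project: string, location: string, model: string): string {
  // Global uses the bare host; regional locations prefix the host with the region.
  const host =
    location === "global" ? "aiplatform.googleapis.com" : `${location}-aiplatform.googleapis.com`;
  return (
    `https://${host}/v1/projects/${encodeURIComponent(project)}` +
    `/locations/${encodeURIComponent(location)}` +
    `/publishers/google/models/${encodeURIComponent(model)}:generateContent`
  );
}
```

For example, `buildVertexGenerateContentUrl("my-project", "global", "gemini-3.1-flash-tts-preview")` yields a URL on aiplatform.googleapis.com, while passing `"us-central1"` moves it to us-central1-aiplatform.googleapis.com.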

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(google): clear pre-existing oxlint errors in the Veo provider

lint:extensions:bundled lints the whole extensions/google package, so
these errors (introduced with the Veo REST fallback in #5, never linted
since no later PR touched the package) block any PR that touches the
extension. Surfaced by the native-audio-output change.

- resolveVertexOAuthToken: brace the metadata-token if, type res.json()
  as { access_token?: string } (drops the unnecessary `as any`), and
  omit the unused catch binding.
- brace the "Force rest fallback for Vertex" guard.

No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(speech): route Vertex TTS through guarded postJsonRequest

The new Vertex ADC route used a raw fetch(), which trips the
no-raw-channel-fetch boundary guard. Route it through postJsonRequest
(the same guarded helper the AI-Studio route uses) so SSRF/dispatcher
policy and timeout handling apply uniformly; drop the manual
AbortController.

Also allowlist the pre-existing Veo metadata-server fetch
(video-generation-provider.ts:44, http://metadata.google.internal —
link-local, must be raw; the SSRF guard intentionally blocks it). It
predates this work and was surfaced when the PR first touched the
package.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>