feat(google): add Google Vertex AI provider with ADC auth and global endpoint routing #1966

Open
BingqingLyu wants to merge 771 commits into main from fork-pr-60860-feat-google-vertex-provider

BingqingLyu (Owner) commented Apr 28, 2026

Summary

  • Registers a new google-vertex provider in the Google plugin that routes to aiplatform.googleapis.com using Application Default Credentials (ADC), separate from the existing google-gemini-cli OAuth path
  • Adds vertex-region.ts for region/project/baseUrl resolution (env vars + ADC file fallback), with global location producing the unprefixed aiplatform.googleapis.com endpoint
  • Adds vertex-provider-catalog.ts with the Gemini 3.x model catalog for Vertex AI
  • Updates the Google transport stream to detect Vertex endpoints and construct the correct project/location URL path (/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:streamGenerateContent)
  • Exchanges the ADC refresh token for a Bearer token at request time, handling the pi-ai "<authenticated>" sentinel, which was previously sent as a literal API key and caused 401s
  • Captures thought_signature from Gemini 3 thinking-mode tool call responses so multi-turn conversations replay correctly (fixes 400 INVALID_ARGUMENT on second turn)
  • Adds GCP_VERTEX_GOOGLE_CREDENTIALS_MARKER sentinel and registers google-vertex in openclaw.plugin.json
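
The endpoint routing described in the bullets above can be sketched as follows. This is a minimal illustration, not the code in vertex-region.ts: the helper names `resolveVertexBaseUrl` and `buildStreamUrl` are hypothetical, and the regional host form `{location}-aiplatform.googleapis.com` is assumed from the public Vertex AI endpoint convention. Only the unprefixed global host and the path template come from this summary.

```typescript
// Hypothetical helpers; only the URL shape is taken from the PR summary.
interface VertexTarget {
  project: string;
  location: string;
}

// "global" maps to the unprefixed host; any other location is assumed
// to use the regional "{location}-aiplatform.googleapis.com" host.
function resolveVertexBaseUrl(location: string): string {
  return location === "global"
    ? "https://aiplatform.googleapis.com"
    : `https://${location}-aiplatform.googleapis.com`;
}

// Builds /v1/projects/{project}/locations/{location}/publishers/google/models/{model}:streamGenerateContent
function buildStreamUrl(target: VertexTarget, model: string): string {
  const path =
    `/v1/projects/${encodeURIComponent(target.project)}` +
    `/locations/${encodeURIComponent(target.location)}` +
    `/publishers/google/models/${encodeURIComponent(model)}:streamGenerateContent`;
  return resolveVertexBaseUrl(target.location) + path;
}
```

With `GOOGLE_CLOUD_LOCATION=global`, a call such as `buildStreamUrl({ project, location: "global" }, "gemini-3-flash-preview")` would yield a URL on the unprefixed host.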

Closes openclaw#49039
Closes openclaw#56253
Closes openclaw#58775
Closes openclaw#60736

Test plan

  • pnpm test extensions/google/vertex-region.test.ts — 20 cases, all pass
  • pnpm test extensions/google/vertex-provider-catalog.test.ts — 7 cases, all pass
  • pnpm test src/infra/gemini-auth.test.ts — 5 cases, all pass
  • Live: google-vertex/gemini-3-flash-preview with GOOGLE_CLOUD_LOCATION=global and GOOGLE_CLOUD_PROJECT set — correct Vertex URL, Bearer auth, successful response
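
For reference, the Bearer auth observed in the live check relies on a refresh-token exchange that might look roughly like the sketch below. It assumes an ADC file of type `authorized_user` with `client_id`, `client_secret`, and `refresh_token` fields, and Google's standard OAuth token endpoint. The function names are hypothetical and the PR's actual implementation may differ; the sketch only builds the request and does not send it.

```typescript
// Assumed shape of an "authorized_user" ADC file (illustrative subset).
interface AuthorizedUserAdc {
  client_id: string;
  client_secret: string;
  refresh_token: string;
}

// Sentinel string named in the PR summary; it must never be sent as an API key.
const AUTHENTICATED_SENTINEL = "<authenticated>";

function isSentinelKey(apiKey: string): boolean {
  return apiKey === AUTHENTICATED_SENTINEL;
}

// Builds the form-encoded refresh request for Google's OAuth token endpoint.
function buildRefreshRequest(adc: AuthorizedUserAdc): { url: string; body: string } {
  const form = new URLSearchParams({
    grant_type: "refresh_token",
    client_id: adc.client_id,
    client_secret: adc.client_secret,
    refresh_token: adc.refresh_token,
  });
  return { url: "https://oauth2.googleapis.com/token", body: form.toString() };
}

// The access token from the exchange is sent as a Bearer credential.
function bearerHeader(accessToken: string): Record<string, string> {
  return { Authorization: `Bearer ${accessToken}` };
}
```

The request body would be POSTed with `Content-Type: application/x-www-form-urlencoded`, and the `access_token` field of the JSON response passed to `bearerHeader`.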

steipete and others added 30 commits April 27, 2026 21:00

Add an end anchor to the type/subtype match and explicitly accept the
RFC 9110 ;parameter tail. Inputs like "image/png<script>" or
"application/json garbage" now return undefined instead of silently
matching the leading prefix.

Closes openclaw#9795
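
A minimal sketch of the anchored match described in this commit, assuming the RFC 9110 token character set for type and subtype. The names `MEDIA_TYPE` and `parseMediaType` are hypothetical, and the real patch may structure the pattern differently.

```typescript
// RFC 9110 "token" characters, used for both type and subtype.
const TOKEN = "[!#$%&'*+.^_`|~0-9A-Za-z-]+";

// type "/" subtype, optional whitespace, then either end of input or a
// ";parameter" tail. The trailing "$" is the end anchor.
const MEDIA_TYPE = new RegExp(`^(${TOKEN})/(${TOKEN})[ \\t]*(?:;.*)?$`);

function parseMediaType(input: string): string | undefined {
  const match = MEDIA_TYPE.exec(input.trim());
  return match ? `${match[1]}/${match[2]}`.toLowerCase() : undefined;
}
```

Under this sketch, `"text/html; charset=utf-8"` parses to `text/html`, while `"image/png<script>"` and `"application/json garbage"` are rejected because the text after the subtype is neither empty nor a `;` tail.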

When the post-update completion cache refresh times out (slow disk,
large bundled plugin tree, Docker overlayfs), the user previously saw
the opaque 'Completion cache update failed: Error: spawnSync
/usr/bin/node ETIMEDOUT'. Detect ETIMEDOUT specifically, surface
'timed out after 30s', and append a manual refresh hint pointing at
'openclaw completion --write-state' so users know it's non-fatal and
how to recover.

Fixes openclaw#72842
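
The detection described above could be sketched like this. The function name and exact message wording are assumptions for illustration; only the `ETIMEDOUT` code, the "timed out after 30s" phrasing, and the `openclaw completion --write-state` hint come from the commit message.

```typescript
// Hypothetical formatter: turns a spawn failure into a user-facing message.
function describeCompletionFailure(error: unknown, timeoutMs: number): string {
  // Node attaches a string `code` (e.g. "ETIMEDOUT") to system errors.
  const code =
    typeof error === "object" && error !== null && "code" in error
      ? (error as { code?: unknown }).code
      : undefined;

  if (code === "ETIMEDOUT") {
    const seconds = Math.round(timeoutMs / 1000);
    return (
      `Completion cache update timed out after ${seconds}s. ` +
      "This is non-fatal; run `openclaw completion --write-state` to refresh manually."
    );
  }
  // Any other failure keeps the original generic message.
  return `Completion cache update failed: ${String(error)}`;
}
```

In practice the `error` argument would come from the `error` property of the `spawnSync` result.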

Commit 2cd2395 ("build: use slim docker runtime") switched the
runtime image from `node:24-bookworm` (full) to `node:24-bookworm-slim`.
The slim base does not ship `ca-certificates`, and the runtime stage's
`apt-get install` line was not updated to add it.

Effect on the resulting image:
- `/etc/ssl/certs/` is empty (`ls /etc/ssl/certs/ | wc -l` == 0)
- `dpkg -l ca-certificates` reports `un` (not installed)
- `update-ca-certificates` is missing from `$PATH` (exit 127)
- every HTTPS outbound from the gateway dies at TLS handshake with
  `error setting certificate file: /etc/ssl/certs/ca-certificates.crt`
- channel plugins that use `node fetch` (telegram/discord/slack)
  crash-loop with `Network request for 'deleteWebhook' failed!`
  and pin the gateway main thread at ~100% CPU on retry.

Verified by rebuilding the runtime image with this patch and
confirming inside the container:
- `ls /etc/ssl/certs/ | wc -l` -> 285
- `curl -4 https://api.telegram.org/` -> 302
- `curl -4 https://www.google.com/`   -> 200
- channel plugins (telegram/discord/slack) register cleanly,
  gateway main-thread CPU returns to idle.

Add `ca-certificates` to the apt-install list and call
`update-ca-certificates` to populate the CA bundle.

Signed-off-by: ryuhaneul <luj.moonlight@gmail.com>

…label limit

When the system hostname exceeds 63 bytes (common with Kubernetes pod
names), the @homebridge/ciao DNS label encoder throws an AssertionError
that crashes the gateway on startup.

Add truncateToDnsLabel() that safely truncates UTF-8 strings at byte
boundaries, applied to both the service instance name and hostname
before passing them to ciao.

Closes openclaw#37705

AI-assisted (built with Hermes orchestration).
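
One way to truncate at UTF-8 byte boundaries, as this commit describes, is to accumulate whole code points until the byte budget is exhausted. The sketch below reuses the name `truncateToDnsLabel` from the commit message, but the body is an assumption and not the merged implementation.

```typescript
// Truncates `value` so its UTF-8 encoding fits in `maxBytes` (63 for a DNS
// label) without ever splitting a multi-byte character.
function truncateToDnsLabel(value: string, maxBytes = 63): string {
  const encoder = new TextEncoder();
  let out = "";
  let used = 0;
  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of value) {
    const size = encoder.encode(ch).length;
    if (used + size > maxBytes) break;
    out += ch;
    used += size;
  }
  return out;
}
```

A long Kubernetes pod name would be cut to at most 63 bytes before being handed to ciao, and shorter names pass through unchanged.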

Closes openclaw#72837. The 15s narrative-subagent timeout was empirically too
tight for warm-gateway runs across light, REM, and deep phases —
gpt-5.4-mini latency through OpenAI alone routinely brushes 12s+, so the
first sweep after a restart deterministically times out across all three
phases. 60s gives realistic LLM-call headroom while still capping the
worst case at one minute, preserving the original comment's "don't leave
parent cron running for minutes" constraint.

Test: updates the matching toMatchObject assertion in
dreaming-narrative.test.ts from 15_000 to 60_000.

When plugins register hooks via api.registerHook(), pluginConfig from
openclaw.json was not available in the hook event context. Plugins that
accessed ctx.pluginConfig or event.context.pluginConfig received
undefined, causing silent failures or fallback to defaults.

Changes:
- Add pluginConfig parameter to registerHook() function
- Wrap handler to inject pluginConfig into event.context before invocation
- Pass params.pluginConfig through createApi() call site

Fixes openclaw#72880

Address review feedback on PR openclaw#72888. triggerInternalHook passes the
same event reference to all handlers sequentially. Mutating evt.context
leaks pluginConfig to subsequent handlers and causes cross-plugin
overwrites. Shallow-copy event and context instead.
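
The shallow-copy approach from the two commits above might look like the following sketch. The `HookEvent` shape and the `withPluginConfig` name are hypothetical simplifications; the point is that each handler receives its own copy of the event and context rather than a mutated shared object.

```typescript
// Simplified, assumed event shape for illustration.
interface HookEvent {
  type: string;
  context: Record<string, unknown>;
}
type HookHandler = (event: HookEvent) => void | Promise<void>;

// Wraps a handler so it sees pluginConfig on event.context, without
// mutating the event object shared with other handlers.
function withPluginConfig(handler: HookHandler, pluginConfig: unknown): HookHandler {
  return (event) =>
    handler({ ...event, context: { ...event.context, pluginConfig } });
}
```

Because both the event and its context are copied, a later handler registered by a different plugin does not observe the earlier plugin's config.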

The ACP dispatch path calls applyMediaUnderstanding without the agentDir
parameter. This prevents the media understanding pipeline from locating
agent-specific models.json and auth profiles, causing image understanding
to fail silently for non-visual models configured with a separate image
understanding model.

The non-ACP reply path (get-reply.ts) already passes agentDir correctly.
This aligns the ACP path with the same behavior.

Closes openclaw#55046

AI-assisted (built with Hermes orchestration).

…ups (openclaw#67687)

Feishu config defaults groupPolicy to 'allowlist'. Inbound group handling read groupAllowFrom and called isFeishuGroupAllowed before resolveFeishuReplyPolicy was reached, so a config that only set channels.feishu.groups.<chat_id>.requireMention=false (with no groupAllowFrom) was rejected with 'group not in groupAllowFrom' before per-group requireMention could take effect. Treat the explicit presence of a group entry under channels.feishu.groups as the operator's allowlist signal: if groupConfig is defined, skip the empty-allowlist rejection. resolveFeishuReplyPolicy still owns mention gating, and existing groupConfig.enabled=false / groupAllowFrom-driven rejections are preserved. Adds a regression test that exercises the reporter's exact config shape and confirms inbound text reaches finalize/dispatch.
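
The admission rule described in this commit can be approximated as below, for the `allowlist` policy only. This is a sketch under stated assumptions: the `admitsGroup` name is hypothetical, `groupAllowFrom` is assumed to hold chat IDs, and the real code splits this logic across isFeishuGroupAllowed and resolveFeishuReplyPolicy.

```typescript
// Assumed per-group config shape (subset of channels.feishu.groups.<chat_id>).
interface FeishuGroupConfig {
  enabled?: boolean;
  requireMention?: boolean;
}

// Decides whether an inbound group message passes the allowlist gate.
// Mention gating (requireMention) is deliberately left to a later step.
function admitsGroup(
  chatId: string,
  groupAllowFrom: string[],
  groupConfig?: FeishuGroupConfig,
): boolean {
  // An explicitly disabled group is always rejected.
  if (groupConfig?.enabled === false) return false;
  // Empty allowlist: an explicit group entry counts as the allowlist signal.
  if (groupAllowFrom.length === 0) return groupConfig !== undefined;
  // Non-empty allowlist: existing membership check is preserved.
  return groupAllowFrom.includes(chatId);
}
```

With the reporter's config shape (a group entry that sets only `requireMention: false` and no `groupAllowFrom`), this gate admits the message instead of rejecting it.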

steipete and others added 30 commits April 28, 2026 02:52