fix(google): preserve Vertex ADC catalog auth #90609
Codex review: needs maintainer review before merge. Reviewed June 5, 2026, 7:47 AM ET / 11:47 UTC.
Summary
PR surface: Source +61, Tests +146. Total +207 across 8 files.
Reproducibility: yes. Source inspection shows current main drops non-env ADC auth evidence before the writable-provider gate, and the linked issue reports the resulting google-vertex model_not_found runtime failure on released builds.
Review metrics: 1 noteworthy metric.
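The ordering problem described in the reproducibility note can be illustrated with a minimal sketch. Everything here is hypothetical (`ProviderConfig`, `fillAuth`, `filterWritable`, and the shape of the evidence map are assumptions for illustration, not OpenClaw's actual types or functions); it only shows why filtering on auth before filling auth from ADC evidence drops the provider.

```typescript
// Hypothetical shapes; not OpenClaw's real types.
interface ProviderConfig {
  apiKey?: string;
  models: string[];
}
type Providers = Record<string, ProviderConfig>;
// Provider id -> non-secret auth marker derived from local evidence.
type AuthEvidence = Record<string, string | undefined>;

// Stand-in for a writable-provider gate: keep only providers that carry an auth value.
function filterWritable(providers: Providers): Providers {
  const out: Providers = {};
  for (const id of Object.keys(providers)) {
    const provider = providers[id];
    if (typeof provider.apiKey === "string" && provider.apiKey.length > 0) {
      out[id] = provider;
    }
  }
  return out;
}

// Fill a missing apiKey from evidence, leaving explicit values untouched.
function fillAuth(providers: Providers, evidence: AuthEvidence): Providers {
  const out: Providers = {};
  for (const id of Object.keys(providers)) {
    const provider = providers[id];
    out[id] = provider.apiKey ? provider : { ...provider, apiKey: evidence[id] };
  }
  return out;
}

const staticCatalog: Providers = {
  "google-vertex": { models: ["gemini-2.5-pro", "gemini-2.5-flash"] },
};
const adcEvidence: AuthEvidence = { "google-vertex": "gcp-vertex-credentials" };

// Gate first: the provider has no apiKey yet, so it is dropped.
const gatedFirst = filterWritable(staticCatalog);
// Fill first, then gate: the provider survives with the non-secret marker.
const filledFirst = filterWritable(fillAuth(staticCatalog, adcEvidence));
```

In this sketch `gatedFirst` is empty while `filledFirst` keeps `google-vertex`, which mirrors the difference between the reported behavior and the intended one.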
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security
Review details
Best possible solution: Land the guarded marker-based fix after required CI and maintainer auth-provider review; run a live Vertex ADC smoke only if credentials are readily available.
Do we have a high-confidence way to reproduce the issue? Yes. Source inspection shows current main drops non-env ADC auth evidence before the writable-provider gate, and the linked issue reports the resulting google-vertex model_not_found runtime failure on released builds.
Is this the best way to solve the issue? Yes. The PR repairs the existing auth-evidence-to-generated-config path and reuses the provider-owned non-secret marker, which is narrower than adding config, relaxing the writable gate, or requiring manual auth-profile setup.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 12a569109b60.
Label changes:
Label justifications:
Evidence reviewed
PR surface: Source +61, Tests +146. Total +207 across 8 files.
What I checked:
Likely related people:
@clawsweeper re-review
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
@clawsweeper re-review
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
Merging, accepting risk of [P1] Auth/config generation is upgrade-sensitive: incorrect marker handling could either keep dropping google-vertex catalog rows or write an inappropriate marker for ADC users:
* fix: preserve Google Vertex ADC catalog auth
* fix: register Google Vertex ADC config marker
* fix: fill Vertex ADC static catalog auth
Summary
What problem does this PR solve?
- Fixes `models.json` and plugin catalog generation dropping static `google-vertex` catalog providers when auth is available through ADC evidence instead of an API-key env var or auth profile.
- Reuses the existing `gcp-vertex-credentials` marker instead of writing credential material or adding new config.

Why does this matter now?
- `google-vertex/*` models appear in model listing/startup but fail at runtime with `model_not_found` for ADC-backed users.

What is the intended outcome?
- Generated catalogs keep `google-vertex/*` models while the Google transport continues resolving bearer auth from ADC at request time.

What is intentionally out of scope?
What does success look like?
- With `GOOGLE_APPLICATION_CREDENTIALS`, either `GOOGLE_CLOUD_PROJECT` or `GCLOUD_PROJECT`, and `GOOGLE_CLOUD_LOCATION` present, generated catalog data keeps the `google-vertex` provider with `apiKey: "gcp-vertex-credentials"` and its static model rows.

What should reviewers focus on?
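For reviewers, the success condition stated above can be read as a small predicate over the process environment. The sketch below is an illustration under assumptions: `hasVertexAdcEvidence` and `resolveVertexCatalogApiKey` are hypothetical names, and it only checks that the variables are set, whereas the PR describes ADC file evidence as well, so the real check is presumably stricter.

```typescript
// Hypothetical helper names; this is not the provider's actual implementation.
type Env = Record<string, string | undefined>;

// Non-secret marker named in the PR description.
const GCP_VERTEX_CREDENTIALS_MARKER = "gcp-vertex-credentials";

// True when the variable is set to a non-blank value.
function isSet(env: Env, name: string): boolean {
  const value = env[name];
  return typeof value === "string" && value.trim().length > 0;
}

// ADC evidence as described: a credentials path, a project (either variable), and a location.
function hasVertexAdcEvidence(env: Env): boolean {
  const hasProject = isSet(env, "GOOGLE_CLOUD_PROJECT") || isSet(env, "GCLOUD_PROJECT");
  return (
    isSet(env, "GOOGLE_APPLICATION_CREDENTIALS") &&
    hasProject &&
    isSet(env, "GOOGLE_CLOUD_LOCATION")
  );
}

// Returns the marker to store as apiKey, or undefined when the evidence is incomplete.
function resolveVertexCatalogApiKey(env: Env): string | undefined {
  return hasVertexAdcEvidence(env) ? GCP_VERTEX_CREDENTIALS_MARKER : undefined;
}

// Example with placeholder values: the alternate project variable is accepted.
const exampleApiKey = resolveVertexCatalogApiKey({
  GOOGLE_APPLICATION_CREDENTIALS: "/tmp/application_default_credentials.json",
  GCLOUD_PROJECT: "demo-project",
  GOOGLE_CLOUD_LOCATION: "us-central1",
});
```

The returned value is the marker string only; no credential content is read or stored in this sketch.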
Linked context
Which issue does this close?
Closes #90506
Which issues, PRs, or discussions are related?
Related #65715, #56253
Was this requested by a maintainer or owner?
Real behavior proof (required for external PRs)
- Behavior addressed: Google Vertex ADC-backed static catalog providers were filtered from generated model config, making `google-vertex/*` fail as `model_not_found` at runtime despite valid ADC evidence.
- Real environment tested: Local OpenClaw source checkout on macOS with a temporary ADC `application_default_credentials.json` file, process env `GOOGLE_APPLICATION_CREDENTIALS`, `GOOGLE_CLOUD_PROJECT`, and `GOOGLE_CLOUD_LOCATION`, using the real `ensureOpenClawModelsJson` startup catalog-generation entry point.
- Exact steps or command run after this patch: `node --import tsx` source-checkout probe that created a temporary ADC credentials file, set Google Vertex ADC process env, called `ensureOpenClawModelsJson({ models: { providers: {} } }, agentDir, { workspaceDir, providerDiscoveryProviderIds: ["google-vertex"], providerDiscoveryEntriesOnly: true, providerDiscoveryTimeoutMs: 60000 })`, and read the generated `plugins/google/catalog.json` sidecar.
- Evidence after fix: Console output from the generated catalog probe:
  { "wrote": true, "pluginFiles": [ "catalog.json" ], "rootProviderIds": [], "generatedBy": "openclaw-plugin-model-catalog-v1", "googleCatalogProviderIds": [ "google-vertex" ], "googleVertexApiKey": "gcp-vertex-credentials", "googleVertexModelCount": 6, "sampleGoogleVertexModels": [ "gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.5-flash-lite" ] }
- Observed result after fix: The same startup entries-only generation path now writes the Google plugin catalog sidecar, keeps the `google-vertex` provider row, preserves the existing non-secret `gcp-vertex-credentials` marker, and keeps static model rows without persisting credential material.
- What was not tested: A live Vertex AI network request with real Google Cloud ADC credentials. No real Google Cloud credentials were available, and the proof intentionally uses only local ADC evidence and generated config output.
- Before evidence: The new models-config regression failed before the implementation because the `google-vertex` provider was filtered out of generated models config. A local CLI attempt with temp ADC env reproduced the reported `model_not_found` behavior before the plugin-owned config marker hook and implicit static-catalog auth fill were added.

Tests and validation
Which commands did you run?
- `node scripts/run-vitest.mjs src/agents/models-config.applies-config-env-vars.test.ts` before implementation, expected failure observed in the new regression.
- `node scripts/run-vitest.mjs src/agents/models-config.applies-config-env-vars.test.ts src/agents/embedded-agent-runner/model.test.ts extensions/google/transport-stream.test.ts`
- `node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.test.src.json --incremental --tsBuildInfoFile /tmp/openclaw-google-vertex-adc-test-src.tsbuildinfo`
- `node scripts/run-vitest.mjs extensions/google/index.test.ts`
- `node scripts/run-vitest.mjs src/agents/models-config.applies-config-env-vars.test.ts src/agents/embedded-agent-runner/model.test.ts extensions/google/transport-stream.test.ts extensions/google/index.test.ts`
- `node --import tsx` source-checkout Google provider ADC marker probe
- `node --import tsx` source-checkout `ensureOpenClawModelsJson` generated catalog probe
- `node scripts/run-vitest.mjs src/agents/models-config.providers.implicit.discovery-scope.test.ts`
- `node scripts/run-vitest.mjs extensions/google/index.test.ts`
- `git diff --check`
- `.agents/skills/autoreview/scripts/autoreview --mode local`

What regression coverage was added or updated?
- `google-vertex` provider retains static model rows with `gcp-vertex-credentials`.
- `GOOGLE_CLOUD_PROJECT` and `GCLOUD_PROJECT` project env paths.
- `google-vertex` rows are filled from ADC auth evidence before writable filtering.

What failed before this fix, if known?
- The new models-config regression failed because the `google-vertex` provider was filtered out of generated models config.
- A local CLI attempt reproduced `model_not_found` before the plugin-owned config marker hook was added.

If no test was added, why not?
Risk checklist
Did user-visible behavior change? (`Yes/No`)
Yes. ADC-backed Google Vertex users should stop seeing runtime `model_not_found` caused by generated catalog omission.

Did config, environment, or migration behavior change? (`Yes/No`)
Yes, narrowly. Existing ADC env/file evidence can now populate the existing non-secret auth marker in generated provider config; no new config or env surface was added.

Did security, auth, secrets, network, or tool execution behavior change? (`Yes/No`)
Yes, narrowly in auth config generation. The code only persists known non-secret markers and still avoids writing plaintext env values.
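
The persistence rule in the answer above (known non-secret markers may be written, plaintext env values may not) can be sketched as a small guard. This is an illustration only: `KNOWN_NON_SECRET_MARKERS`, `isKnownNonSecretMarker`, and `persistableAuth` are hypothetical stand-ins, not the signatures of the project's own helpers, and `EXAMPLE_API_KEY` is a made-up variable name.

```typescript
// Assumed allowlist for this sketch; only this one marker appears in the PR description.
const KNOWN_NON_SECRET_MARKERS: string[] = ["gcp-vertex-credentials"];

function isKnownNonSecretMarker(value: string): boolean {
  return KNOWN_NON_SECRET_MARKERS.indexOf(value) !== -1;
}

// Decide what may be written to generated config for a resolved auth value:
// - a known non-secret marker is persisted verbatim;
// - a value that came from an env var is represented by the variable NAME, never its value;
// - with no resolved value, nothing is persisted.
function persistableAuth(
  resolvedValue: string | undefined,
  envVarName: string | undefined,
): string | undefined {
  if (resolvedValue === undefined) {
    return undefined;
  }
  if (isKnownNonSecretMarker(resolvedValue)) {
    return resolvedValue;
  }
  return envVarName;
}

// ADC case: the marker itself is safe to write.
const adcPersisted = persistableAuth("gcp-vertex-credentials", undefined);
// API-key case: only the (hypothetical) variable name is written.
const envPersisted = persistableAuth("plaintext-secret-value", "EXAMPLE_API_KEY");
```

The design point is that the guard never returns `resolvedValue` unless it is on the allowlist, so a plaintext key cannot reach the generated file through this path.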
What is the highest-risk area?
How is that risk mitigated?
- Persists `resolveEnvApiKey` results only when `isNonSecretApiKeyMarker(..., { includeEnvVarName: false })` recognizes the value, while env API keys continue to be represented by env var names via the existing `resolveEnvApiKeyVarName` path.
- Uses the `gcp-vertex-credentials` marker when ADC file, project, and location evidence are all present.

Current review state
What is the next action?
What is still waiting on author, maintainer, CI, or external proof?
Which bot or reviewer comments were addressed?