fix(google-vertex): add Google Vertex AI onboarding wizard, fix ADC auth, add provider docs#87800
koverholt wants to merge 33 commits into
Codex review: needs maintainer review before merge. Reviewed June 1, 2026, 3:07 PM ET / 19:07 UTC.
Summary
PR surface: Source +276, Tests +8, Docs +270. Total +554 across 17 files.
Reproducibility: yes, from source and supplied proof. Current main still has stricter google-vertex ADC gating and fallback routing gaps, while the PR shows a real GCE VM onboarding and response path after the change; I did not rerun a live GCE VM in this read-only review.
Review metrics: 2 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security
Review details
Best possible solution: Land only after maintainers accept the auth-gate, default-config, and routing tradeoffs as the google-vertex provider contract, or narrow the patch to an approved provider-auth seam before merge.
Do we have a high-confidence way to reproduce the issue? Yes, from source and supplied proof: current main still has stricter google-vertex ADC gating and fallback routing gaps, while the PR shows a real GCE VM onboarding and response path after the change; I did not rerun a live GCE VM in this read-only review.
Is this the best way to solve the issue? Yes, with maintainer acceptance: the Google-plugin onboarding/docs and generic marker evidence seam fit the existing ownership model, but the relaxed auth gate and provider-specific core fallback are compatibility-sensitive choices rather than automatic cleanup.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 65a805ac2878.
Label changes
Label justifications:
Evidence reviewed
PR surface: Source +276, Tests +8, Docs +270. Total +554 across 17 files.
Acceptance criteria:
What I checked:
Likely related people:
Addressed the ClawSweeper review findings:
All CI checks pass. @clawsweeper re-review
🦞👀 Command router queued. I will update this comment with the next step.
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
Addressed the remaining finding:
All CI checks pass. All review findings addressed. @clawsweeper re-review
Narrow hasLocalFileAuthEvidence parameter type to the "local-file-with-env" variant of the ProviderAuthEvidence union so TypeScript can see fileEnvVar and fallbackPaths properties. Cast the providers assignment in applyPrimaryModel to OpenClawConfig["models"] to satisfy the Record<string, ModelProviderConfig> constraint.
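The narrowing described in this commit can be sketched roughly as follows. Only the names ProviderAuthEvidence, hasLocalFileAuthEvidence, fileEnvVar, fallbackPaths, and the two variant tags come from this PR; the field shapes, the injected `env` and `fileExists` parameters, and the lookup order are assumptions made to keep the example self-contained.

```typescript
// Hypothetical shape of the union; the real ProviderAuthEvidence type may differ.
type ProviderAuthEvidence =
  | { type: "local-file-with-env"; fileEnvVar: string; fallbackPaths: string[] }
  | { type: "env-vars-with-marker"; envVars: string[]; marker: string };

// Extract selects the single variant whose discriminant matches.
type LocalFileEvidence = Extract<ProviderAuthEvidence, { type: "local-file-with-env" }>;

// Accepting only the narrowed variant lets the compiler see fileEnvVar and
// fallbackPaths without a runtime type guard inside the function body.
// `env` and `fileExists` are injected here purely for illustration.
function hasLocalFileAuthEvidence(
  evidence: LocalFileEvidence,
  env: Record<string, string | undefined>,
  fileExists: (path: string) => boolean,
): boolean {
  const explicitPath = env[evidence.fileEnvVar];
  if (explicitPath) {
    // Assumption: an explicit path that does not exist is rejected outright.
    return fileExists(explicitPath);
  }
  return evidence.fallbackPaths.some((path) => fileExists(path));
}
```

Callers holding the full union would first check `evidence.type === "local-file-with-env"` before calling the function, which is what makes the narrowed parameter type usable.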
…t default
Add a static model catalog for the google-vertex provider, following the same pattern as the google (AI Studio) provider's runtime catalog.
The catalog includes:
- Auto-updating latest aliases: gemini-flash-latest (default), gemini-pro-latest, gemini-flash-lite-latest
- Shared Gemini text models from the existing google catalog: gemini-3.1-pro-preview, gemini-3-flash-preview
The latest aliases always point to the latest stable model in each family and are recommended for most users. gemini-flash-latest is tagged as the default model and is set as the wizard default for new google-vertex setups.
Update the docs page to document both latest aliases (recommended) and specific version model IDs.
Change the google-vertex setup.providers authMethods from "api-key" to "adc" so the auto-generated wizard choice matches our ADC auth method. Previously the manifest declared authMethods: ["api-key"] but the google-vertex provider had no auth method with id "api-key" (and the original provider had auth: [] entirely). The wizard generated a choice ID "google-vertex-api-key" that matched nothing, so selecting "Google Vertex" in onboarding skipped the auth flow entirely. Now the wizard generates "google-vertex-adc" which matches our ADC auth method, triggering project auto-detection and config writing.
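The mismatch this commit fixes can be illustrated with a small sketch. It assumes the wizard derives a choice ID as "<provider>-<authMethod>", which is consistent with the two IDs quoted above; the helper names `buildChoiceId` and `findAuthMethodForChoice` are hypothetical and not taken from the codebase.

```typescript
// Hypothetical helper: derive the wizard choice ID from provider and auth method.
function buildChoiceId(providerId: string, authMethod: string): string {
  return `${providerId}-${authMethod}`;
}

interface AuthMethod {
  id: string;
}

// Returns the provider auth method matching a wizard choice ID, or undefined
// when nothing matches (the case in which the auth flow is silently skipped).
function findAuthMethodForChoice(
  providerId: string,
  methods: AuthMethod[],
  choiceId: string,
): AuthMethod | undefined {
  return methods.find((method) => buildChoiceId(providerId, method.id) === choiceId);
}

// The google-vertex provider only registers an "adc" method in this sketch.
const vertexMethods: AuthMethod[] = [{ id: "adc" }];
const beforeFix = findAuthMethodForChoice("google-vertex", vertexMethods, "google-vertex-api-key");
const afterFix = findAuthMethodForChoice("google-vertex", vertexMethods, "google-vertex-adc");
```

Under these assumptions `beforeFix` is undefined, so no auth flow runs, while `afterFix` resolves to the ADC method.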
Move the google-vertex static model catalog from the TypeScript runtime catalog (provider-catalog.ts / provider-discovery.ts) to the manifest JSON (openclaw.plugin.json) with discovery: "static". The model picker's "Browse all models" reads from manifest JSON via planManifestModelCatalogRows, not from the staticCatalog.run callback. The runtime catalog approach only served the "openclaw models list" CLI command, leaving the model picker empty. Moving to manifest JSON serves both paths. Models: gemini-flash-latest (default), gemini-pro-latest, gemini-flash-lite-latest, gemini-3.1-pro-preview, gemini-3-flash-preview. Remove buildGoogleVertexStaticCatalogProvider() and revert provider-discovery.ts to only return google (AI Studio) models. No duplication, single source of truth in the manifest.
Add models.providers["google-vertex"].models to the auth flow's configPatch so the default model (gemini-flash-latest) is registered in the provider config when the config is written to disk. The applyPrimaryModel path correctly sets models.providers in memory but the value is lost during the wizard's config serialization. The configPatch path persists reliably (proven by env values persisting through the same mechanism).
Return a gcp-vertex-credentials marker profile from the google-vertex auth flow instead of empty profiles. This makes the auth detection system recognize google-vertex as having valid auth, which fixes:
1. The "No auth configured" warning during onboarding model check
2. The model picker auth filter dropping all google-vertex catalog models (only the configured model was visible)
Add nonSecretAuthMarkers: ["gcp-vertex-credentials"] to the google plugin manifest so the auth system treats the marker as a non-secret auth token. This matches the anthropic-vertex plugin which uses the same marker for the same purpose (ADC-based auth with no real API key).
The transport layer recognizes the marker via isGoogleVertexCredentialsMarker() and resolves a real Bearer token via google-auth-library at request time.
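A rough sketch of the marker handoff described above, under stated assumptions: the stored credential is the literal marker string, and the real token lookup is replaced by an injected `getAccessToken` callback so the example does not depend on google-auth-library. The function `resolveRequestCredential` and the `AccessTokenSource` type are hypothetical; only the marker value, isGoogleVertexCredentialsMarker, and the error text appear in this PR.

```typescript
const GCP_VERTEX_CREDENTIALS_MARKER = "gcp-vertex-credentials";

// True when the stored "credential" is only a placeholder, not a real secret.
function isGoogleVertexCredentialsMarker(value: string | undefined): boolean {
  return value === GCP_VERTEX_CREDENTIALS_MARKER;
}

// Hypothetical stand-in for the ADC token lookup done at request time.
type AccessTokenSource = () => Promise<string | null | undefined>;

// Swap the marker for a real access token just before the request is sent.
// Non-marker values are passed through unchanged.
async function resolveRequestCredential(
  stored: string,
  getAccessToken: AccessTokenSource,
): Promise<string> {
  if (!isGoogleVertexCredentialsMarker(stored)) {
    return stored;
  }
  const token = await getAccessToken();
  if (!token) {
    throw new Error("Google Vertex ADC fallback did not return an access token");
  }
  return token;
}
```

The point of the design is that the marker only has to satisfy auth detection; it is never a usable secret, so a missing ADC setup still fails, just later and with a more specific error.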
Add an explicit providerAuthChoices entry for google-vertex to override
the auto-generated label. The auto-generation derives the label from the
provider id ("google-vertex" → "Google Vertex") which drops the "AI"
suffix. The explicit entry sets groupLabel to "Google Vertex AI" to
match the full product name.
…ss-references
Update the google-vertex docs page:
- Add models.providers section to the config example to match what onboarding actually writes
- Add Verify step to the service account tab
- Clarify that GOOGLE_CLOUD_PROJECT is auto-detected during onboarding on GCE/GKE
- Note that gemini-flash-latest is the default model set by onboarding
- Clarify that the OpenAI routing issue only affects custom configs
Add cross-references between google.md and google-vertex.md:
- Add a Note on the google.md page pointing users to google-vertex for GCP project billing and credits
- Add a Related card on google.md linking to google-vertex.md
…g entry
The transport type defaults to google-vertex automatically (commit 7ba19c8). No user going through the normal onboarding flow would encounter this issue.
…ormatting
- Change GOOGLE_APPLICATION_CREDENTIALS default from "Auto-detected" to "None" since the env var is not auto-detected (the system falls back to the default gcloud ADC file path, but that is a file lookup, not env var detection)
- Add guidance to the "No API key found" troubleshooting entry noting this should not appear after onboarding and suggesting re-running onboarding if it does
- Remove stray blank line before closing AccordionGroup tag
Add providers/google-vertex to the Mintlify docs.json navigation config so the page appears in the sidebar between Google (Gemini) and Gradium.
- Remove AI Studio comparison from intro paragraph and credits note. The page should describe what google-vertex is, not compare it to another provider.
- Rename "Local development" tab to "gcloud CLI" to match the credential type naming pattern used by other tabs. No other provider docs page uses "Local development" as a tab title.
- Fix Related card title from "Google (AI Studio)" to "Google (Gemini)" to match the actual page title of google.md.
- Update frontmatter summary to use "gcloud CLI" instead of "local dev"
- Fix "Best for" on gcloud CLI tab to remove "developing" assumption
- Remove "Model names are the same as on Google AI Studio" claim
- Update troubleshooting to say "With gcloud CLI" instead of "On local dev"
Replace "For local development" with "With gcloud CLI" in the wizard provider notes to match the docs tab naming convention.
The configPatch models.providers entry omits baseUrl (which is required by ModelProviderConfig) because google-vertex constructs URLs dynamically from project/location. Cast the configPatch as Partial<OpenClawConfig> so TypeScript accepts the partial provider config.
The stricter tsgo type checker rejects a direct cast from the configPatch object literal to Partial<OpenClawConfig> because the models.providers entry omits baseUrl. Use the standard unknown intermediate cast pattern.
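The cast pattern from these two commits looks roughly like this. The trimmed-down `ModelProviderConfig` and `OpenClawConfig` interfaces are assumptions for illustration and are much smaller than the real types; the model `name` value is likewise a placeholder.

```typescript
// Hypothetical, trimmed-down versions of the real config types.
interface ModelEntry {
  id: string;
  name: string;
}

interface ModelProviderConfig {
  baseUrl: string; // required in the real type, per the commit message
  models: ModelEntry[];
}

interface OpenClawConfig {
  models: { providers: Record<string, ModelProviderConfig> };
  env: Record<string, string>;
}

// google-vertex builds its URLs from project/location, so the patch has no
// baseUrl to supply. Going through `unknown` tells the type checker the
// partial shape is intentional.
const configPatch = {
  models: {
    providers: {
      "google-vertex": {
        models: [{ id: "gemini-flash-latest", name: "gemini-flash-latest" }],
      },
    },
  },
} as unknown as Partial<OpenClawConfig>;
```

The trade-off is that the double cast disables checking for the whole literal, so any consumer of the patch has to tolerate a missing baseUrl at runtime.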
- Expand auth bullet to list credential sources (metadata server, gcloud CLI, service account key) instead of just saying "ADC"
- Remove API key comparison from read_when frontmatter since this page does not cover API key / Express Mode auth
- Change "auth method" to "setup" since the three tabs are different credential sources for the same auth method, not different methods
The project's lint rules require curly braces around if bodies.
Run pnpm format:docs to fix table column alignment and add required blank lines before closing Mintlify component tags.
The test expected getEnvApiKey("google-vertex") to return undefined when
GOOGLE_CLOUD_PROJECT is set but no credentials file exists. With the
relaxed auth gate, GOOGLE_CLOUD_PROJECT alone is sufficient to return
"<authenticated>" since the downstream transport handles actual
credential resolution at request time via google-auth-library.
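The behavior change behind this test update can be sketched as two predicates. The boolean names follow the PR description (hasCredentials, hasProject, hasLocation, hasCredentialsEnv); mapping hasCredentialsEnv to GOOGLE_APPLICATION_CREDENTIALS, passing `env` and `credentialsFileExists` as parameters, and the name `getEnvApiKeySketch` are assumptions, since the real getEnvApiKey takes only the provider name.

```typescript
type Env = Record<string, string | undefined>;

// Previous gate: all three conditions had to hold.
function strictVertexGate(env: Env, credentialsFileExists: boolean): boolean {
  const hasCredentials = credentialsFileExists;
  const hasProject = Boolean(env["GOOGLE_CLOUD_PROJECT"]);
  const hasLocation = Boolean(env["GOOGLE_CLOUD_LOCATION"]);
  return hasCredentials && hasProject && hasLocation;
}

// Relaxed gate: any single signal is enough to let the request through.
function relaxedVertexGate(env: Env, credentialsFileExists: boolean): boolean {
  const hasCredentials = credentialsFileExists;
  const hasProject = Boolean(env["GOOGLE_CLOUD_PROJECT"]);
  // Assumption: the "credentials env var" is GOOGLE_APPLICATION_CREDENTIALS.
  const hasCredentialsEnv = Boolean(env["GOOGLE_APPLICATION_CREDENTIALS"]);
  return hasProject || hasCredentials || hasCredentialsEnv;
}

// Hypothetical stand-in for getEnvApiKey("google-vertex").
function getEnvApiKeySketch(env: Env, credentialsFileExists: boolean): string | undefined {
  return relaxedVertexGate(env, credentialsFileExists) ? "<authenticated>" : undefined;
}
```

With only GOOGLE_CLOUD_PROJECT set and no credentials file, the strict gate rejects and the relaxed gate returns "<authenticated>", which is the case the updated test asserts.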
The missing-model registration hint now includes both id and name fields (commit 0152bcc). Update the test assertion to expect the new format.
Update the plugin manifest docs to describe the new env-vars-with-marker auth evidence type alongside the existing local-file-with-env type. Explains when to use each: local-file-with-env for credential files on disk, env-vars-with-marker for ambient credential sources like GCE metadata server or workload identity.
…istration
Fix two issues found by ClawSweeper re-review:
1. The configPatch in the google-vertex auth flow now reads existing models.providers["google-vertex"].models and merges the default model instead of replacing the array. Re-running onboarding no longer drops user-registered Vertex models.
2. applyPrimaryModel now only auto-registers models in models.providers for providers that already have a config entry. This prevents creating invalid provider config (missing baseUrl) for unknown custom providers when a user sets a manual slash-prefixed model.
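The additive merge from the first fix can be sketched as below. The helper name `mergeProviderModels`, the `ModelEntry` shape, and deduplication by `id` are assumptions; the commit message only states that the default model is merged into the existing array instead of replacing it.

```typescript
interface ModelEntry {
  id: string;
  name: string;
}

// Append default models that are not already registered, keeping every
// existing entry (and its order) untouched.
function mergeProviderModels(
  existing: ModelEntry[] | undefined,
  defaults: ModelEntry[],
): ModelEntry[] {
  const merged = [...(existing ?? [])];
  const seenIds = new Set(merged.map((model) => model.id));
  for (const model of defaults) {
    if (!seenIds.has(model.id)) {
      merged.push(model);
      seenIds.add(model.id);
    }
  }
  return merged;
}
```

Because already-present IDs are skipped, running the merge a second time with the same defaults returns an equivalent list, which is the property that makes onboarding reruns safe.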
Move the google-vertex API transport default from the core model resolver (resolveProviderNameDefaultApi) into the Google plugin's normalizeTransport hook. When the provider is google-vertex and no api is explicitly set, the plugin now returns api: google-vertex. This keeps provider-specific routing in the owning plugin instead of the generic core resolver, matching the architectural pattern for other provider plugins.
… path
Commit e4c0dd2 moved the google-vertex transport default from resolveProviderNameDefaultApi in the core model resolver into the Google plugin normalizeTransport hook. The hook only fires when api is null/undefined, but the two fallback paths in model.ts hardcode "openai-responses" as the default before the hook runs. This caused google-vertex requests to route through the OpenAI transport, sending the "gcp-vertex-credentials" marker as a literal API key to platform.openai.com.
Restore resolveProviderNameDefaultApi in the core fallback chains so it runs before the "openai-responses" default. Keep the plugin hook as a redundant safety net for the discovered-model path.
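The ordering problem and its fix can be sketched as a fallback chain. The body of `resolveProviderNameDefaultApi` here is a guess that handles only google-vertex, and `resolveApi` is a hypothetical stand-in for the fallback paths in model.ts.

```typescript
// Assumed minimal version: the real resolver may know about more providers.
function resolveProviderNameDefaultApi(provider: string): string | undefined {
  return provider === "google-vertex" ? "google-vertex" : undefined;
}

// Fallback order after the fix: explicit api, then the provider-name default,
// and only then the generic "openai-responses" default.
function resolveApi(provider: string, explicitApi?: string): string {
  return explicitApi ?? resolveProviderNameDefaultApi(provider) ?? "openai-responses";
}

// The regression: skipping the provider-name step sends google-vertex traffic
// to the OpenAI transport whenever no api is configured.
function resolveApiBeforeFix(_provider: string, explicitApi?: string): string {
  return explicitApi ?? "openai-responses";
}
```

In this sketch `resolveApi("google-vertex")` yields "google-vertex", whereas `resolveApiBeforeFix("google-vertex")` yields "openai-responses", matching the misrouting described in the commit.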
Add inline comments at the three locations flagged by review:
- env-api-keys.ts: explain the intentionally permissive auth gate and where invalid setups fail (request time, not gate time)
- provider-contract-api.ts: document that onboarding writes are additive and reruns preserve existing models
- model-auth.profiles.test.ts: explain why the missing-explicit-path rejection test still passes with env-vars-with-marker evidence
Rebased onto latest
Three files had conflicts with recent upstream changes (PR #88512 and direct maintainer commits):
All CI checks pass (136/136). Re-tested the full onboarding flow with OpenClaw running on a VM after the rebase: project auto-detection, ADC auth, model catalog, and Gemini response all working. @clawsweeper re-review
🦞🧹 I asked ClawSweeper to review this item again.
@clawsweeper re-review
🦞👀 Command router queued. I will update this comment with the next step. Re-review progress:
Summary
What problem does this PR solve?
The google-vertex provider does not work end-to-end for developers using Application Default Credentials (ADC). The auth gate requires a credentials file on disk, rejecting metadata server ADC (GCE/GKE/Cloud Run) before the transport layer can run. The onboarding wizard has no auth flow for google-vertex, doesn't register models in the provider config, doesn't prompt for GCP project/location, has no model catalog, and has no dedicated documentation.
Why does this matter now?
Multiple open issues document this problem (#56253, #79595, #85864, #79837, #11413). We reproduced the issue on a fresh install: a new OpenClaw install requires numerous undocumented manual steps and manual edits to JSON config to get a working google-vertex setup. Users on GCE/GKE with valid GCP credentials consistently hit "No API key found" because the auth gate only accepts file-based ADC.
What is the intended outcome?
A developer picks "Google Vertex AI" in the onboarding wizard, confirms their auto-detected GCP project, and sends a message. Three steps without manual config editing.
What is intentionally out of scope?
google (AI Studio) or google-gemini-cli providers
What does success look like?
openclaw onboard --auth-choice google-vertex-adc on a GCE VM (or any machine with ADC configured) produces a working setup with zero manual config editing.
What should reviewers focus on?
- "env-vars-with-marker" auth evidence type (src/secrets/provider-env-vars.ts, src/plugins/manifest.ts, src/agents/model-auth-env.ts)
- src/llm/env-api-keys.ts (accepts project OR credentials file OR credentials env var, no longer requires all three)
- configPatch model registration approach in extensions/google/provider-contract-api.ts (workaround for applyPrimaryModel not persisting models.providers through the wizard config serialization)
- gcp-vertex-credentials marker profile stored during onboarding for auth detection
All changes were reviewed and tested by the author on a fresh VM.
Linked context
Which issue does this close?
Related #56253, #85864, #79595, #79837, #11413
Which issues, PRs, or discussions are related?
Related #49191, #50053, #52476, #48910, #77643, #55572, #53566, #9729
See also #83971 (merged, fixed transport-layer ADC but not the upstream auth gate)
Was this requested by a maintainer or owner?
No. Reproduced independently on a GCE VM based on multiple open issues reporting the same auth failure.
Real behavior proof (required for external PRs)
gemini-flash-latest set and registered in provider config.
vertex-adc.ts) which is unchanged; only the upstream auth gate and wizard flow are new.
Tests and validation
Which commands did you run?
End-to-end onboarding and API call on GCE VM validated across multiple iterations.
What regression coverage was added or updated?
No new unit tests. Existing tests for hasVertexAdcCredentials, resolveAuthEvidence, normalizeManifestSetupProviderAuthEvidence, and the google extension transport/auth tests (293 tests) cover the touched code paths and all passed.
What failed before this fix, if known?
"No API key found for provider google-vertex" on any environment using metadata server ADC (GCE, GKE, Cloud Run). Also: "Unknown model" after onboarding (wizard didn't register models in provider config), requests routed to OpenAI endpoints (wrong default transport type), empty model catalog in "Browse all models," no auth flow during onboarding.
If no test was added, why not?
The changes span auth gating, manifest parsing, wizard flow, model resolution, and transport defaults across multiple subsystems. Real behavior proof from a GCE VM is more representative than unit tests for this class of end-to-end configuration issue.
Risk checklist
Did user-visible behavior change? Yes. New wizard auth flow for google-vertex (project auto-detection, location default, model catalog). New default model gemini-flash-latest. "Google Vertex AI" label in wizard (was "Google Vertex"). New dedicated docs page.
Did config, environment, or migration behavior change? Yes. New "env-vars-with-marker" auth evidence type in the plugin manifest system. GOOGLE_CLOUD_LOCATION no longer required (defaults to "global"). Auth gate accepts GOOGLE_CLOUD_PROJECT alone without requiring a credentials file on disk.
Did security, auth, secrets, network, or tool execution behavior change? Yes. The auth gate is more permissive: accepts project env var OR credentials file OR credentials env var (previously required all three of: file + project + location). The gcp-vertex-credentials marker is stored as an auth profile credential during onboarding. nonSecretAuthMarkers added to the google plugin manifest.
What is the highest-risk area?
The relaxed auth gate in src/llm/env-api-keys.ts. Previously required hasCredentials && hasProject && hasLocation. Now accepts hasProject || hasCredentials || hasCredentialsEnv.
How is that risk mitigated?
The downstream transport (extensions/google/vertex-adc.ts) validates credentials at request time via google-auth-library's GoogleAuth. If ADC is not actually available, the request fails with a clear error ("Google Vertex ADC fallback did not return an access token") rather than the opaque "No API key found" gate error. The gate change only affects whether the request reaches the transport layer, not whether it succeeds.
Current review state
What is the next action?
Ready for maintainer review.
What is still waiting on author, maintainer, CI, or external proof?
Nothing blocking.
pnpm test:extension google (293 tests) and pnpm check (typecheck + lint) passed cleanly.
Which bot or reviewer comments were addressed?
N/A (new PR).