Prerequisites
Before configuring Claude Code with Vertex AI, ensure you have:
- A Google Cloud Platform (GCP) account with billing enabled
- A GCP project with Vertex AI API enabled
- Access to desired Claude models (for example, Claude Sonnet 4.6)
- Google Cloud SDK (gcloud) installed and configured
- Quota allocated in desired GCP region
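One way to confirm the gcloud prerequisite before continuing is a small check like the one below. It is a minimal sketch: check_gcloud is a hypothetical helper name, and it only reports whether the CLI is on PATH and which project is active, not whether billing, quota, or model access are in place.

```shell
# Report whether the gcloud CLI is on PATH and, if so, which project is active.
# check_gcloud is a hypothetical helper name used only for this sketch.
check_gcloud() {
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud not found: install the Google Cloud SDK first"
    return 1
  fi
  echo "active project: $(gcloud config get-value project 2>/dev/null)"
}
```

Call check_gcloud after sourcing the snippet; an empty project name suggests that gcloud has not been configured with a default project yet.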
If you are deploying Claude Code to multiple users, pin your model versions to prevent breakage when Anthropic releases new models.
Region Configuration
Claude Code can be used with both Vertex AI global and regional endpoints.
Vertex AI may not support the Claude Code default models in all regions or on global endpoints. You may need to switch to a supported region, use a regional endpoint, or specify a supported model.
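As a rough illustration of the two endpoint choices, the sketch below sets CLOUD_ML_REGION either to global or to a specific region. The region us-east5 is only an example value, not a recommendation from this guide; check Model Garden for the regions that actually serve your models.

```shell
# Option 1: use the Vertex AI global endpoint.
export CLOUD_ML_REGION=global

# Option 2: use a regional endpoint instead (example region; verify that
# your models are available there before relying on it).
# export CLOUD_ML_REGION=us-east5
```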
Setup
1. Enable Vertex AI API
Enable the Vertex AI API in your GCP project, for example by running gcloud services enable aiplatform.googleapis.com.
2. Request model access
Request access to Claude models in Vertex AI:
- Navigate to the Vertex AI Model Garden
- Search for “Claude” models
- Request access to desired Claude models (for example, Claude Sonnet 4.6)
- Wait for approval (may take 24-48 hours)
3. Configure GCP credentials
Claude Code uses standard Google Cloud authentication. For more information, see Google Cloud authentication documentation.
When authenticating, Claude Code will automatically use the project ID from the ANTHROPIC_VERTEX_PROJECT_ID environment variable. To override this, set one of these environment variables: GCLOUD_PROJECT, GOOGLE_CLOUD_PROJECT, or GOOGLE_APPLICATION_CREDENTIALS.
4. Configure Claude Code
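Set the environment variables that point Claude Code at Vertex AI, such as CLOUD_ML_REGION for the endpoint and ANTHROPIC_VERTEX_PROJECT_ID for the project.
Prompt caching is automatically supported when you specify the cache_control ephemeral flag. To disable it, set DISABLE_PROMPT_CACHING=1. For higher rate limits, contact Google Cloud support.
When using Vertex AI, the /login and /logout commands are disabled since authentication is handled through Google Cloud credentials.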
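The command-line parts of steps 1, 3, and 4 can be sketched as a single shell snippet. This is a minimal sketch under several assumptions: your-project-id is a placeholder, setup_gcp is a hypothetical helper name, Application Default Credentials via gcloud auth application-default login is only one of the standard ways to authenticate, and the CLAUDE_CODE_USE_VERTEX switch is not named in the text above, so confirm it against the current Claude Code settings reference.

```shell
# Placeholder project ID (hypothetical); replace with your own.
PROJECT_ID="your-project-id"

# Steps 1 and 3: enable the Vertex AI API and create Application Default
# Credentials. Defined as a function (hypothetical name) so that nothing
# touches GCP until you call it explicitly.
setup_gcp() {
  gcloud services enable aiplatform.googleapis.com --project "$PROJECT_ID" &&
    gcloud auth application-default login
}

# Step 4: point Claude Code at Vertex AI.
export CLAUDE_CODE_USE_VERTEX=1   # assumption: not named in the text above
export CLOUD_ML_REGION=global     # or a specific region
export ANTHROPIC_VERTEX_PROJECT_ID="$PROJECT_ID"

# Optional: turn prompt caching off.
# export DISABLE_PROMPT_CACHING=1
```

Run setup_gcp once per machine; the exports need to be present in every shell that launches Claude Code, so they usually belong in a shell profile.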
5. Pin model versions
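Set the model environment variables (for example, ANTHROPIC_MODEL and ANTHROPIC_SMALL_FAST_MODEL) to specific Vertex AI model IDs. When these are not set, Claude Code uses the following defaults:
| Model type | Default value |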
|---|---|
| Primary model | claude-sonnet-4-6 |
| Small/fast model | claude-haiku-4-5@20251001 |
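Pinning could look like the sketch below. It assumes that the ANTHROPIC_MODEL and ANTHROPIC_SMALL_FAST_MODEL variables mentioned under Troubleshooting are the ones used for pinning, and it reuses the default IDs from the table purely as example values; substitute the exact model IDs listed for your project in Model Garden.

```shell
# Pin the primary model (example ID taken from the defaults table).
export ANTHROPIC_MODEL='claude-sonnet-4-6'

# Pin the small/fast model (example ID taken from the defaults table).
export ANTHROPIC_SMALL_FAST_MODEL='claude-haiku-4-5@20251001'
```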
IAM configuration
Assign the required IAM permissions. The roles/aiplatform.user role includes the required permissions:
- aiplatform.endpoints.predict - Required for model invocation and token counting
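Granting the role can be sketched with gcloud projects add-iam-policy-binding. The function name grant_vertex_user and the example project and member values are hypothetical; adapt the member prefix (user:, group:, or serviceAccount:) to your setup.

```shell
# Grant roles/aiplatform.user on a project to one principal.
# Usage: grant_vertex_user PROJECT_ID MEMBER
grant_vertex_user() {
  gcloud projects add-iam-policy-binding "$1" \
    --member="$2" \
    --role="roles/aiplatform.user"
}

# Example call (hypothetical values), left commented out so that sourcing
# this snippet does not change any IAM policy:
# grant_vertex_user your-project-id user:dev@example.com
```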
Create a dedicated GCP project for Claude Code to simplify cost tracking and access control.
1M token context window
Claude Sonnet 4 and Sonnet 4.6 support the 1M token context window on Vertex AI.
The 1M token context window is currently in beta. To use the extended context window, include the context-1m-2025-08-07 beta header in your Vertex AI requests.
Troubleshooting
If you encounter quota issues:
- Check current quotas or request quota increase through Cloud Console
- Confirm model is Enabled in Model Garden
- Verify you have access to the specified region
- If using CLOUD_ML_REGION=global, check that your models support global endpoints in Model Garden under “Supported features”. For models that don’t support global endpoints, either:
  - Specify a supported model via ANTHROPIC_MODEL or ANTHROPIC_SMALL_FAST_MODEL, or
  - Set a regional endpoint using VERTEX_REGION_<MODEL_NAME> environment variables
- For regional endpoints, ensure the primary model and small/fast model are supported in your selected region
- Consider switching to CLOUD_ML_REGION=global for better availability
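The two workarounds for models without global endpoint support can be sketched as follows. The variable suffix CLAUDE_3_5_HAIKU, the region us-east5, and the model ID are illustrative assumptions; the exact VERTEX_REGION_<MODEL_NAME> names depend on the models you use, so check them before copying this.

```shell
# Keep the global endpoint as the default.
export CLOUD_ML_REGION=global

# Workaround A: route one model to a regional endpoint
# (suffix and region are example values).
export VERTEX_REGION_CLAUDE_3_5_HAIKU=us-east5

# Workaround B: select a different model that supports the global endpoint
# (example ID; confirm support in Model Garden).
# export ANTHROPIC_SMALL_FAST_MODEL='claude-haiku-4-5@20251001'
```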