fix: pass cloud provider env vars through to gateway service #62673
JiaDe-Wu wants to merge 4 commits into
fix: pass cloud provider env vars through to gateway service (openclaw#61847)
Greptile Summary
Clean, targeted fix that follows the existing proxy-env passthrough pattern exactly; adds
Confidence Score: 5/5
Safe to merge; both findings are P2 suggestions that do not block correctness. The implementation is correct, the refactor is clean, and all remaining findings are non-blocking style/test suggestions. No files require special attention.
    const SERVICE_CLOUD_PROVIDER_ENV_KEYS = [
      // AWS / Amazon Bedrock
      "AWS_PROFILE",
      "AWS_REGION",
      "AWS_DEFAULT_REGION",
      "AWS_ACCESS_KEY_ID",
      "AWS_SECRET_ACCESS_KEY",
      "AWS_SESSION_TOKEN",
      "AWS_BEARER_TOKEN_BEDROCK",
      // Azure OpenAI
      "AZURE_OPENAI_API_KEY",
      "AZURE_OPENAI_BASE_URL",
      "AZURE_OPENAI_RESOURCE_NAME",
      "AZURE_OPENAI_API_VERSION",
      "AZURE_OPENAI_DEPLOYMENT_NAME_MAP",
      // Google Cloud (Vertex AI / Gemini)
      "GOOGLE_CLOUD_PROJECT",
      "GOOGLE_APPLICATION_CREDENTIALS",
    ] as const;
Missing AWS IRSA env vars for EKS deployments
AWS_WEB_IDENTITY_TOKEN_FILE, AWS_ROLE_ARN, and AWS_ROLE_SESSION_NAME are injected by the EKS pod-identity webhook and are the standard mechanism for IRSA (IAM Roles for Service Accounts). Without them, Kubernetes-based deployments will still fail with "No API key found" even after this fix.
Path: src/daemon/service-env.ts
Line: 70-88
```suggestion
const SERVICE_CLOUD_PROVIDER_ENV_KEYS = [
// AWS / Amazon Bedrock
"AWS_PROFILE",
"AWS_REGION",
"AWS_DEFAULT_REGION",
"AWS_ACCESS_KEY_ID",
"AWS_SECRET_ACCESS_KEY",
"AWS_SESSION_TOKEN",
"AWS_BEARER_TOKEN_BEDROCK",
// AWS IRSA (EKS IAM Roles for Service Accounts)
"AWS_WEB_IDENTITY_TOKEN_FILE",
"AWS_ROLE_ARN",
"AWS_ROLE_SESSION_NAME",
// Azure OpenAI
"AZURE_OPENAI_API_KEY",
"AZURE_OPENAI_BASE_URL",
"AZURE_OPENAI_RESOURCE_NAME",
"AZURE_OPENAI_API_VERSION",
"AZURE_OPENAI_DEPLOYMENT_NAME_MAP",
// Google Cloud (Vertex AI / Gemini)
"GOOGLE_CLOUD_PROJECT",
"GOOGLE_APPLICATION_CREDENTIALS",
] as const;
```
Good catch — added AWS_WEB_IDENTITY_TOKEN_FILE, AWS_ROLE_ARN, and AWS_ROLE_SESSION_NAME for EKS IRSA support. Also added test coverage mirroring the proxy env tests.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 80c38fa3ed
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
Exclude short-lived AWS creds from service env passthrough
Passing AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY/AWS_SESSION_TOKEN through the install-time whitelist bakes whatever credentials are in the installer shell into the persisted service environment. In the gateway install flow, serviceEnvironment is merged last (src/commands/daemon-install-helpers.ts:78-90), so these values override durable .env/config values and the normal IMDS/instance-profile fallback; if the installer used temporary STS creds (common with aws-vault/assume-role), the service will start failing once they expire and keep failing until reinstalled.
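The override behaviour described in this comment can be pictured with a minimal sketch. It assumes the install flow combines its inputs with object spreads, which the thread does not confirm; `mergeInstallEnv`, `durableEnv`, and `captured` are hypothetical names, not identifiers from `src/commands/daemon-install-helpers.ts`. The only point illustrated is that whatever is merged last wins.

```typescript
// Hypothetical sketch: names are illustrative, not taken from the codebase.
type EnvMap = Record<string, string | undefined>;

// Later spreads override earlier ones, so the map merged last wins on conflicts.
function mergeInstallEnv(durableEnv: EnvMap, serviceEnvironment: EnvMap): EnvMap {
  return { ...durableEnv, ...serviceEnvironment };
}

// Durable values, e.g. from a .env file or config.
const durableEnv: EnvMap = { AWS_PROFILE: "default", AWS_REGION: "us-east-1" };

// Values captured from the installer's shell, e.g. inside a temporary STS session.
const captured: EnvMap = { AWS_ACCESS_KEY_ID: "ASIA-EXAMPLE", AWS_REGION: "eu-west-1" };

const merged = mergeInstallEnv(durableEnv, captured);
console.log(merged.AWS_REGION); // the captured value, not the durable one
```

Under that assumption, any key present in both maps takes the install-time value, which is why expired captured credentials would shadow the durable configuration.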
Valid concern. A few clarifications:
- Primary use case is `AWS_PROFILE=default` on EC2/ECS: the SDK resolves credentials dynamically via IMDS on every request, so nothing is "baked in". The service env just tells the SDK where to look.
- Same design as proxy passthrough: `HTTP_PROXY` is also forwarded from the installer's env and could go stale if the proxy moves. This is accepted behavior.
- `~/.openclaw/.env` remains the recommended approach for durable configuration (EnvironmentFile takes precedence over inline Environment in systemd).
- If a user runs `gateway install` inside an `aws-vault` session with temporary STS creds, yes, those would be frozen, but that's a user-side anti-pattern, not something the installer should guard against. The same risk exists for any env var the installer captures.
I've added a JSDoc warning on the `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY`/`AWS_SESSION_TOKEN` entries to flag the ephemeral credential caveat.
address review: add EKS IRSA vars, JSDoc warning for ephemeral creds
test: add cloud provider env passthrough tests
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 706b48c672
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_BEARER_TOKEN_BEDROCK",
Drop short-lived AWS secrets from service env passthrough
Persisting AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, and AWS_BEARER_TOKEN_BEDROCK in the install-time whitelist can lock the daemon onto expiring shell credentials (for example aws-vault/STS/SSO sessions) instead of durable runtime auth. In this codebase, service env is merged last and overrides durable .env/config entries (src/commands/daemon-install-helpers.ts:78-90), and AWS auth resolution prefers bearer/static key envs before AWS_PROFILE/default chain (src/agents/model-auth.ts:292-313), so once those captured credentials expire the gateway can fail until reinstall/manual unit edits. Fresh evidence versus earlier discussion: this commit also persists AWS_BEARER_TOKEN_BEDROCK, which is checked first in auth selection.
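The precedence claimed in this comment (bearer token first, then static keys, then profile or default chain) can be sketched as below. This is an illustration of the described ordering only; `pickAwsAuthSource` and its return labels are hypothetical and are not the code in `src/agents/model-auth.ts`.

```typescript
// Hypothetical sketch of the ordering described above; not the project's actual resolver.
type AwsAuthSource = "bearer-token" | "static-keys" | "profile" | "default-chain";

function pickAwsAuthSource(env: Record<string, string | undefined>): AwsAuthSource {
  // Checked first, per the review comment.
  if (env.AWS_BEARER_TOKEN_BEDROCK) return "bearer-token";
  // Static (possibly temporary) keys come next.
  if (env.AWS_ACCESS_KEY_ID && env.AWS_SECRET_ACCESS_KEY) return "static-keys";
  // Only then does a named profile apply.
  if (env.AWS_PROFILE) return "profile";
  // Otherwise fall back to the SDK's default chain (e.g. IMDS on EC2).
  return "default-chain";
}

// A captured, possibly expired, key pair shadows the durable profile.
console.log(
  pickAwsAuthSource({
    AWS_PROFILE: "default",
    AWS_ACCESS_KEY_ID: "ASIA-EXAMPLE",
    AWS_SECRET_ACCESS_KEY: "example-secret",
  }),
);
```

If the real resolver follows this ordering, a persisted key pair or bearer token would keep being selected after it expires, which is the failure mode the comment warns about.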
@steipete @obviyus: Friendly ping on this PR. This is a blocking regression (#61847) since v2026.4.5 that breaks all cloud deployments relying on instance-level credentials (AWS IMDS/instance profiles, Azure managed identity, GCP service accounts). It now has 4 independent confirmations across EC2, Lightsail (#61847 comment), and Google Vertex (#64283).
The fix is small and conservative: +50/-2 in
@steipete I see you committed to this same file just today (
We maintain sample-OpenClaw-on-AWS-with-Bedrock (370+ stars) and have had to pin users to v2026.3.24 because of this. Would really appreciate a review when you get a chance. 🙏
Closing this: the underlying issue has been resolved in v2026.4.10 via
Thanks @steipete for picking this up. Will verify against our CloudFormation template and unpin from v2026.3.24.
For anyone landing here from sample-OpenClaw-on-AWS-with-Bedrock: upgrade to v2026.4.10+ and the
PR: fix: pass cloud provider env vars through to gateway service
Summary
`buildServiceEnvironment` curates a whitelist of env vars for the systemd/launchd service, but excludes all cloud provider variables (`AWS_PROFILE`, `AWS_REGION`, `AZURE_OPENAI_*`, `GOOGLE_CLOUD_*`). This breaks every headless cloud deployment where credentials come from instance metadata (IMDS) or instance profiles, the most common setup on AWS, Azure, and GCP.
Impact
This is a blocking regression for all AWS Bedrock users who upgrade past 2026.3.24.
We maintain sample-OpenClaw-on-AWS-with-Bedrock, an AWS-published CloudFormation template that deploys OpenClaw on EC2 with Amazon Bedrock. After the switch to `pi-coding-agent` in 2026.4.5, every new deployment and every upgrade fails with:
Real consequences:
- `"auth": "aws-sdk"` in config) cannot upgrade; we had to pin the version and warn users not to select `latest`
- `plugins.entries.amazon-bedrock.config.auth`, `models.providers.amazon-bedrock.auth`, and plugin `enabled: true`: none worked because the root cause is the service environment, not the config file
Root Cause
On EC2 with an IAM instance role:
- `AWS_PROFILE=default` → AWS SDK discovers credentials via IMDS ✅
- `openclaw gateway install` writes systemd service with curated env → strips `AWS_PROFILE` ❌
- `pi-coding-agent` checks env vars for cloud credentials → finds nothing → "No API key found"
- `openclaw gateway install --force` (upgrades, doctor --fix) overwrites any manual `Environment=` additions
The `~/.openclaw/.env` workaround works (EnvironmentFile is preserved across reinstalls), but users must discover it themselves; there's no error message pointing to the solution.
Solution
Add `SERVICE_CLOUD_PROVIDER_ENV_KEYS`, following the same pattern as the existing `SERVICE_PROXY_ENV_KEYS`, to pass through cloud provider env vars when present in the host environment:
- `AWS_PROFILE`, `AWS_REGION`, `AWS_DEFAULT_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_BEARER_TOKEN_BEDROCK`
- `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_RESOURCE_NAME`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_DEPLOYMENT_NAME_MAP`
- `GOOGLE_CLOUD_PROJECT`, `GOOGLE_APPLICATION_CREDENTIALS`
The implementation reuses the existing env reader pattern (refactored into shared `readServiceKeysFromEnv` helper) and adds one spread in `buildCommonServiceEnvironment`. Zero behavioral change for users who don't set these vars.
Changes
File: `src/daemon/service-env.ts` (+50, -2)
- `SERVICE_CLOUD_PROVIDER_ENV_KEYS`: AWS/Azure/Google env var whitelist
- `readServiceKeysFromEnv` shared helper from `readServiceProxyEnvironment`
- `cloudProviderEnv` to `SharedServiceEnvironmentFields`
- `cloudProviderEnv` in `buildCommonServiceEnvironment`
Testing
- `AWS_PROFILE=default` in shell
- `~/.openclaw/.env` workaround
- `gateway install --force`
Verified on: OpenClaw 2026.4.5, EC2 t4g.small (Graviton ARM64), Ubuntu 24.04, Amazon Bedrock Nova 2 Lite.
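
For readers who want a concrete picture of the passthrough described under Solution, here is a rough sketch. It is not the code from `src/daemon/service-env.ts`: `readKeysFromEnvSketch` is a hypothetical stand-in for `readServiceKeysFromEnv`, its signature is assumed, the key list is shortened, and skipping blank values is an assumption rather than confirmed behaviour.

```typescript
// Hypothetical sketch; the real helper and its signature may differ.
const CLOUD_KEYS_SKETCH = ["AWS_PROFILE", "AWS_REGION", "GOOGLE_CLOUD_PROJECT"] as const;

// Copy only whitelisted keys that are actually set in the host environment.
function readKeysFromEnvSketch(
  env: Record<string, string | undefined>,
  keys: readonly string[],
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of keys) {
    const value = env[key]?.trim();
    if (value) out[key] = value; // assumption: unset or blank values are skipped
  }
  return out;
}

// The "one spread" idea: merge the passthrough result into the curated service env.
const hostEnv = { AWS_PROFILE: "default", AWS_REGION: "  ", UNRELATED: "x" };
const serviceEnv: Record<string, string> = {
  HOME: "/home/ubuntu",
  ...readKeysFromEnvSketch(hostEnv, CLOUD_KEYS_SKETCH),
};
console.log(serviceEnv); // contains HOME and AWS_PROFILE only
```

When none of the whitelisted keys are set, the helper returns an empty object and the spread adds nothing, which matches the "zero behavioral change" claim above.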
Operation Steps