feat(secrets): add HashiCorp Vault KV v2 secret provider#43645
Conversation
Greptile SummaryThis PR introduces a Key issues found:
Confidence Score: 3/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/config/vault-secret-provider.ts
Line: 193-196
Comment:
**Hardcoded Kubernetes role name breaks configurability**
The Kubernetes role name is hardcoded to `"openclaw"`, but Vault Kubernetes auth requires matching the role configured in Vault's auth backend. Any deployment that registers its Kubernetes role under a different name (which is common in shared or multi-tenant Vault setups) will fail to authenticate entirely.
The `VaultConfig` interface has no field for this role, so there is currently no way for users to configure it. A `kubernetesRole` (or similar) field needs to be added to `VaultConfig` and passed here:
```suggestion
const resp = await this.rawRequest("POST", "/v1/auth/kubernetes/login", {
role: this.kubernetesRole ?? "openclaw",
jwt,
});
```
and correspondingly in `VaultConfig`:
```ts
kubernetesRole?: string; // Vault Kubernetes auth role (default: "openclaw")
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/config/secret-resolution.ts
Line: 498-501
Comment:
**Cache TTL sourced exclusively from GCP provider config**
The shared `fetchWithCache` TTL is derived solely from the GCP provider's `cacheTtlSeconds`. When only the Vault provider is configured (no `gcp` key in `secretsConfig.providers`), this expression evaluates to the 5-minute default — completely ignoring any `cacheTtlSeconds` the user set in their `vault` provider config.
This means `${vault:my/secret}` references will always be cached for 5 minutes regardless of what value is configured in the `vault` block, making the vault-specific `cacheTtlSeconds` config option silently ineffective.
Consider picking the TTL from whichever provider is actually in use, or taking the minimum across all configured providers:
```ts
const cacheTtlMs = (() => {
const providerConfigs = Object.values(secretsConfig?.providers ?? {});
const first = providerConfigs.find((c) => c?.cacheTtlSeconds != null);
return first?.cacheTtlSeconds != null ? first.cacheTtlSeconds * 1000 : defaultTtlMs;
})();
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: docs/gateway/secrets.md
Line: 489
Comment:
**Supported Providers table not updated**
This PR implements the Vault provider, but the table still marks it as `Planned`. It should be updated to reflect the new status:
```suggestion
| HashiCorp Vault | ✅ Supported |
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: docs/gateway/secrets.md
Line: 460-467
Comment:
**YAML frontmatter accidentally pasted as body content**
These lines appear to be YAML frontmatter metadata (a `summary:` / `read_when:` / `title:` block from another document) that was accidentally included as plain Markdown text. They are rendered as literal prose in the docs page, which looks malformed. This block (and the entire large section added here) also documents the GCP provider rather than the Vault provider being introduced by this PR — it appears content from a separate GCP-focused doc was appended here verbatim.
Consider replacing this content with Vault-specific documentation instead.
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 39c244e |
| const resp = await this.rawRequest("POST", "/v1/auth/kubernetes/login", { | ||
| role: "openclaw", | ||
| jwt, | ||
| }); |
There was a problem hiding this comment.
Hardcoded Kubernetes role name breaks configurability
The Kubernetes role name is hardcoded to "openclaw", but Vault Kubernetes auth requires matching the role configured in Vault's auth backend. Any deployment that registers its Kubernetes role under a different name (which is common in shared or multi-tenant Vault setups) will fail to authenticate entirely.
The VaultConfig interface has no field for this role, so there is currently no way for users to configure it. A kubernetesRole (or similar) field needs to be added to VaultConfig and passed here:
| const resp = await this.rawRequest("POST", "/v1/auth/kubernetes/login", { | |
| role: "openclaw", | |
| jwt, | |
| }); | |
| const resp = await this.rawRequest("POST", "/v1/auth/kubernetes/login", { | |
| role: this.kubernetesRole ?? "openclaw", | |
| jwt, | |
| }); |
and correspondingly in VaultConfig:
kubernetesRole?: string; // Vault Kubernetes auth role (default: "openclaw")Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/vault-secret-provider.ts
Line: 193-196
Comment:
**Hardcoded Kubernetes role name breaks configurability**
The Kubernetes role name is hardcoded to `"openclaw"`, but Vault Kubernetes auth requires matching the role configured in Vault's auth backend. Any deployment that registers its Kubernetes role under a different name (which is common in shared or multi-tenant Vault setups) will fail to authenticate entirely.
The `VaultConfig` interface has no field for this role, so there is currently no way for users to configure it. A `kubernetesRole` (or similar) field needs to be added to `VaultConfig` and passed here:
```suggestion
const resp = await this.rawRequest("POST", "/v1/auth/kubernetes/login", {
role: this.kubernetesRole ?? "openclaw",
jwt,
});
```
and correspondingly in `VaultConfig`:
```ts
kubernetesRole?: string; // Vault Kubernetes auth role (default: "openclaw")
```
How can I resolve this? If you propose a fix, please make it concise.| const cacheTtlMs = | ||
| secretsConfig?.providers?.gcp?.cacheTtlSeconds != null | ||
| ? secretsConfig.providers.gcp.cacheTtlSeconds * 1000 | ||
| : defaultTtlMs; |
There was a problem hiding this comment.
Cache TTL sourced exclusively from GCP provider config
The shared fetchWithCache TTL is derived solely from the GCP provider's cacheTtlSeconds. When only the Vault provider is configured (no gcp key in secretsConfig.providers), this expression evaluates to the 5-minute default — completely ignoring any cacheTtlSeconds the user set in their vault provider config.
This means ${vault:my/secret} references will always be cached for 5 minutes regardless of what value is configured in the vault block, making the vault-specific cacheTtlSeconds config option silently ineffective.
Consider picking the TTL from whichever provider is actually in use, or taking the minimum across all configured providers:
const cacheTtlMs = (() => {
const providerConfigs = Object.values(secretsConfig?.providers ?? {});
const first = providerConfigs.find((c) => c?.cacheTtlSeconds != null);
return first?.cacheTtlSeconds != null ? first.cacheTtlSeconds * 1000 : defaultTtlMs;
})();Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/secret-resolution.ts
Line: 498-501
Comment:
**Cache TTL sourced exclusively from GCP provider config**
The shared `fetchWithCache` TTL is derived solely from the GCP provider's `cacheTtlSeconds`. When only the Vault provider is configured (no `gcp` key in `secretsConfig.providers`), this expression evaluates to the 5-minute default — completely ignoring any `cacheTtlSeconds` the user set in their `vault` provider config.
This means `${vault:my/secret}` references will always be cached for 5 minutes regardless of what value is configured in the `vault` block, making the vault-specific `cacheTtlSeconds` config option silently ineffective.
Consider picking the TTL from whichever provider is actually in use, or taking the minimum across all configured providers:
```ts
const cacheTtlMs = (() => {
const providerConfigs = Object.values(secretsConfig?.providers ?? {});
const first = providerConfigs.find((c) => c?.cacheTtlSeconds != null);
return first?.cacheTtlSeconds != null ? first.cacheTtlSeconds * 1000 : defaultTtlMs;
})();
```
How can I resolve this? If you propose a fix, please make it concise.| | ------------------- | ------------ | | ||
| | GCP Secret Manager | ✅ Supported | | ||
| | AWS Secrets Manager | Planned | | ||
| | HashiCorp Vault | Planned | |
There was a problem hiding this comment.
Supported Providers table not updated
This PR implements the Vault provider, but the table still marks it as Planned. It should be updated to reflect the new status:
| | HashiCorp Vault | Planned | | |
| | HashiCorp Vault | ✅ Supported | |
Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/gateway/secrets.md
Line: 489
Comment:
**Supported Providers table not updated**
This PR implements the Vault provider, but the table still marks it as `Planned`. It should be updated to reflect the new status:
```suggestion
| HashiCorp Vault | ✅ Supported |
```
How can I resolve this? If you propose a fix, please make it concise.| summary: "External secrets management: store credentials in GCP Secret Manager instead of plaintext files" | ||
| read_when: | ||
|
|
||
| - You want to stop storing API keys in plaintext config files | ||
| - You want per-agent secret isolation in multi-agent setups | ||
| - You want to set up GCP Secret Manager for OpenClaw | ||
| - You want to migrate existing plaintext secrets to a secrets store | ||
| title: "Secrets" |
There was a problem hiding this comment.
YAML frontmatter accidentally pasted as body content
These lines appear to be YAML frontmatter metadata (a summary: / read_when: / title: block from another document) that was accidentally included as plain Markdown text. They are rendered as literal prose in the docs page, which looks malformed. This block (and the entire large section added here) also documents the GCP provider rather than the Vault provider being introduced by this PR — it appears content from a separate GCP-focused doc was appended here verbatim.
Consider replacing this content with Vault-specific documentation instead.
Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/gateway/secrets.md
Line: 460-467
Comment:
**YAML frontmatter accidentally pasted as body content**
These lines appear to be YAML frontmatter metadata (a `summary:` / `read_when:` / `title:` block from another document) that was accidentally included as plain Markdown text. They are rendered as literal prose in the docs page, which looks malformed. This block (and the entire large section added here) also documents the GCP provider rather than the Vault provider being introduced by this PR — it appears content from a separate GCP-focused doc was appended here verbatim.
Consider replacing this content with Vault-specific documentation instead.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 39c244e67c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| * @param secretsConfig - The `secrets` section from openclaw.json. | ||
| * @param providersOverride - Optional pre-built providers map (useful for testing). | ||
| */ | ||
| export async function resolveConfigSecrets( |
There was a problem hiding this comment.
Wire secret resolution into the config load path
This commit introduces resolveConfigSecrets, but in this revision it is only defined/re-exported and never invoked by the config loading pipeline, so ${vault:...}/${gcp:...} placeholders remain literal strings at runtime. In real configs this causes credentials to stay unresolved and downstream providers to authenticate with placeholder text instead of real secrets. Hook this resolver into config read/normalization before runtime consumers use the config object.
Useful? React with 👍 / 👎.
| const cacheTtlMs = | ||
| secretsConfig?.providers?.gcp?.cacheTtlSeconds != null | ||
| ? secretsConfig.providers.gcp.cacheTtlSeconds * 1000 | ||
| : defaultTtlMs; |
There was a problem hiding this comment.
Use provider-specific cache TTL for secret resolution
The shared cache TTL is derived only from secrets.providers.gcp.cacheTtlSeconds, but fetchWithCache is used for all providers including Vault. That means secrets.providers.vault.cacheTtlSeconds is ignored, so Vault secrets can remain cached for the default 5 minutes even when operators explicitly configure a shorter rotation window. Cache TTL should be selected per provider (or at minimum honor Vault TTL when resolving Vault refs).
Useful? React with 👍 / 👎.
| const resp = await this.rawRequest("POST", "/v1/auth/kubernetes/login", { | ||
| role: "openclaw", | ||
| jwt, |
There was a problem hiding this comment.
Stop hardcoding the Vault Kubernetes auth role
Kubernetes login always sends role: "openclaw", but Vault Kubernetes auth requires the caller to use a role that matches cluster-specific bindings. Any deployment that uses a different role name will fail authentication even with a valid service account JWT. This should come from configuration instead of being fixed to a single role name.
Useful? React with 👍 / 👎.
39c244e to
a3b64c2
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a3b64c2880
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (name === "gcp" && config && config.project) { | ||
| providers.set( | ||
| "gcp", | ||
| new GcpSecretProvider( | ||
| config as { project: string; cacheTtlSeconds?: number; credentialsFile?: string }, |
There was a problem hiding this comment.
Align new secret provider config with validated schema
buildSecretProviders now expects secrets.providers.gcp.project / secrets.providers.vault.*, but config validation still requires each secrets.providers.* entry to match the existing SecretProviderSchema (source: env|file|exec in src/config/zod-schema.core.ts:142-155). In practice, configs using the new GCP/Vault shape are rejected before runtime, so these providers cannot be configured through openclaw.json; update the schema/types (or move these settings to a validated section) so the new fields are actually accepted.
Useful? React with 👍 / 👎.
| ): Promise<{ username: string; password: string; lastVaultRotation: string; ttl: number }> { | ||
| const path = `/v1/${mount}/static-creds/${role}`; | ||
| const resp = await this.request("GET", path); | ||
| const data = resp.data?.data as Record<string, string | number> | undefined; |
There was a problem hiding this comment.
Parse static-creds fields from Vault response root data
getStaticCreds reads from resp.data?.data, but Vault static-credentials responses return username/password/ttl directly under data, so this path resolves to undefined and the method returns empty credentials ("", 0) even on successful calls. This causes static-rotation consumers to silently treat valid credentials as missing; parse from resp.data (or use the same top-level fallback pattern used in requestDynamic).
Useful? React with 👍 / 👎.
a3b64c2 to
3f0b2a8
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3f0b2a88c2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const resolved = await Promise.all( | ||
| matches.map(async ({ provider: pName, name, version, full }) => { | ||
| const provider = providers.get(pName); | ||
| if (!provider) { | ||
| throw new UnknownSecretProviderError(pName, configPath); | ||
| } | ||
| try { | ||
| const resolvedValue = await fetchWithCache(provider, name, version, cacheTtlMs); |
There was a problem hiding this comment.
Deduplicate identical secret fetches in one resolution pass
Resolving matches with Promise.all means repeated references like ${vault:db/password} in the same config pass are fetched concurrently, so each call sees an empty cache and hits the provider before any result is stored. This defeats the intended "not fetched twice within a single resolution pass" behavior and can cause avoidable Vault/GCP rate-limit pressure or latency spikes on configs that reuse the same secret many times.
Useful? React with 👍 / 👎.
| for (const { full, value: sv } of resolved) { | ||
| working = working.replace(full, sv); |
There was a problem hiding this comment.
Avoid replacing newly inserted secret text as later refs
The loop performs sequential working.replace(full, value) mutations on the evolving string, so if one secret resolves to text equal to another reference token (for example, first value is ${gcp:b}), a later iteration can replace that newly inserted text instead of the original token occurrence. This yields partially unresolved or incorrect final config values for valid secret contents that happen to contain reference-like text.
Useful? React with 👍 / 👎.
97a50e9 to
9dd15ef
Compare
Adds VaultSecretProvider for resolving secrets from HashiCorp Vault.
Uses native `fetch()` — no SDK dependency.
Authentication methods:
- Token auth (VAULT_TOKEN / tokenFile)
- AppRole (roleId + secretId)
- Kubernetes (service account JWT)
Features:
- KV v2 with version pinning: ${vault:path/to/secret#5}
- Namespace support (Vault Enterprise)
- Configurable mount path (default: "secret")
- Lease management for dynamic secrets
- Rotation tracking via KV v2 custom_metadata
- Configurable TTL caching (default 5 min)
Config example:
"secrets": {
"providers": {
"vault": {
"address": "https://vault.example.com",
"authMethod": "token",
"token": "<token>"
}
}
}
Usage: ${vault:my/secret/path}
Split from openclaw#24272 per @vincentkoc's request.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9dd15efeb9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| '@exodus/bytes@1.15.0(@noble/hashes@2.0.1)': | ||
| optionalDependencies: | ||
| '@noble/hashes': 2.0.1 | ||
| '@google-cloud/secret-manager@6.1.1': |
There was a problem hiding this comment.
Restore missing @exodus/bytes lockfile snapshot
This lockfile edit drops the @exodus/bytes@1.15.0(@noble/hashes@2.0.1) snapshot entry that still exists in the parent lockfile graph, and pnpm install --frozen-lockfile now fails immediately with ERR_PNPM_LOCKFILE_MISSING_DEPENDENCY for that package. That blocks CI/clean installs for contributors using frozen lockfile mode, so the lockfile needs to be regenerated or repaired so all referenced snapshots are present.
Useful? React with 👍 / 👎.
| version: string | undefined, | ||
| cacheTtlMs: number, | ||
| ): Promise<string> { | ||
| const cacheKey = `${provider.name}:${name}#${version ?? "latest"}`; |
There was a problem hiding this comment.
Scope secret cache keys by provider configuration
The shared cache key is only provider.name + secret + version, so cached values can bleed across different provider instances that share the same name (for example, two Vault configs with different address/namespace, or two GCP configs with different project/credentials) within the same process. In that scenario a later resolution can return a stale secret from the earlier provider context until TTL expiry, which is incorrect and can expose wrong credentials across contexts.
Useful? React with 👍 / 👎.
9dd15ef to
5f45631
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f456318a3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const path = `/v1/${this.mountPath}/metadata/`; | ||
| // Vault LIST is done via the LIST HTTP method or GET with ?list=true | ||
| try { | ||
| const resp = await this.request("LIST", path); |
There was a problem hiding this comment.
Retry Vault listing with GET when LIST is unsupported
listSecrets only issues the non-standard LIST method and rethrows any non-404 error, even though the code comment notes Vault also supports GET ...?list=true. In deployments behind proxies/load balancers that reject unknown methods (commonly returning 405/501), secret listing will fail despite Vault being reachable, so this should fall back to the GET form before surfacing an error.
Useful? React with 👍 / 👎.
|
Thanks for the contribution! After reviewing this PR alongside the related secrets provider PRs (#43640, #43641, #43643, #43644, #43645, #43646), we need to close these for an architectural rework before they can land. The issue: parallel implementation instead of extending the existing systemOpenClaw already has a mature secrets infrastructure in These PRs introduce a new parallel system — a new There's also a hidden dependency across the PR set: the aws, azure, keyring, and vault branches all import What we need insteadNew cloud-backend providers (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault, OS keyring, env vars) should be implemented as resolvers within the existing
We'd love to see these re-submitted with the corrected architecture. |
Summary
Adds
VaultSecretProviderfor resolving secrets from HashiCorp Vault KV v2.Uses native
fetch()— no SDK dependency required.Authentication Methods
VAULT_TOKENenv var ortokenFileroleId+secretId/var/run/secrets/kubernetes.io/serviceaccount/tokenFeatures
${vault:path/to/secret#5}secret)Config
{ "secrets": { "providers": { "vault": { "address": "https://vault.example.com", "authMethod": "approle", "roleId": "<role-id>", "secretId": "<secret-id>" } } } }Usage:
${vault:my/secret/path}Notes
Split from #24272 per @vincentkoc's request.