Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
Any call to saveAuthProfileStore (in src/agents/auth-profiles/store.ts) clobbers the runtime auth-profile snapshot with an external-CLI-filtered view. This silently drops the anthropic:claude-cli OAuth credential from in-process state while leaving the on-disk file intact. Every subsequent embedded-agent run then fails with FailoverError: No credentials found for profile "anthropic:claude-cli" until the openclaw models ... CLI bootstrap re-runs.
Steps to reproduce
- Start OpenClaw with an anthropic:claude-cli OAuth profile active and the embedded agent responding normally.
- Trigger any call to saveAuthProfileStore. Confirmed triggers include:
  - A channel-secrets reload after mutating openclaw.json (e.g., SecretRef migration of channels.discord.token)
  - Any openclaw update that rewrites config on boot
- Observe gateway logs: config change detected → channel reload → Secrets reloaded.
- Submit any embedded-agent request: FailoverError: No credentials found for profile "anthropic:claude-cli".
- Confirm ~/.openclaw/agents/main/agent/auth-profiles.json is still present on disk with valid credentials.
- Run openclaw models status (forces auth-profiles bootstrap); restart gateway: agent recovers.
Reproduced twice on v2026.5.18 (2026-05-19 and 2026-05-20) on Pop!_OS 24.04.
Expected behavior
Any saveAuthProfileStore call that writes auth-profiles.json should either preserve or immediately re-attach external-CLI OAuth profiles in the runtime snapshot. The overlayExternalAuthProfiles bootstrap path that openclaw models ... triggers should also run on the gateway's own write path.
Actual behavior
After saveAuthProfileStore writes, the runtime credential reference for anthropic:claude-cli is empty. Subsequent calls to ensureAuthProfileStore → resolveRuntimeAuthProfileStore return the stripped snapshot (external profiles absent) without re-reading disk. overlayExternalAuthProfiles is bypassed on the runtime fast-path. Result: every embedded-agent run fails with No credentials found for profile "anthropic:claude-cli" until the CLI bootstrap re-runs.
Root cause (traced to source)
In src/agents/auth-profiles/store.ts, saveAuthProfileStore ends with:
const localStore = buildLocalAuthProfileStoreForSave({ store, agentDir, options });
saveJsonFile(authPath, buildPersistedAuthProfileSecretsStore(localStore, ...));
savePersistedAuthProfileState(localStore, agentDir);
writeCachedAuthProfileStore({ authPath, ..., store: localStore });
if (hasRuntimeAuthProfileStoreSnapshot(agentDir))
  setRuntimeAuthProfileStoreSnapshot(localStore, agentDir); // ← bug
buildLocalAuthProfileStoreForSave intentionally filters out external-CLI profiles for disk persistence (correct — those credentials are owned by the external CLI). But the same filtered localStore is then written to the runtime snapshot via setRuntimeAuthProfileStoreSnapshot. Subsequent credential lookups go through ensureAuthProfileStore → resolveRuntimeAuthProfileStore, which short-circuits on the cached snapshot without re-reading disk. The overlayExternalAuthProfiles re-attachment is bypassed at runtime (suspected: runtime-fast-path-BLTCPu20.js bundle path — not yet fully traced).
Fix direction: setRuntimeAuthProfileStoreSnapshot should receive the full store (with external profiles preserved) rather than the disk-filtered localStore. Alternatively, resolveRuntimeAuthProfileStore should always call overlayExternalAuthProfiles before returning.
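The two fix directions can be illustrated with a minimal, self-contained model. This is a sketch under stated assumptions, not the actual OpenClaw code: the types, the `external` flag, the `runtimeSnapshots` map, the function names, and the `example:local-key` profile are all hypothetical stand-ins for the real store shape and helpers, and the overlay order (external entries win) is assumed.

```typescript
// Hypothetical, simplified store shape; the real AuthProfileStore may differ.
type Credential = { type: "oauth" | "api_key"; external: boolean };
type AuthProfileStore = { profiles: Record<string, Credential> };

// Stand-in for the runtime snapshot, keyed by agentDir.
const runtimeSnapshots = new Map<string, AuthProfileStore>();

// Stand-in for buildLocalAuthProfileStoreForSave: drops external-CLI profiles.
function withoutExternalProfiles(store: AuthProfileStore): AuthProfileStore {
  const profiles: Record<string, Credential> = {};
  for (const [id, cred] of Object.entries(store.profiles)) {
    if (!cred.external) profiles[id] = cred;
  }
  return { profiles };
}

// Behavior as described in this report: the disk-filtered view also
// replaces the runtime snapshot. Returns what would be persisted.
function saveBuggy(store: AuthProfileStore, agentDir: string): AuthProfileStore {
  const localStore = withoutExternalProfiles(store);
  if (runtimeSnapshots.has(agentDir)) runtimeSnapshots.set(agentDir, localStore);
  return localStore;
}

// Direction 1: persist the filtered view, keep the full store in memory.
function saveFixed(store: AuthProfileStore, agentDir: string): AuthProfileStore {
  const localStore = withoutExternalProfiles(store);
  if (runtimeSnapshots.has(agentDir)) runtimeSnapshots.set(agentDir, store);
  return localStore;
}

// Direction 2: re-attach external profiles every time the snapshot is resolved.
function resolveWithOverlay(
  agentDir: string,
  external: AuthProfileStore,
): AuthProfileStore | undefined {
  const snapshot = runtimeSnapshots.get(agentDir);
  if (!snapshot) return undefined;
  return { profiles: { ...snapshot.profiles, ...external.profiles } };
}

// Example data: one external-CLI profile and one hypothetical local profile.
const exampleStore: AuthProfileStore = {
  profiles: {
    "anthropic:claude-cli": { type: "oauth", external: true },
    "example:local-key": { type: "api_key", external: false },
  },
};
```

In this model, both directions keep anthropic:claude-cli out of the persisted view while leaving it resolvable at runtime. Direction 1 is the smaller change; direction 2 also protects against any other writer that stores a filtered snapshot.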
OpenClaw version
2026.5.18
Operating system
Pop!_OS 24.04 (Linux 6.18.7-76061807-generic x64)
Install method
npm global
Model
anthropic/claude-sonnet-4-6
Provider / routing chain
openclaw → anthropic:claude-cli (OAuth, external-CLI profile)
Additional provider/model setup details
anthropic:claude-cli is the OAuth profile managed by the Claude CLI at ~/.claude/.credentials.json. OpenClaw uses it as an external-CLI profile overlay. On-disk: ~/.openclaw/agents/main/agent/auth-profiles.json. The openclaw security audit flags this profile as [LEGACY_RESIDUE] after the bug fires — the disk file survives but the in-memory reference is gone.
Logs, screenshots, and evidence
# Gateway log sequence (2026-05-19 observed):
config change detected; evaluating reload (channels.discord.token)
config change requires channel reload (discord)
Secrets reloaded.
# ... 8h later, first embedded-agent request:
FailoverError: No credentials found for profile "anthropic:claude-cli"
# Recovery sequence:
$ openclaw models status
# expiring expires in 8h ← bootstrap re-runs from ~/.claude/.credentials.json
$ systemctl --user restart openclaw-gateway
# wait ~8s
$ curl -s http://localhost:18789/health
{"ok":true,"status":"live"}
# Agent recovers; responds normally to next message
Note on related issues: Upstream #85125 is a /models perf prewarm task and is not related to this bug. Commit a483f70a clears the prepared provider-auth map on auth failure — a separate in-memory cache from the runtime auth-profile snapshot this bug clobbers; its relevance to our specific failure is unverified.
v5.17 shipped "Auth: serialize provider login writes through the auth-profile lock so a live Gateway cannot overwrite freshly refreshed OAuth credentials with an expired in-memory snapshot". We are on 5.18, which includes that change, yet still see this bug. This indicates the dominant trigger is the saveAuthProfileStore snapshot clobber on config write, not an OAuth-refresh race.
v5.19 ships "Gateway/config: keep config writes from failing on unrelated unresolved auth-profile SecretRefs while preserving live auth-profile runtime snapshots", which is directly relevant to the config-write trigger. Upgrading to ≥5.20 is expected to close the most common trigger, but the root-cause saveAuthProfileStore snapshot bug should still be fixed at the source.
Impact and severity
- Affected: Any OpenClaw instance using anthropic:claude-cli (or another external-CLI OAuth profile) when a config write triggers saveAuthProfileStore
- Severity: High. Blocks all embedded-agent runs; the gateway appears healthy (/health returns ok) while the agent is silently broken
- Frequency: Triggered on every config-mutating event (channel reload, openclaw update); intermittent from the user's perspective because the gateway stays up
- Consequence: Agent stops responding until a manual openclaw models status + gateway restart recovery sequence is performed; silent failure is particularly dangerous in unattended setups
Additional information
Tracking issue in our fleet: cobenrogers/mission-control#7
Local workaround (documented for our fleet):
- openclaw models status (forces auth-profiles bootstrap from ~/.claude/.credentials.json)
- systemctl --user restart openclaw-gateway
- Wait ~8s for HTTP rebind; verify curl -s http://localhost:18789/health returns {"ok":true,"status":"live"}
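For unattended setups, the fixed ~8s wait in the last step could be replaced by polling the health endpoint. The following is a minimal sketch, not part of OpenClaw: `isGatewayLive` and `waitForLive` are hypothetical helper names, the response shape is assumed from the health output shown in the logs above, and the retry count and delay are arbitrary defaults.

```typescript
// Response shape assumed from the logged health output: {"ok":true,"status":"live"}.
type Health = { ok: boolean; status: string };

// Returns true only for a well-formed "live" health body.
function isGatewayLive(body: string): boolean {
  try {
    const parsed = JSON.parse(body) as Partial<Health> | null;
    return parsed !== null && parsed.ok === true && parsed.status === "live";
  } catch {
    return false; // not JSON
  }
}

// Polls a caller-supplied fetcher until the gateway reports live or
// the attempts run out. Fetch errors (e.g. connection refused while the
// gateway rebinds) count as "not live yet" and are retried.
async function waitForLive(
  fetchBody: () => Promise<string>,
  attempts = 10,
  delayMs = 1000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (isGatewayLive(await fetchBody())) return true;
    } catch {
      // gateway not reachable yet; fall through to the delay
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

On Node 18 or later, where `fetch` is a global, a caller could pass `() => fetch("http://localhost:18789/health").then((r) => r.text())` as `fetchBody`. Note that a live health response only confirms the HTTP rebind; as described above, /health also returns ok while the agent is broken, so an embedded-agent request is still needed to confirm recovery.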