Skip to content

fix(auth): auto-migrate keychain-only legacy Codex OAuth profiles on first interactive CLI run#85163

Closed
RomneyDa wants to merge 14 commits into
mainfrom
codex-oauth-keychain-migration
Closed

fix(auth): auto-migrate keychain-only legacy Codex OAuth profiles on first interactive CLI run#85163
RomneyDa wants to merge 14 commits into
mainfrom
codex-oauth-keychain-migration

Conversation

@RomneyDa

@RomneyDa RomneyDa commented May 22, 2026

Copy link
Copy Markdown
Member

Summary

Closes the remaining bucket from #85083: macOS users whose legacy oauthRef sidecar seed lives only in the Keychain still had to know to run openclaw doctor --fix after upgrade, because the embedded-runtime self-heal that landed in #85074 intentionally runs with allowKeychainPrompt: false.

The fix adds a tiny pre-command trigger in runCli (src/cli/auto-migrate-legacy-oauth-sidecar.ts) that, on darwin + TTY only, silently reuses the existing maybeRepairLegacyOAuthSidecarProfiles doctor flow when a sidecar dir is present. There is no in-CLI confirm prompt — the macOS Keychain dialog is the only consent surface. On Allow: migration completes inline and the original command proceeds. On Deny or any decryption failure: a permanent decline marker (OPENCLAW_STATE_DIR/legacy-oauth-sidecar-migration-declined) is written so the next CLI run does not re-trigger the OS dialog; openclaw doctor --fix is the explicit retry path. OPENCLAW_AUTO_MIGRATE_LEGACY_OAUTH_SIDECAR=0 is an explicit opt-out for power users.

Headless paths (cron, systemd, Telegram polling, embedded sub-agent dispatch) still skip the auto-heal entirely — but the legacy sidecar loader now emits a one-shot log.warn that names openclaw doctor --fix and macOS Keychain, instead of letting the credential silently fall through to a downstream No API key found for provider "openai-codex".

Also drops the user-confusing sidecar-backed phrasing from the doctor prompt/notes/changes; existing internal identifier names are left intact.

Approach

Approach B from the issue (auto-trigger doctor migration on first interactive openclaw <anything> invocation), but silent: no clack confirm, no two-step "we're about to prompt you for Keychain access" UX. The macOS OS dialog itself is the explicit consent surface — clicking Allow once both authorizes the binary and triggers migration in the same step. Architecturally this leaves "when is it safe to prompt for Keychain access?" as one decision in one place (the embedded runtime path stays headless; the CLI path lets the OS prompt).

Skip rules in order: process.platform !== "darwin"!isInteractiveTty (no Keychain dialog on cron/headless) → CI=true || OPENCLAW_NON_INTERACTIVE=1 || OPENCLAW_AUTO_MIGRATE_LEGACY_OAUTH_SIDECAR=0 || OPENCLAW_AUTH_STORE_READONLY=1--help/--version → primary is doctor/update/help/completion/version--json/--json=… (anywhere in argv before --, keeps clean output) → sidecar dir empty → fresh decline marker. The hook is dynamically imported only on darwin (gated at the call site in runCli).

Skip rules deliberately removed from earlier revisions: the auto-prompt design had a growing skip list for --non-interactive/--yes/--no-input/--force and channels remove --delete — the entire treadmill was about avoiding a clack prompt that no longer exists. With the OS Keychain dialog as the only consent surface, those flag-based skips are unnecessary; the OS dialog appears at most once per machine (Allow → never again; Deny → marker → never again).

Headless one-shot warning

Added in legacy-oauth-sidecar.ts. Fires when loadLegacyOAuthSidecarMaterial returns null on darwin with allowKeychainPrompt: false (i.e., env/file seeds were tried and Keychain was blocked). Rate-limited to one warn per process so embedded turns log it once per gateway lifetime, not per attempt. Guarded by the same VITEST env signals as the keychain read so unrelated test runs don't surface it.

Why this is safe vs. existing inline credentials

  • Re-signed-in profiles are invisible to the migration. writeOAuthCredentials (provider-auth-helpers.ts:297) builds a fresh credential object with no oauthRef, and upsertAuthProfile (profiles.ts:139) does store.profiles[id] = credential — full replacement, not merge. So any user who re-signed in via normal flow has no oauthRef on their profile, and the migration scan (resolveLegacyOAuthSidecarStore, doctor-auth-oauth-sidecar.ts:117-126) only includes profiles where value.oauthRef is a valid LegacyOAuthRef. Migration is a no-op for them.
  • Runtime resolution already prefers inline material. persisted.ts:274-279 checks hasInlineOAuthTokenMaterial first; if inline tokens are present, the legacy sidecar resolver is never called regardless of what oauthRef says.
  • Sidecar credentials are real OAuth tokens. Migration just moves them from "encrypted file referenced via oauthRef" to "inline access/refresh/idToken." Same values, same refresh cycle. If the refresh token still works → user is healed. If it's already revoked → user re-signs in via normal flow; migration never makes auth worse.

Regression tests

  • src/cli/auto-migrate-legacy-oauth-sidecar.test.ts — table-driven preflight skip checks for each rule (platform/TTY/env opt-outs/CI/readonly/no-sidecar/primary skips/--json variants), plus the silent happy path (sidecar removed, inline tokens written, no marker), unreferenced-sidecar bail-out, and decryption-failure → marker → second-run-skips. Asserts behavior (file state), not internal prompter calls.
  • src/agents/auth-profiles/legacy-oauth-sidecar.test.ts — exercises the headless warn path: emit once when blocked on darwin with no env/file seeds; never emit on non-darwin.

Verification

  • node scripts/run-vitest.mjs src/cli/auto-migrate-legacy-oauth-sidecar.test.ts src/agents/auth-profiles/legacy-oauth-sidecar.test.ts src/commands/doctor-auth-oauth-sidecar.test.ts src/commands/doctor/repair-sequencing.test.ts — 53/53 green across 5 files.
  • node scripts/run-oxlint.mjs clean on all touched files; oxfmt --check clean.

Scope notes

  • Did not touch agents.add.test.ts / stale-oauth-profile-shadows.test.ts test descriptions that still contain sidecar-backed; those are dev-facing and reference implementation, not UI copy.
  • Did not externalize OPENCLAW_AUTO_MIGRATE_LEGACY_OAUTH_SIDECAR to docs anywhere except docs/gateway/doctor.md accordion 5 since it is an escape hatch, not a normal-user knob.
  • No changes to maybeRepairLegacyOAuthSidecarProfiles itself beyond the wording rename — the migration semantics are intentionally identical between the doctor-driven and CLI-startup-driven paths. The CLI-startup caller passes emitNotes: false and a confirmAutoFix: async () => true prompter so the doctor function runs silently.

Out of scope

  • Removing the legacy sidecar code paths entirely. Tracked separately and should wait for the long tail to migrate.
  • Changes to openclaw doctor --fix behavior itself; the migration path it runs is already correct.

Related

@RomneyDa RomneyDa requested a review from a team as a code owner May 22, 2026 01:41
@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime cli CLI command changes commands Command implementations agents Agent runtime and tooling size: L maintainer Maintainer-authored PR labels May 22, 2026
@clawsweeper

clawsweeper Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

Codex review: found issues before merge. Reviewed May 24, 2026, 7:05 PM ET / 23:05 UTC.

Summary
The PR adds a Darwin/TTY runCli hook that silently invokes the legacy Codex OAuth sidecar doctor migration, adds a headless warning, updates doctor wording/docs/changelog, and adds regression tests.

PR surface: Source +176, Tests +508, Docs +1. Total +685 across 10 files.

Reproducibility: yes. Source inspection shows that on macOS with an interactive TTY and a migratable legacy sidecar, an ordinary command such as openclaw status reaches the new runCli hook before normal routing and calls the doctor migration with no in-CLI confirmation.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🌊 off-meta tidepool
Patch quality: 🧂 unranked krab
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

Risk before merge

  • The startup hook can show a macOS Keychain dialog and mutate OAuth profile storage before ordinary commands like openclaw status run, changing existing CLI startup semantics.
  • The branch intentionally relies on OS dialog consent instead of command-level no-prompt contracts, so scripted or TTY-attached automation remains compatibility-sensitive.
  • The permanent decline marker and OAuth profile migration behavior are upgrade-sensitive and need explicit maintainer approval if repair moves outside doctor.

Maintainer options:

  1. Prefer the doctor-only alternative (recommended)
    Close or supersede this branch in favor of fix(auth): point Keychain-only legacy Codex OAuth users at doctor instead of auto-prompting #86220 if maintainers agree that legacy repair should remain in doctor with a one-shot headless warning.
  2. Rewrite this branch to doctor-only
    Remove the runCli hook, decline marker, env opt-out, and auto-heal docs, keeping only the headless warning, wording cleanup, and focused tests.
  3. Explicitly accept startup migration
    Maintainers could intentionally approve ordinary macOS CLI invocations prompting and migrating credentials, but should require clear upgrade proof and docs for that contract change.

Next step before merge
Manual maintainer decision is needed because the branch deliberately changes startup/auth migration UX, has a protected maintainer label, and now has an open doctor-only alternative PR.

Security
Cleared: No concrete supply-chain issue was found; the diff adds no dependencies and does not log token material, while the secrets-handling UX remains covered by the functional merge risk.

Review findings

  • [P1] Keep legacy OAuth repair out of CLI startup — src/cli/run-main.ts:598-605
Review details

Best possible solution:

Prefer the doctor-owned path: keep migration in openclaw doctor --fix, add the one-shot headless warning and wording cleanup, and close or supersede this startup-hook branch if maintainers agree with the alternative direction.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows that on macOS with an interactive TTY and a migratable legacy sidecar, an ordinary command such as openclaw status reaches the new runCli hook before normal routing and calls the doctor migration with no in-CLI confirmation.

Is this the best way to solve the issue?

No. The branch solves a real auth migration gap, but moving legacy repair into startup conflicts with current repo policy and the author's later manual-testing conclusion; the doctor-only warning path is the safer maintainable direction unless maintainers explicitly approve the UX change.

Full review comments:

  • [P1] Keep legacy OAuth repair out of CLI startup — src/cli/run-main.ts:598-605
    The new startup hook runs before ordinary command routing and then calls the doctor migration with confirmAutoFix: async () => true, so an interactive macOS command can pop Keychain and rewrite auth profiles before the requested command runs. Root policy and current docs keep this legacy repair in openclaw doctor --fix; remove the startup migration path or get an explicit maintainer decision for that upgrade contract change.
    Confidence: 0.94

Overall correctness: patch is incorrect
Overall confidence: 0.91

Codex review notes: model gpt-5.5, reasoning high; reviewed against 3c8d101f5a85.

Label changes

Label justifications:

  • P2: The PR targets a real but narrow legacy Codex OAuth migration gap for macOS Keychain-only profiles after upgrade.
  • merge-risk: 🚨 compatibility: The PR changes startup behavior for existing macOS CLI users by allowing an ordinary interactive invocation to prompt and migrate auth profiles.
  • merge-risk: 🚨 automation: A pre-command Keychain dialog can block TTY-attached scripts or maintenance commands before the requested command runs.
  • merge-risk: 🚨 auth-provider: The changed path reads Keychain-backed Codex OAuth material and rewrites persisted auth-profile credentials.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🌊 off-meta tidepool and patch quality is 🧂 unranked krab.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Not applicable: The contributor proof gate is not applicable because this is a maintainer-authored PR with a protected maintainer label; the blocker is the startup auth-migration contract, not missing contributor proof.
Evidence reviewed

PR surface:

Source +176, Tests +508, Docs +1. Total +685 across 10 files.

View PR surface stats
Area Files Added Removed Net
Source 4 179 3 +176
Tests 4 514 6 +508
Docs 2 2 1 +1
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 10 695 10 +685

What I checked:

Likely related people:

  • RomneyDa: Auth-sidecar follow-up history includes merged embedded loader and missing-refresh repairs that this PR builds on, and the current PR/alternative PR are both authored by this person. (role: recent auth repair contributor; confidence: high; commits: 4399eee6e0e4, 205c595b134c, 6630e275497b; files: src/agents/auth-profiles/store.ts, src/agents/auth-profiles/oauth-manager.ts, src/cli/auto-migrate-legacy-oauth-sidecar.ts)
  • joshavant: Merged history for Keep legacy Codex OAuth sidecar profiles usable #83312 introduced the legacy Codex OAuth sidecar compatibility and migration surface used by this PR. (role: legacy sidecar compatibility introducer; confidence: high; commits: 06f4c97130c6; files: src/agents/auth-profiles/legacy-oauth-sidecar.ts, src/agents/auth-profiles/oauth-manager.ts)
  • Totalsolutionsync: Related PR discussion credits this contributor's live gateway OAuth-regression work as the source of the embedded sidecar loader repair that narrowed the remaining Keychain-only bucket. (role: related regression reporter and patch source; confidence: medium; commits: 4399eee6e0e4; files: src/agents/auth-profiles/store.ts)
  • Vincent Koc: Current-main shallow blame/log history for the central auth-sidecar files points to a recent split-agent shard repair commit, so this is a weak recent-touch routing signal rather than feature ownership. (role: recent area contributor; confidence: low; commits: 125d82cab295; files: src/agents/auth-profiles/legacy-oauth-sidecar.ts, src/commands/doctor-auth-oauth-sidecar.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. labels May 22, 2026
@clawsweeper

clawsweeper Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🔥 Warming up: real-behavior proof passed; findings, security review, or rank-up moves are still in progress.

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.
What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@RomneyDa RomneyDa force-pushed the codex-oauth-keychain-migration branch from 633f8c1 to 02aeeec Compare May 22, 2026 01:54
@RomneyDa RomneyDa self-assigned this May 22, 2026
@RomneyDa

Copy link
Copy Markdown
Member Author

Addressed the P2 finding.

Root cause: hasSidecarFiles() only checked the sidecar directory for any .json file. Doctor intentionally retains unreferenced legacy sidecars (external agent dirs may still reference them), so accepting the startup prompt for an unreferenced-only state did zero work and the next interactive command would prompt again.

Fix: Gate the startup hook on the same active oauthRef-backed store detection doctor already runs. New hasMigratableLegacyOAuthSidecarStores({ cfg, env }) returns true iff some auth-profiles store has a profile with oauthRef. The auto-migrate hook calls it right after loading config; if false (no migratable profile — only unreferenced sidecars or nothing), it returns without prompting. Doctor's retention policy is preserved; openclaw doctor --fix stays the explicit path for cleaning up unreferenced sidecars.

Regression coverage: new test does not prompt when only unreferenced sidecar files exist (no migratable oauthRef profile) writes an encrypted sidecar JSON with no matching auth-profile, runs the auto-migrate twice, asserts the prompter is never invoked, the sidecar file remains on disk, and no decline marker is written.

Verification

$ node scripts/run-vitest.mjs \
    src/cli/auto-migrate-legacy-oauth-sidecar.test.ts \
    src/agents/auth-profiles/legacy-oauth-sidecar.test.ts \
    src/commands/doctor-auth-oauth-sidecar.test.ts \
    src/commands/doctor/repair-sequencing.test.ts \
    src/agents/auth-profiles/persisted-boundary.test.ts \
    src/commands/agents.add.test.ts \
    src/commands/doctor/shared/stale-oauth-profile-shadows.test.ts
Test Files  9 passed (9)
     Tests  55 passed (55)

$ node scripts/run-oxlint.mjs src/cli/auto-migrate-legacy-oauth-sidecar.ts src/cli/auto-migrate-legacy-oauth-sidecar.test.ts src/commands/doctor-auth-oauth-sidecar.ts
$ npx oxfmt --check src/cli/auto-migrate-legacy-oauth-sidecar.ts src/cli/auto-migrate-legacy-oauth-sidecar.test.ts src/commands/doctor-auth-oauth-sidecar.ts
All matched files use the correct format.

Head: d592e96

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 22, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 22, 2026
RomneyDa added 3 commits May 22, 2026 12:08
…first interactive CLI run

Closes the remaining bucket from #85083: macOS users whose legacy `oauthRef`
sidecar seed lives only in the Keychain still had to know to run
`openclaw doctor --fix` after upgrade, because the embedded-runtime self-heal
added in #85074 intentionally runs with `allowKeychainPrompt: false`.

This change adds a tiny pre-command trigger in `runCli` that, on darwin + TTY
only, reuses the existing `maybeRepairLegacyOAuthSidecarProfiles` doctor flow
when a sidecar dir is present. Fast bail-outs make the cost a single
`readdirSync` for users without legacy sidecars. Declining writes a 24h
cool-off marker in the state dir so we never pester the user every command.

Headless paths (cron, systemd, Telegram polling, embedded sub-agent dispatch)
still skip the prompt, but the legacy sidecar loader now emits a one-shot
warn that names `openclaw doctor --fix` and macOS Keychain, instead of
letting the credential silently fall through to a downstream
"No API key found".

Also drops the user-confusing `sidecar-backed` phrasing from the doctor
prompt/notes/changes; existing internal identifier names are left intact.
…d sidecars exist

Doctor intentionally retains unreferenced legacy Codex OAuth sidecar files
because external agent directories outside the scan may still reference them.
Before this change the startup auto-migration hook treated any sidecar JSON as
trigger material, so accepting the prompt for an unreferenced-only state did
zero work and re-prompted on every later interactive command.

Gate the startup hook on the same active oauthRef-backed store detection
doctor already runs, so:

- Unreferenced-only sidecars no longer trigger a no-op prompt loop.
- Doctor's existing retention policy (and explicit `openclaw doctor --fix`
  flow) stays the canonical place to surface unreferenced sidecar files.
- A new regression test exercises the unreferenced-only case end-to-end.
…wConsole.warn cast

CI/Linux failures on PR #85163:

- `checks-node-agentic-cli`: 2 happy-path auto-migrate tests inherited `CI=true`
  from the GitHub Actions process env via `createSpawnEnv`, which short-circuited
  the macOS-interactive skip rules and left `confirm` uncalled. Clear `CI` and
  related non-interactive signals when building the test state env.
- `check-test-types`: `warnSpy: ReturnType<typeof vi.fn>` widens to
  `Mock<Constructable | Procedure>`, which is not assignable to
  `rawConsole.warn: typeof console.warn`. Cast at the assignment site.
@RomneyDa RomneyDa force-pushed the codex-oauth-keychain-migration branch from d592e96 to 781edbd Compare May 22, 2026 19:15
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 22, 2026
@clawsweeper clawsweeper Bot added status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 23, 2026
The 24h cool-off was too short — users who say "not now" are
deliberately opting out, and re-prompting them every day was nag-y.
Switch the decline marker to permanent: if the marker file exists, skip
the prompt forever. `openclaw doctor --fix` remains the explicit
retry path for anyone who changes their mind, and
`OPENCLAW_AUTO_MIGRATE_LEGACY_OAUTH_SIDECAR=0` is still the env-level
opt-out.

Drops `COOL_OFF_TTL_MS`, the `now` plumbing through `shouldSkip`,
and the now-dead `changes-driven` marker cleanup (after the
migratable-store gate, an accepted run that produced no changes is
unreachable). Renames the constant to `DECLINE_MARKER_FILENAME` and
updates docs + changelog wording.
@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 24, 2026
RomneyDa added 2 commits May 24, 2026 13:53
The hook must stay above the bare-root Crestodian and gateway fast
paths because those branches return early and the embedded runtimes
they start cannot prompt for the macOS Keychain on their own. The
ordering invariant was easy to miss when restructuring run-main and
caused a P2 finding during PR review; pin it with an inline note so
future refactors don't regress the gateway/TUI launch paths.
…input

`--force` (`openclaw agent prune --force`, `plugins uninstall --force`,
`migrate apply --force`) and `--no-input` (`openclaw models scan
--no-input`) are documented prompt-suppression flags, but the startup
hook was still prompting on those invocations. Add both to the skip
path.

Consolidate the per-flag `hasFlag` calls into a single
`NO_PROMPT_FLAGS` allowlist so the next flag is a one-line addition
and the set is grep-able for code review. `hasJsonOutputFlag` stays
separate because its parsing rules (`--json=...` value form) differ
from `hasFlag`.

Regression coverage:
- `plugins uninstall pkg --force`
- `agent prune old --force`
- `models scan --no-input`
- `status -- --force` and `status -- --no-input` (positive cases:
  terminator preserved, prompt still fires)
@RomneyDa

Copy link
Copy Markdown
Member Author

Addressed the P1.

Root cause: two more documented no-prompt flags weren't in the skip set —

  • --force: openclaw agent prune --force ("Skip confirmation"), plugins uninstall --force, migrate apply --force
  • --no-input: openclaw models scan --no-input ("Disable prompts (use defaults)")

Fix (007aa38370): consolidated the per-flag hasFlag calls into a single NO_PROMPT_FLAGS = ["--non-interactive", "--yes", "--no-input", "--force"] set and added both new flags. Adding the next flag is now a one-line change. hasJsonOutputFlag stays separate because its parsing rules (--json=... value form) differ from hasFlag.

Regression coverage:

  • plugins uninstall pkg --force
  • agent prune old --force
  • models scan --no-input
  • status -- --force and status -- --no-input (positive cases: terminator preserved, prompt still fires)

Verification

$ CI=true node scripts/run-vitest.mjs \
    src/cli/auto-migrate-legacy-oauth-sidecar.test.ts \
    src/agents/auth-profiles/legacy-oauth-sidecar.test.ts \
    src/commands/doctor-auth-oauth-sidecar.test.ts
Test Files  4 passed (4)
     Tests  42 passed (42)

$ node scripts/run-oxlint.mjs src/cli/auto-migrate-legacy-oauth-sidecar.ts src/cli/auto-migrate-legacy-oauth-sidecar.test.ts
$ npx oxfmt --check src/cli/auto-migrate-legacy-oauth-sidecar.ts src/cli/auto-migrate-legacy-oauth-sidecar.test.ts
All matched files use the correct format.

Head: 007aa38370

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@RomneyDa

Copy link
Copy Markdown
Member Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

RomneyDa added 2 commits May 24, 2026 14:12
…edentials migrate

Previously the decline marker was only written on explicit decline / Ctrl+C.
If the user accepted the clack confirm but the migration produced no changes
(macOS Keychain "Deny", wrong env seed, ciphertext corruption, etc.) we
left no marker, and the next interactive command would re-show the confirm
and re-trigger the macOS Keychain prompt — the only spam loop in the hook.

Treat any "prompted, no changes happened" outcome as a back-off signal and
write the marker. The user is not stuck — `openclaw doctor --fix` is the
explicit retry path and `OPENCLAW_AUTO_MIGRATE_LEGACY_OAUTH_SIDECAR=0`
silences the hook entirely. Doctor itself still emits the
"Could not decrypt legacy OAuth sidecar … re-authenticate this profile"
warning during the failed attempt, so the user knows what happened.

Drops the now-unused `declined` flag.
Keep only comments whose absence would invite a regression:
- `run-main.ts` ordering constraint (hook must precede the gateway /
  bare-root fast-path returns).
- `auto-migrate` zero-changes-writes-marker rationale (the obvious
  "only on decline" simplification reintroduces the Keychain prompt
  spam loop).
- Test ciphertext-corruption note (so a future refactor doesn't assume
  the corruption is a bug and "fix" the test).

Removes self-evident narration on the `NO_PROMPT_FLAGS` list,
test-only assertion explanations, and intermediate setup notes.
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. labels May 24, 2026
@RomneyDa

Copy link
Copy Markdown
Member Author

After some manual testing I feel this is not the right approach, it's an obtrusive UX and feels like an antipattern. Looking into an option where I just make the log more informative.

@RomneyDa RomneyDa reopened this May 24, 2026
@RomneyDa

Copy link
Copy Markdown
Member Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@RomneyDa

Copy link
Copy Markdown
Member Author

Re. ClawSweeper P1 on src/cli/run-main.ts:598-605 ("Keep legacy OAuth repair out of CLI startup"):

The AGENTS.md rule (Legacy config repair belongs in openclaw doctor --fix, not startup/load-time core migrations) reads to me as scoped to legacy config repair — the tlon / agent-dir / state migrations that prompted it. OAuth credential self-healing has shipped extensively outside doctor with no exception flagged, and is the established pattern for this surface:

  • Silent OAuth refresh-on-expiry + write-back during ordinary command execution — src/agents/auth-profiles/oauth-manager.ts:540-559
  • Sub-agent silently adopts newer main-agent credentials at runtime — src/agents/auth-profiles/oauth-manager.ts:315-353
  • Externally-managed credentials silently overwrite persisted ones — src/agents/auth-profiles/oauth-manager.ts:517-522
  • Post-refresh credentials mirrored back into the main store from a sub-agent dir — src/agents/auth-profiles/oauth-manager.ts:563-567
  • Legacy oauthRef sidecar already inline-merged at runtime on every credential read — src/agents/auth-profiles/persisted.ts:284-307. This PR is a natural extension of that exact pattern: same env → file → Keychain seed cascade, just also persists the inline result so we stop re-decrypting on every read.

OAuth sidecars aren't really "config" in the AGENTS.md sense — they're credentials, and the OAuth credential layer already has well-established runtime self-healing seams. The doctor-only alternative (#86220) is on the table if you'd rather take the conservative UX path, but the contract-violation framing here doesn't match what OAuth-credential code in this repo actually does today.

@RomneyDa

Copy link
Copy Markdown
Member Author

Once again closing this because it feels pretty intrusive. Will prefer #86220 for now

@RomneyDa RomneyDa closed this May 25, 2026
@RomneyDa

Copy link
Copy Markdown
Member Author

Closing in favor of #86220 (doctor-only with a one-shot headless warn). Leaving this comment as a retrospective on the concerns that drove the decision, in case the design comes up again later.

What this PR tried to do

Close the macOS Keychain-only bucket from #85083 by reusing the existing maybeRepairLegacyOAuthSidecarProfiles doctor flow as a pre-command hook in runCli. Went through three shapes:

  1. Auto-prompt — clack confirm before invoking the doctor migration. Required a growing skip ladder for every documented no-prompt CLI surface (--non-interactive/--yes/--no-input/--force/--json, then channels remove --delete).
  2. Silent auto-heal — dropped the clack confirm; macOS Keychain dialog as the only consent surface. Smaller skip ladder, but removed the user's chance to opt out before the OS dialog.
  3. Closed.

Why neither shape shipped

Architecture tension

The AGENTS.md rule Legacy config repair belongs in openclaw doctor --fix, not startup/load-time core migrations reads narrowly (legacy config repair, the tlon / state cases). OAuth runtime self-healing is already widespread outside doctor:

  • Silent OAuth refresh-on-expiry + write-back during ordinary command execution — oauth-manager.ts:540-559
  • Sub-agent silently adopts newer main-agent credentials — oauth-manager.ts:315-353
  • Externally-managed credentials silently overwrite persisted ones — oauth-manager.ts:517-522
  • Refresh mirrored back into the main store from a sub-agent dir — oauth-manager.ts:563-567
  • Legacy oauthRef sidecar already inline-merged on every credential read — persisted.ts:284-307

So the policy-violation framing was contestable. But ClawSweeper rated this 🦐 → 🧂 across the two shapes precisely because the pattern here (pop a Keychain dialog before an arbitrary command, mutate auth-profile credentials before the requested command runs) is louder than any of the existing precedents. Even with the AGENTS.md ambiguity resolved, the UX shape was the real blocker.

Concrete release risks

What killed it for me on a release timeline:

Tier 1 — could spam users

  • Cross-version Keychain re-prompt. macOS Keychain ACLs are keyed by the resolved caller chain at write time. security find-generic-password invoked from node doesn't guarantee a stable ACL identity across openclaw upgrades — different Homebrew/npm install paths or signing identity changes can invalidate the ACL → dialog re-pops on the first interactive command after every upgrade. Not in-version spam (marker would block that), but every release upgrade re-prompting feels like spam from the user's POV. Can't engineer around — it's an OS contract.
  • Marker write failure → unbounded dialog spam. fs.mkdirSync + fs.writeFileSync for the decline marker are inside the outer try {} catch {} that swallows everything. If a user's state dir has any write quirk (read-only mount, restrictive ACLs, sandboxed home, FileVault transient), the marker silently fails → next run sees no marker → dialog again. Forever, every command. Low probability, catastrophic when it lands.

Tier 2 — surprise, not spam

  • Dialog during destructive primaries. SKIPPED_PRIMARIES covered doctor/update/help/completion/version. uninstall, reset, agents remove, secrets, auth, crestodian, setup, onboard all would have fired the dialog before doing their thing. Worst case: user runs openclaw uninstall, gets a Keychain credential migration dialog, then their uninstall. UX violation that's hard to defend.
  • Dynamic import failure breaks runCli on darwin. The import("./auto-migrate-legacy-oauth-sidecar.js") in run-main.ts:597-605 was not wrapped in try/catch at the call site (only the function body had its own). A corrupt build or transitive module-load error would make every interactive darwin invocation throw. One bad release = every macOS user's CLI hangs on every command.
  • Dialog on openclaw status. status is canonically the read-only check command. Popping any prompt there breaks an implicit contract users rely on for "just look without changing anything."

Tier 3 — design choice that compounds the above

  • The auto-prompt shape gated all of this on a clack (Y/n) the user could decline — bad UX, but recoverable. The silent auto-heal removed that opt-out, so the OS Keychain dialog became the first signal the user had that anything was happening. That made every Tier 1/2 risk land harder because there's no "I changed my mind" surface before the OS dialog.

What pushed me over

The author follow-up captures it: "manual testing made this approach feel obtrusive and like an antipattern." ClawSweeper called the same thing. Both shapes traded a real architectural cost (startup-time auth migration with a wide blast surface) for a marginal UX improvement (users don't have to know openclaw doctor --fix exists). The trade isn't worth it when the alternative (#86220) is two-thirds smaller, ships the actionable headless warn that solves the discoverability problem anyway, and doesn't violate any rules the maintainer would have to weigh.

What's replacing it

#86220 — doctor-only path: keeps the one-shot headless log.warn from this branch (legacy-oauth-sidecar.ts) that names openclaw doctor --fix and macOS Keychain, plus the user-facing wording cleanup in the doctor prompt/notes/changes. +183/-10 across 7 files vs. this branch's +685/-10 across 10. The skip-rule treadmill goes away because there's nothing to skip. Affected users see one actionable warning telling them what to run; doctor remains the only place that prompts. No new failure modes, no startup-time auth mutation, no AGENTS.md tension.

If a future maintainer wants the auto-heal UX back, the receipts are above: the precedent argument is real, the policy ambiguity is real, but the release-risk profile (cross-version Keychain re-prompt + marker-write spam loop + destructive-primary surprise dialogs + unguarded dynamic import) needs to be addressed before it's safe to ship. None of those are blockers individually; they're load-bearing collectively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling cli CLI command changes commands Command implementations docs Improvements or additions to documentation gateway Gateway runtime maintainer Maintainer-authored PR merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 automation 🚨 May affect CI, automerge, proof capture, label sync, or maintainer automation. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: L status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Legacy Codex OAuth sidecars stored only in macOS Keychain still require doctor for embedded runtime path

2 participants