fix(doctor): preserve runtime pins during codex route repair [AI-assisted] by nxmxbbd · Pull Request #84862 · openclaw/openclaw

nxmxbbd · 2026-05-21T06:58:40Z

Summary

Problem: openclaw doctor --fix can rewrite openai-codex/* routes to canonical openai/* while replacing an explicit non-default agentRuntime pin, such as pi, with codex.
Solution: carry the pre-repair runtime pin through the route migration and write a narrow canonical model-scoped runtime policy when needed.
What changed: preserve explicit runtime pins from legacy agent model maps, provider-level runtime policy, and provider catalog model policy; add regression coverage for those cases and for listed-agent shielding.
What did NOT change (scope boundary): this does not investigate or claim to fix the separate gpt-5.5 token-usage behavior discussed in #84038 ("[Bug]: doctor --fix silently migrates intentional openai-codex/ config to openai/, breaking PI+OAuth runtime and causing 3-4x token inflation"), and it does not edit CHANGELOG.md.
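
A minimal sketch of the idea in the Solution line, not the actual patch. The `RouteEntry` shape and the `repairRoute` helper are hypothetical names for illustration, and the Codex fallback is an assumption based on the Problem line.

```typescript
// Hypothetical shape for illustration; not OpenClaw's real config schema.
interface RouteEntry {
  modelRef: string; // e.g. "openai-codex/gpt-5.4"
  agentRuntime?: { id: string }; // explicit runtime pin, if any
}

const LEGACY_PREFIX = "openai-codex/";
const CANONICAL_PREFIX = "openai/";

// Rewrite a legacy route to its canonical ref while carrying an explicit
// pre-repair pin forward instead of replacing it with "codex".
function repairRoute(entry: RouteEntry): RouteEntry {
  if (!entry.modelRef.startsWith(LEGACY_PREFIX)) {
    return entry; // not a legacy route; nothing to repair
  }
  const modelRef = CANONICAL_PREFIX + entry.modelRef.slice(LEGACY_PREFIX.length);
  const pinned = entry.agentRuntime?.id;
  // Assumption: with no explicit pin, the repaired route keeps Codex routing.
  return { modelRef, agentRuntime: { id: pinned ?? "codex" } };
}
```

Under these assumptions, a route pinned to `pi` comes out as `openai/gpt-5.4` with `pi` still in place, and an unpinned legacy route falls back to `codex`.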

Motivation

#84362 is the ClawSweeper replacement for #84142, but that bot-owned branch is currently blocked by a CHANGELOG.md conflict. This PR provides the same fix path on current upstream/main, deliberately avoiding the contributor-forbidden changelog edit, and includes the tested follow-up runtime-pin cases.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related [Bug]: doctor --fix silently migrates intentional openai-codex/ config to openai/, breaking PI+OAuth runtime and causing 3-4x token inflation #84038
Related fix(doctor): preserve explicit agentRuntime pin during codex model migration [AI-assisted] #84142
Related fix(doctor): preserve explicit agentRuntime pin during codex model migration [AI-assisted] #84362
This PR fixes a bug or regression

Real behavior proof (required for external PRs)

Behavior or issue addressed: openclaw doctor --fix route repair preserves explicit non-default runtime pins when migrating openai-codex/gpt-5.4 to openai/gpt-5.4.
Real environment tested: local OpenClaw source checkout on Linux, Node v22.22.1, current upstream/main branch with this patch applied.
Exact steps or command run after this patch:

OPENCLAW_REPO=/tmp/openclaw-84038-clean-20260521 pnpm exec tsx /tmp/repro-openclaw-84038-runtime-pin-matrix.mjs

Evidence after fix:

Terminal output from the runtime resolver / doctor repair matrix:

scenario: model-map PI pin survives legacy openai-codex model map rewrite
beforeRuntime: pi
defaultPolicy: { id: pi }
afterRuntime: pi
pass: true

scenario: provider-level PI pin survives route repair
beforeRuntime: pi
defaultPolicy: { id: pi }
afterRuntime: pi
pass: true

scenario: provider-catalog model PI pin survives route repair
beforeRuntime: pi
defaultPolicy: { id: pi }
afterRuntime: pi
pass: true

scenario: defaults PI repair does not leak into listed canonical Codex agent
beforeRuntime: { defaults: pi, regular: codex }
defaultPolicy: { id: pi }
listedPolicy: { id: codex }
afterRuntime: { defaults: pi, regular: codex }
pass: true

GREEN: all runtime pin repair scenarios preserved expected runtimes

Observed result after fix: legacy openai-codex/* routes are rewritten to canonical openai/*, while explicit pi runtime pins remain effective for the repaired route. A listed agent that already explicitly uses canonical OpenAI with Codex routing keeps codex instead of inheriting the repaired defaults-level pi policy.
What was not tested: full CI, a live PI OAuth login, and the separate gpt-5.5 token-usage behavior discussed in #84038 ("[Bug]: doctor --fix silently migrates intentional openai-codex/ config to openai/, breaking PI+OAuth runtime and causing 3-4x token inflation").
Before evidence (optional but encouraged): the same matrix on current upstream/main repaired all four scenarios to agentRuntime.id = "codex" and ended with RED: one or more runtime pins were not preserved across doctor route repair.
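
The repro script itself is not part of this PR body. As a rough, hypothetical sketch of how a matrix like the one above could derive its `pass` lines and the final GREEN/RED verdict (the `Scenario` type and `report` function are illustrative names, not the script's real code):

```typescript
// Hypothetical scenario record; the real script's data model is not shown here.
interface Scenario {
  name: string;
  beforeRuntime: string; // effective runtime before doctor --fix
  afterRuntime: string; // effective runtime after doctor --fix
}

// A scenario passes when the effective runtime is unchanged by the repair.
function report(scenarios: Scenario[]): string[] {
  const lines: string[] = [];
  let allPass = true;
  for (const s of scenarios) {
    const pass = s.beforeRuntime === s.afterRuntime;
    allPass = allPass && pass;
    lines.push(
      `scenario: ${s.name}`,
      `beforeRuntime: ${s.beforeRuntime}`,
      `afterRuntime: ${s.afterRuntime}`,
      `pass: ${pass}`,
    );
  }
  lines.push(
    allPass
      ? "GREEN: all runtime pin repair scenarios preserved expected runtimes"
      : "RED: one or more runtime pins were not preserved across doctor route repair",
  );
  return lines;
}
```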

Root Cause (if applicable)

Root cause: route repair synthesized a canonical openai/* Codex runtime policy before considering explicit pre-repair non-default runtime pins from legacy openai-codex/* model-map entries, provider-level policy, or provider catalog model policy.
Missing detection / guardrail: existing tests covered the default Codex repair path, but not the explicit non-Codex runtime-pin carriers or the listed-agent shield when defaults repair introduces a non-Codex runtime policy.
Contributing context (if known): older route-repair behavior needed to preserve Codex auth routing after canonicalization, but that default should not override an explicit user runtime choice.
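
To illustrate the ordering problem described above, here is a hedged sketch that consults the three pin carriers before any Codex policy is synthesized. The `PinCarriers` shape and both helpers are hypothetical, and the precedence shown (model map, then catalog model, then provider level) is an assumption; the PR only states that model-map precedence is tested.

```typescript
type RuntimePolicy = { id: string };

// Hypothetical view of the three places an explicit pin can live pre-repair.
interface PinCarriers {
  modelMapEntry?: RuntimePolicy; // legacy openai-codex/* model-map entry
  catalogModel?: RuntimePolicy; // provider catalog model policy
  providerLevel?: RuntimePolicy; // provider-level runtime policy
}

// Assumed precedence: the most specific carrier wins.
function findPreRepairPin(c: PinCarriers): RuntimePolicy | undefined {
  return c.modelMapEntry ?? c.catalogModel ?? c.providerLevel;
}

// Consult the carriers first; synthesize Codex only when no carrier is set.
// The pre-fix behavior corresponds to returning { id: "codex" } unconditionally.
function canonicalPolicy(c: PinCarriers): RuntimePolicy {
  return findPreRepairPin(c) ?? { id: "codex" };
}
```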

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/commands/doctor/shared/codex-route-warnings.test.ts
Scenario the test should lock in: model-map, provider-level, and provider-catalog runtime pins survive route repair; defaults-level preservation does not leak into listed canonical Codex agents.
Why this is the smallest reliable guardrail: maybeRepairCodexRoutes plus the runtime resolver exercises the doctor repair output and the effective runtime selection without needing a live provider credential.
Existing test that already covers this (if any): existing Codex route-repair tests covered canonical Codex behavior, not these explicit runtime-pin preservation cases.
If no new test is added, why not: N/A

User-visible / Behavior Changes

openclaw doctor --fix still canonicalizes legacy openai-codex/* route refs, but now preserves explicit non-default runtime pins instead of silently moving those repaired routes onto Codex runtime.

Diagram (if applicable)

Before:
openai-codex/gpt-5.4 + explicit pi runtime -> doctor --fix -> openai/gpt-5.4 + codex runtime

After:
openai-codex/gpt-5.4 + explicit pi runtime -> doctor --fix -> openai/gpt-5.4 + pi runtime

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: Linux
Runtime/container: Node v22.22.1 local source checkout
Model/provider: openai-codex/gpt-5.4 repaired to openai/gpt-5.4
Integration/channel (if any): N/A
Relevant config (redacted): synthetic doctor config with explicit agentRuntime.id = "pi" in model-map, provider-level, and provider-catalog shapes

Steps

Run the runtime-pin matrix against current upstream/main and observe RED.
Apply this patch.
Rerun the runtime-pin matrix and targeted doctor tests.

Expected

Repaired canonical openai/gpt-5.4 routes preserve explicit pi runtime pins.
Existing Codex route repair behavior remains green.

Actual

Matrix proof is GREEN after this patch.
Targeted tests pass.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Verification commands run locally:

OPENCLAW_REPO=/tmp/openclaw-84038-clean-20260521 pnpm exec tsx /tmp/repro-openclaw-84038-runtime-pin-matrix.mjs
node scripts/run-vitest.mjs src/commands/doctor/shared/codex-route-warnings.test.ts
node scripts/run-vitest.mjs src/commands/doctor/repair-sequencing.test.ts
node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core.tsbuildinfo
pnpm exec oxfmt --check --threads=1 src/commands/doctor/shared/codex-route-warnings.ts src/commands/doctor/shared/codex-route-warnings.test.ts
git diff --check upstream/main..HEAD

Results:

runtime-pin matrix: GREEN
codex-route-warnings.test.ts: 103 passed
repair-sequencing.test.ts: 10 passed
tsgo core: exit 0
oxfmt touched files: All matched files use the correct format.
git diff --check: passed

Human Verification (required)

Verified scenarios: model-map runtime pin, provider-level runtime pin, provider-catalog model runtime pin, listed-agent shielding, existing Codex repair tests, repair sequencing tests, typecheck, formatting, and whitespace checks.
Edge cases checked: auto / default runtime policy is not treated as an explicit pin; listed canonical agents keep Codex when defaults repair preserves PI; contributor PR avoids CHANGELOG.md.
What you did not verify: full CI, live OAuth login, runtime token consumption, and maintainer-side ClawSweeper automerge.
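
The "auto / default is not an explicit pin" edge case could be expressed as a small predicate. This is a sketch only: the literal ids `"auto"` and `"default"` and the `explicitPin` helper are assumptions for illustration, and the real patch may detect these cases differently.

```typescript
// Assumption: "auto" and "default" are the placeholder ids behind the
// "auto / default runtime policy" edge case; the real values may differ.
const NON_PIN_IDS = new Set(["auto", "default"]);

// Return the runtime id only when it represents an explicit user choice.
function explicitPin(policy?: { id?: string }): string | undefined {
  const id = policy?.id?.trim();
  if (!id || NON_PIN_IDS.has(id)) {
    return undefined; // missing, empty, or placeholder: not a pin
  }
  return id;
}
```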

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

No bot review conversations exist on this new PR at creation time.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: preserving a non-Codex runtime pin during defaults repair could accidentally affect listed agents that explicitly use canonical OpenAI/Codex routing.
- Mitigation: the patch shields explicitly listed canonical agents and adds a regression test/proof where defaults remain pi while the listed agent remains codex.
Risk: provider-level and provider-catalog policy lookup could be broader than intended.
- Mitigation: tests cover model-map precedence, provider-level pins, provider-catalog model pins, and existing repair sequencing remains green.
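
The listed-agent shield from the first risk can be sketched as follows. The `Config` shape, the resolver, and `preserveDefaultsPin` are hypothetical stand-ins, assuming a listed agent's own policy overrides defaults; the agent name `regular` is taken from the matrix output in the proof section.

```typescript
// Hypothetical config shapes; not OpenClaw's real schema.
interface AgentConfig {
  model?: string;
  agentRuntime?: { id: string };
}
interface Config {
  defaults: AgentConfig;
  list: Record<string, AgentConfig>;
}

// Assumed resolver: a listed agent's own policy wins, else defaults apply.
function effectiveRuntime(cfg: Config, agent?: string): string | undefined {
  const listed = agent ? cfg.list[agent] : undefined;
  return listed?.agentRuntime?.id ?? cfg.defaults.agentRuntime?.id;
}

// Before pinning defaults to a non-Codex runtime, give every listed agent on
// a canonical openai/* route without its own policy an explicit codex policy,
// so the new defaults-level pin cannot leak into it.
function preserveDefaultsPin(cfg: Config, pin: string): Config {
  const list: Record<string, AgentConfig> = {};
  for (const [name, agent] of Object.entries(cfg.list)) {
    const usesCanonical = agent.model?.startsWith("openai/") ?? false;
    list[name] =
      usesCanonical && !agent.agentRuntime
        ? { ...agent, agentRuntime: { id: "codex" } }
        : agent;
  }
  return { defaults: { ...cfg.defaults, agentRuntime: { id: pin } }, list };
}
```

With defaults repaired to `pi`, `effectiveRuntime(cfg)` resolves to `pi` while `effectiveRuntime(cfg, "regular")` stays on `codex`, matching the fourth matrix scenario.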

AI assistance disclosure

This PR is AI-assisted. I reviewed the diff and ran the verification listed above.

clawsweeper · 2026-05-21T07:00:26Z

Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.

Keep open: the PR addresses the P1 doctor runtime-pin regression with solid terminal proof, but it is not merge-ready because the branch drops a current-main doctor warning/test for malformed Codex app-server command overrides and overlaps an existing ClawSweeper replacement branch.

Canonical path: Close this PR as superseded by #84362.

So I’m closing this here and keeping the remaining discussion on #84362.

Review details

Best possible solution:

Close this PR as superseded by #84362.

Do we have a high-confidence way to reproduce the issue?

Yes for the core bug: the PR body includes before/after terminal matrix output, and source inspection shows current-main route repair can synthesize codex runtime policy before legacy pins are carried forward. I did not rerun tests because this review was read-only.

Is this the best way to solve the issue?

No, not as submitted: the runtime-pin preservation approach is plausible and well covered, but the branch must restore the dropped app-server command warning/test before it is the narrowest maintainable fix.

Security review:

Security review cleared: No concrete security or supply-chain issue was found; the diff changes local doctor config repair logic and tests without new dependencies, workflows, permissions, network calls, or secret handling.

AGENTS.md: found and applied where relevant.

What I checked:

linked superseding PR: #84362 (fix(doctor): preserve explicit agentRuntime pin during codex model migration [AI-assisted]) is still open as the canonical replacement.
cluster evidence: the durable review links that PR in the work cluster or recommended risk path.
no human follow-up: live comments and timeline hydrated by apply contain no non-automation activity after the ClawSweeper review.

Likely related people:

vincentkoc: git blame on current main attributes the central doctor route repair region and the app-server warning diagnostic that this PR drops to commit 500c95b1ba8a7cae365fa3caa0700663a19aa619. (role: recent area contributor; confidence: high; commits: 500c95b1ba8a; files: src/commands/doctor/shared/codex-route-warnings.ts, src/commands/doctor/shared/codex-route-warnings.test.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 908b894432c4.

clawsweeper · 2026-05-21T07:06:26Z

ClawSweeper PR egg

✨ Hatched: 🥚 common Moonlit Proofling

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: stacks clean commits.
Image traits: location merge queue dock; accessory release bell; palette sunrise gold and clean white; mood sparkly; pose guarding a tiny green check; shell woven fiber shell; lighting soft studio lighting; background gentle dashboard dots.
Copy: My PR egg hatched a 🥚 common Moonlit Proofling in ClawSweeper.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

nxmxbbd · 2026-05-24T19:50:48Z

@clawsweeper hatch

clawsweeper · 2026-05-24T19:50:51Z

🦞👀
ClawSweeper PR egg hatch requested.

I queued a comment sync for this PR. If the egg is hatchable, ClawSweeper will generate the image once and update the existing review comment.
Action: PR egg hatch queued (workflow sweep.yml, event repository_dispatch).
The ASCII egg stays as the fallback.

clawsweeper · 2026-05-24T19:56:28Z

ClawSweeper could not hatch this PR egg yet.

Reason: there is no current durable ClawSweeper review record for this PR, so there is no PR egg state record to update.
Ask for @clawsweeper re-review first, then retry @clawsweeper hatch after the ClawSweeper review comment appears.

nxmxbbd · 2026-05-25T05:09:54Z

@clawsweeper re-review

clawsweeper · 2026-05-25T05:09:57Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26384386398
Updated: 2026-05-25T05:22:09.022Z

clawsweeper · 2026-05-25T05:22:00Z

ClawSweeper applied the proposed close for this PR.

Action: closed this PR.
Close reason: duplicate or superseded.
Evidence: durable ClawSweeper review.

Commit b03071a: fix(doctor): preserve runtime pins during codex route repair

openclaw-barnacle Bot added labels on May 21, 2026: commands (Command implementations), size: L, proof: supplied (External PR includes structured after-fix real behavior proof.)

clawsweeper Bot closed this May 25, 2026

nxmxbbd mentioned this pull request May 25, 2026 in: fix(doctor): preserve explicit agentRuntime pin during codex model migration [AI-assisted] #84362 (Merged)

nxmxbbd wants to merge 1 commit into openclaw:main from nxmxbbd:nex/84038-doctor-runtime-pin-clean-20260521
