
fix(agents): normalize prefixed Anthropic fallback model ids (#88560) #88587

Merged
steipete merged 1 commit into openclaw:main from TurboTheTurtle:fix/fallback-modelid-normalization-88560
May 31, 2026

Conversation

@TurboTheTurtle TurboTheTurtle commented May 31, 2026

Contributor

Summary

  • Fixes prefixed Anthropic static catalog ids such as anthropic/claude-haiku-4-5 resolving as a literal nested model id.
  • Keeps fallback candidate normalization independent so a prefixed catalog key does not leak into later provider/model lookups.
  • Adds regression coverage at the shared model-catalog normalization layer, OpenClaw model-ref wrapper, embedded model lookup, and fallback candidate chain.
  • Intentionally out of scope: broader provider alias policy changes outside the native Anthropic provider prefix case.
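A minimal sketch of the prefix strip described in the first bullet. The function name `stripNativeAnthropicPrefix` is hypothetical and exists only for illustration; it is not the code in `packages/model-catalog-core`, and it assumes the static catalog stores unprefixed ids.

```typescript
// Hypothetical sketch, not the actual OpenClaw implementation.
// Removes a redundant "anthropic/" prefix so the id can match an
// unprefixed static catalog key.
function stripNativeAnthropicPrefix(provider: string, modelId: string): string {
  const prefix = "anthropic/";
  if (provider !== "anthropic") {
    return modelId; // other providers are out of scope for this sketch
  }
  return modelId.startsWith(prefix) ? modelId.slice(prefix.length) : modelId;
}

console.log(stripNativeAnthropicPrefix("anthropic", "anthropic/claude-haiku-4-5")); // claude-haiku-4-5
console.log(stripNativeAnthropicPrefix("anthropic", "claude-haiku-4-5")); // claude-haiku-4-5
console.log(stripNativeAnthropicPrefix("openai", "anthropic/claude-haiku-4-5")); // anthropic/claude-haiku-4-5
```

Only the native provider's own prefix is removed here, which mirrors the narrow scope stated in the last bullet.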

Linked context

Closes #88560

Related #88517, #77167, #88470

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: A configured fallback chain with prefixed Anthropic entries should resolve each candidate to its own provider/model pair, and Anthropic static catalog ids should not retain a nested anthropic/ prefix.

  • Real environment tested: Local OpenClaw source checkout at /private/tmp/openclaw-88560 on macOS, branch fix/fallback-modelid-normalization-88560, rebased onto f7a1d3f3f6, running the repo's real fallback runtime path, Vitest, and tsgo test runners against the changed code.

  • Exact steps or command run after this patch: timeout 180 node scripts/test-projects.mjs packages/model-catalog-core/src/provider-model-id-normalization.test.ts src/agents/model-ref-shared.test.ts src/agents/embedded-agent-runner/model.test.ts src/agents/model-fallback.test.ts; node --import tsx -e '<inline script invoking runWithModelFallback with prefixed Anthropic primary/fallbacks and redacted local FailoverError failures>'.

  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): After rebasing to f7a1d3f3f6, the focused test run passed 3 Vitest shards: unit-fast 1 passed / 12 tests, unit 1 passed / 3 tests, agents 2 passed / 177 tests. git diff --check exited 0. Type checks exited 0 for node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.test.src.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/test-src-88560-rebased-f7a1.tsbuildinfo and node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.test.packages.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/test-packages-88560-rebased-f7a1.tsbuildinfo.

    Redacted local runtime proof from the real runWithModelFallback path after rebasing to f7a1d3f3f6:

    [model-fallback/decision] model fallback decision: decision=candidate_failed requested=anthropic/anthropic/claude-sonnet-4-6 candidate=anthropic/claude-sonnet-4-6 reason=server_error next=anthropic/claude-haiku-4-5 detail=redacted local proof failure
    [model-fallback/decision] model fallback decision: decision=candidate_failed requested=anthropic/anthropic/claude-sonnet-4-6 candidate=anthropic/claude-haiku-4-5 reason=server_error next=openai/gpt-4o detail=redacted local proof failure
    [model-fallback/decision] model fallback decision: decision=candidate_failed requested=anthropic/anthropic/claude-sonnet-4-6 candidate=openai/gpt-4o reason=server_error next=xai/grok-4 detail=redacted local proof failure
    [model-fallback/decision] model fallback decision: decision=candidate_failed requested=anthropic/anthropic/claude-sonnet-4-6 candidate=xai/grok-4 reason=server_error next=none detail=redacted local proof failure
    runAttempts: anthropic/claude-sonnet-4-6, anthropic/claude-haiku-4-5, openai/gpt-4o, xai/grok-4
    containsLeakedAnthropicPrefix: false
    
  • Observed result after fix: normalizeStaticProviderModelId("anthropic", "anthropic/claude-haiku-4-5") returns claude-haiku-4-5, fallback candidates resolve to claude-sonnet-4-6, claude-haiku-4-5, and claude-opus-4-7, and the real fallback runtime receives unprefixed Anthropic model ids while keeping OpenAI/XAI candidates separate.

  • What was not tested: A live gateway failover run against production Anthropic/OpenAI/Google/XAI credentials was not run from this workstation.

  • Proof limitations or environment constraints: This uses a local runtime invocation with redacted synthetic provider failures plus repository tests; the issue's production logs and production provider credentials are not available in this local environment.

  • Before evidence (optional but encouraged): Issue [Bug]: v2026.5.28 — fallback iterator leaks one candidate's modelId into every subsequent provider lookup; produces doubled-prefix errors fleet-wide #88560 documents production failures where all fallback candidates were reported with leaked ids such as openai/anthropic/claude-haiku-4-5, xai/anthropic/claude-haiku-4-5, and anthropic/anthropic/claude-haiku-4-5.
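The per-candidate behavior described in the proof above can be sketched as follows. This is not the elided inline script and not `runWithModelFallback` itself: `parseRef`, `resolveCandidate`, and the input chain are assumptions chosen so that the result lines up with the `runAttempts` line in the redacted log.

```typescript
// Hypothetical sketch of independent candidate resolution; names are illustrative.
type ModelRef = { provider: string; model: string };

// Split "provider/model" at the first slash.
function parseRef(raw: string): ModelRef {
  const slash = raw.indexOf("/");
  if (slash < 0) throw new Error(`expected provider/model, got: ${raw}`);
  return { provider: raw.slice(0, slash), model: raw.slice(slash + 1) };
}

// Normalize one candidate without reading state from any earlier candidate.
function resolveCandidate(raw: string): ModelRef {
  const ref = parseRef(raw);
  const selfPrefix = `${ref.provider}/`;
  if (ref.provider === "anthropic" && ref.model.startsWith(selfPrefix)) {
    return { provider: ref.provider, model: ref.model.slice(selfPrefix.length) };
  }
  return ref;
}

// Assumed input chain (primary first, then fallbacks).
const chain = [
  "anthropic/anthropic/claude-sonnet-4-6",
  "anthropic/claude-haiku-4-5",
  "openai/gpt-4o",
  "xai/grok-4",
];

const runAttempts = chain.map(resolveCandidate).map((r) => `${r.provider}/${r.model}`);
const containsLeakedAnthropicPrefix = runAttempts.some((id) => id.includes("/anthropic/"));

console.log(runAttempts.join(", "));
console.log(containsLeakedAnthropicPrefix); // false
```

Because each candidate is parsed from its own raw string, a prefixed id on one entry cannot carry over into the provider/model pair of the next.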

Tests and validation

  • timeout 180 node scripts/test-projects.mjs packages/model-catalog-core/src/provider-model-id-normalization.test.ts src/agents/model-ref-shared.test.ts src/agents/embedded-agent-runner/model.test.ts src/agents/model-fallback.test.ts
  • timeout 240 node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.test.src.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/test-src-88560-rebased-f7a1.tsbuildinfo
  • timeout 240 node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.test.packages.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/test-packages-88560-rebased-f7a1.tsbuildinfo
  • node --import tsx -e '<inline script invoking runWithModelFallback with prefixed Anthropic primary/fallbacks and redacted local FailoverError failures>'
  • git diff --check

Regression coverage was added for Anthropic prefix stripping, configured provider catalog normalization, fallback candidate chain normalization, and embedded model lookup arguments.

Risk checklist

Did user-visible behavior change? (Yes/No)

Yes. Prefixed Anthropic model ids now resolve to the native Anthropic model id for static catalog lookup.

Did config, environment, or migration behavior change? (Yes/No)

No.

Did security, auth, secrets, network, or tool execution behavior change? (Yes/No)

No.

What is the highest-risk area?

Anthropic model-id normalization compatibility for existing configs.

How is that risk mitigated?

The change is scoped to the native anthropic provider prefix and preserves existing Anthropic aliases; tests cover both the shared package and OpenClaw wrapper paths.

Current review state

What is the next action?

Await ClawSweeper re-review and fresh CI after the rebase/proof update.

What is still waiting on author, maintainer, CI, or external proof?

CI, ClawSweeper, and maintainer review. No known author-side blocker after this proof update.

Which bot or reviewer comments were addressed?

Addressed ClawSweeper's request for redacted runtime fallback proof showing prefixed Anthropic entries resolving to their own provider/model ids, then refreshed that proof after rebasing to f7a1d3f3f6.

openclaw-barnacle Bot added labels on May 31, 2026: agents (Agent runtime and tooling), size: S, proof: supplied (External PR includes structured after-fix real behavior proof).

clawsweeper Bot commented May 31, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed May 31, 2026, 6:52 AM ET / 10:52 UTC.

Summary
The branch strips native anthropic/ prefixes during Anthropic static model-id normalization and adds regression coverage across catalog normalization, the OpenClaw wrapper, embedded static lookup, and fallback candidate resolution.

PR surface: Source +5, Tests +112. Total +117 across 5 files.

Reproducibility: yes. Source inspection on current main shows native Anthropic-prefixed ids are returned unchanged while the bundled Anthropic static catalog stores unprefixed ids, and the linked issue supplies production fallback logs showing doubled/leaked prefixes.

Review metrics: 1 noteworthy metric.

  • Built-in provider normalization policies: 1 changed, 0 added, 0 removed. The changed Anthropic normalization branch affects configured model and fallback resolution before provider lookup, so maintainers should notice the compatibility surface before merge.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] Provider/model normalization is compatibility-sensitive: existing native Anthropic config entries containing anthropic/<model> will now resolve to the unprefixed Anthropic model id before static catalog and provider lookup.
  • [P1] The supplied after-fix proof exercises the real local runWithModelFallback path with redacted synthetic provider failures, but it is not a live gateway failover against production Anthropic/OpenAI/XAI credentials.

Maintainer options:

  1. Merge With Explicit Compatibility Acceptance (recommended)
    The focused tests, current source path, and local runtime proof support landing once maintainers accept that native Anthropic-prefixed config entries now normalize to unprefixed Anthropic ids.
  2. Request Credentialed Gateway Failover Proof
    If production parity is required before changing provider/model normalization, ask for a maintainer or contributor run with real provider credentials and redacted logs.
  3. Pause For Broader Alias Policy
    If maintainers want this to be part of a cross-provider alias policy instead of a native Anthropic repair, pause this PR and move the policy decision to a separate design thread.

Next step before merge

  • No automated repair is needed; maintainers should make the merge decision for the scoped compatibility and provider-routing risk.

Security
Cleared: No concrete security or supply-chain regression was found; the diff changes provider/model normalization logic and tests only.

Review details

Best possible solution:

Land the shared normalization fix after maintainers accept the scoped Anthropic provider-routing compatibility risk; keep broader provider alias policy outside this PR.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main shows native Anthropic-prefixed ids are returned unchanged while the bundled Anthropic static catalog stores unprefixed ids, and the linked issue supplies production fallback logs showing doubled/leaked prefixes.

Is this the best way to solve the issue?

Yes. The PR fixes the shared normalization layer that feeds catalog, wrapper, embedded lookup, and fallback paths, which is narrower and cleaner than adding fallback-specific string handling.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 1e08af453a06.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes redacted after-fix output from the real local runWithModelFallback path plus focused tests and type checks; live gateway credentials were not tested but contributor-side proof is sufficient for this change.
  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

  • P1: The linked regression breaks fallback/provider routing for real agent workflows on the current stable release family.
  • merge-risk: 🚨 compatibility: The PR changes how existing native Anthropic-prefixed config/model ids are normalized during upgrades and runtime lookup.
  • merge-risk: 🚨 auth-provider: The diff changes provider/model routing for Anthropic fallback candidates before provider runtime and credential-backed lookup.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body includes redacted after-fix output from the real local runWithModelFallback path plus focused tests and type checks; live gateway credentials were not tested but contributor-side proof is sufficient for this change.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes redacted after-fix output from the real local runWithModelFallback path plus focused tests and type checks; live gateway credentials were not tested but contributor-side proof is sufficient for this change.
Evidence reviewed

PR surface:

Source +5, Tests +112. Total +117 across 5 files.

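| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 1 | 6 | 1 | +5 |
| Tests | 4 | 112 | 0 | +112 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 5 | 118 | 1 | +117 |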

What I checked:

Likely related people:

  • steipete: Recent GitHub path history shows steipete authored the model-catalog-core extraction, normalization-core extraction, OpenAI provider identity refactor, and Anthropic model support work around the affected normalization and provider/catalog surfaces. (role: recent model catalog and provider routing contributor; confidence: high; commits: 30e1556cdac9, 00d8d7ead059, d92b3b5cc2bd; files: packages/model-catalog-core/src/provider-model-id-normalization.ts, src/agents/model-selection-shared.ts, extensions/anthropic/openclaw.plugin.json)
  • zhangguiping-xydt: Recent history for src/agents/model-fallback.ts includes a candidate-scoped fallback error fix, which is adjacent to the reported leaked-candidate behavior. (role: fallback behavior contributor; confidence: medium; commits: d6b7fe8615be; files: src/agents/model-fallback.ts)
  • vincentkoc: Recent history for the fallback path includes a fallback provider resolution cache change, making this person relevant for candidate-chain behavior and cache interactions. (role: fallback resolution contributor; confidence: medium; commits: 3c8d101f5a85; files: src/agents/model-fallback.ts)
  • chen-zhang-cs-code: Recent history for src/agents/model-fallback.ts includes cron preflight fallback handling work that touched the same fallback candidate surface. (role: recent fallback workflow contributor; confidence: medium; commits: 7a381b807e23; files: src/agents/model-fallback.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added labels on May 31, 2026:

  • rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • P1: High-priority user-facing bug, regression, or broken workflow.
  • merge-risk: 🚨 compatibility: 🚨 May break existing users, config, migrations, defaults, or upgrade paths.
  • merge-risk: 🚨 auth-provider: 🚨 May break OAuth, tokens, provider routing, model choice, or credentials.
@TurboTheTurtle TurboTheTurtle force-pushed the fix/fallback-modelid-normalization-88560 branch from 509a16f to 0389ece Compare May 31, 2026 10:26

Contributor Author

Updated the PR body with redacted local runtime fallback proof from the real runWithModelFallback path after rebasing to 4b1e5b79435c.

@clawsweeper re-review

clawsweeper Bot commented May 31, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

clawsweeper Bot updated labels on May 31, 2026:

  • added proof: sufficient: ClawSweeper judged the real behavior proof convincing.
  • added rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • added status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
  • removed rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • removed status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
@TurboTheTurtle TurboTheTurtle force-pushed the fix/fallback-modelid-normalization-88560 branch from 0389ece to 5965951 Compare May 31, 2026 10:39
openclaw-barnacle Bot removed the proof: sufficient label (ClawSweeper judged the real behavior proof convincing.) on May 31, 2026
@TurboTheTurtle TurboTheTurtle force-pushed the fix/fallback-modelid-normalization-88560 branch from 5965951 to 4ee13a1 Compare May 31, 2026 10:45

Contributor Author

Rebased again and refreshed the PR body/proof on head 4ee13a12c08d66f556d26237ef07bd9d99ea98df. Raw Pulls API now verifies maintainer_can_modify: true.

@clawsweeper re-review

clawsweeper Bot commented May 31, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

clawsweeper Bot updated labels on May 31, 2026:

  • added proof: sufficient: ClawSweeper judged the real behavior proof convincing.
  • added rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • removed rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
@steipete steipete self-assigned this May 31, 2026
@steipete

Contributor

Maintainer verification before landing #88587.

Behavior addressed: Native Anthropic fallback refs such as anthropic/claude-haiku-4-5 are normalized to the provider-local model id before static catalog and fallback lookup, preventing doubled anthropic/anthropic/... ids and cross-provider fallback attempts carrying the Anthropic-prefixed model id.

Real environment tested: Maintainer source checkout on macOS, current origin/main after fast-forward to 5a0e67791fdc234c95c47e4945ae04b99769fef9; PR head 4ee13a12c08d66f556d26237ef07bd9d99ea98df.

Exact steps or command run after this patch: git status -sb; git pull --ff-only; gh pr view 88587 --repo openclaw/openclaw --json ...; gh pr diff 88587 --repo openclaw/openclaw --patch; git blame -L 105,121 -- packages/model-catalog-core/src/provider-model-id-normalization.ts; git fetch origin pull/88587/head:refs/remotes/origin/pr/88587; git merge-tree --write-tree origin/main origin/pr/88587; git diff --name-only f7a1d3f3f676f95b86468bbe911250b96084ea76..origin/main -- <touched files>; gh api repos/openclaw/openclaw/commits/4ee13a12c08d66f556d26237ef07bd9d99ea98df/check-runs --paginate.

Evidence after fix: PR CI completed successfully for the head SHA; only superseded duplicate auto-response / Real behavior proof runs were cancelled. ClawSweeper marked proof sufficient and ready for maintainer review. Local merge-tree against current origin/main produced a clean tree, and no PR-touched files changed on main since the PR base.

Observed result after fix: Source review confirms the production change is scoped to normalizeBuiltInProviderModelId("anthropic", ...), stripping the native anthropic/ prefix before alias/static lookup. The added tests cover catalog normalization, OpenClaw model-ref wrapper behavior, embedded static lookup arguments, and fallback candidate normalization.

What was not tested: I did not run a live gateway failover with real provider credentials from this machine, and I did not rerun the local Vitest/tsgo commands already supplied by the contributor and CI.
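To illustrate why the observed result stresses stripping the prefix before alias/static lookup, here is a rough sketch of that ordering. Everything in it is assumed for illustration: the `haiku` alias, the catalog entries, and the helper names are made up and are not taken from `normalizeBuiltInProviderModelId` or the bundled Anthropic catalog.

```typescript
// Hypothetical sketch; the alias and catalog entries are invented for illustration.
const ALIASES: Record<string, string> = { haiku: "claude-haiku-4-5" };
const STATIC_CATALOG = new Set<string>(["claude-haiku-4-5", "claude-sonnet-4-6"]);

function stripAnthropicPrefix(modelId: string): string {
  const prefix = "anthropic/";
  return modelId.startsWith(prefix) ? modelId.slice(prefix.length) : modelId;
}

// Strip first, so the alias table and the catalog both see the provider-local id.
function lookupAnthropicModel(modelId: string): string | undefined {
  const local = stripAnthropicPrefix(modelId);
  const resolved = ALIASES[local] ?? local;
  return STATIC_CATALOG.has(resolved) ? resolved : undefined;
}

console.log(lookupAnthropicModel("anthropic/claude-sonnet-4-6")); // claude-sonnet-4-6
console.log(lookupAnthropicModel("anthropic/haiku")); // claude-haiku-4-5
console.log(lookupAnthropicModel("anthropic/not-in-catalog")); // undefined
```

If the strip ran after the alias and catalog lookups instead, a prefixed id would miss both tables in this sketch, since neither stores prefixed keys.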

@steipete steipete merged commit 826b378 into openclaw:main May 31, 2026
164 of 168 checks passed
vincentkoc added a commit that referenced this pull request May 31, 2026
…n-rotation-current

* origin/main:
  docs: strengthen review dependency inspection rules
  refactor: expand acp core package (#88618)
  fix(doctor): diagnose malformed provider catalogs
  fix(agents): normalize prefixed Anthropic model ids (#88587)
  chore: bump OpenClaw version to 2026.5.31
  feat(codex): add portable Codex command pickers (#82224)
  fix(tui): preserve pending local runs during session sync (#87959)
  docs: clarify inline code comments
  fix(auto-reply): warn on substantive private message-tool finals
  fix(tui): use middle truncation for paths and commands in tool display (#88050)
  fix(webchat): suppress stale active session rows (#87962)
  fix(tui): skip history reload when final event has displayable output (#88004)
  test(discord): isolate timer-sensitive request tests
  fix(auth): coerce persisted device auth tokens
  fix(e2e): heartbeat resource-sampled docker lanes

# Conflicts:
#	src/gateway/server-methods/exec-approvals.ts
#	src/gateway/server-methods/nodes-pending.ts
#	src/infra/node-pairing.ts
#	src/tui/tui-command-handlers.test.ts
@TurboTheTurtle TurboTheTurtle deleted the fix/fallback-modelid-normalization-88560 branch May 31, 2026 19:26

cjalden commented May 31, 2026


Thanks for the fast turnaround on this. Wanted to share an empirical data point that suggests the fix scope should be widened before merge.

TL;DR: I deployed a workaround for the issue on our pod by swapping the cost-tier model from anthropic/claude-haiku-4-5 to google/gemini-2.0-flash. Same exact bug fired — just with google/ instead of anthropic/:

Unknown model: anthropic/google/gemini-2.0-flash   (114 errors)
Unknown model: xai/google/gemini-2.0-flash         (76)
Unknown model: google/google/gemini-2.0-flash      (76)

Same shape as the Anthropic case: the prefixed model id leaks across fallback candidates, only the leading provider qualifier swaps per candidate.

Root cause confirmation: looking at the shipped normalizeBuiltInProviderModelId, only nvidia, openrouter, huggingface, and together have explicit self-prefix-strip branches. google goes through normalizeGooglePreviewModelId (which handles preview aliases but not redundant provider prefix), xai only rewrites specific reasoning-variant strings, and anthropic falls through unchanged — same gap your PR is patching.

Suggestion: would it be possible to generalize the strip rather than add an anthropic-specific branch? Something like a final pass that, regardless of provider, strips a leading ${provider}/ if the model id starts with it — covering anthropic + google + xai + any future provider without further per-provider branches. The fallback-iterator candidate-isolation fix (the second half of your PR) is already general; this would make the first half match it.

Happy to test against any candidate branch if you push one — our pod reproduces the leak deterministically on every heartbeat firing (~30/hour on each affected agent).
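The generalization suggested in this comment could look roughly like the sketch below. `stripSelfProviderPrefix` is a hypothetical name, and this is not code from the PR or from `normalizeBuiltInProviderModelId`; it only shows the "strip a leading `${provider}/`" idea in isolation.

```typescript
// Sketch of a provider-agnostic final pass; hypothetical, not the shipped code.
function stripSelfProviderPrefix(provider: string, modelId: string): string {
  const prefix = `${provider}/`;
  // Only a prefix equal to the provider itself is removed.
  return modelId.startsWith(prefix) ? modelId.slice(prefix.length) : modelId;
}

console.log(stripSelfProviderPrefix("google", "google/gemini-2.0-flash")); // gemini-2.0-flash
console.log(stripSelfProviderPrefix("xai", "xai/grok-4")); // grok-4
console.log(stripSelfProviderPrefix("anthropic", "google/gemini-2.0-flash")); // google/gemini-2.0-flash
```

The last call shows a limit of this approach: a cross-provider id such as the `anthropic/google/gemini-2.0-flash` case in the error list is left untouched, so the strip would complement candidate isolation in the fallback chain rather than replace it.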


Labels

  • agents: Agent runtime and tooling
  • merge-risk: 🚨 auth-provider: 🚨 May break OAuth, tokens, provider routing, model choice, or credentials.
  • merge-risk: 🚨 compatibility: 🚨 May break existing users, config, migrations, defaults, or upgrade paths.
  • P1: High-priority user-facing bug, regression, or broken workflow.
  • proof: sufficient: ClawSweeper judged the real behavior proof convincing.
  • proof: supplied: External PR includes structured after-fix real behavior proof.
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • size: S
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.


Development

Successfully merging this pull request may close these issues.

[Bug]: v2026.5.28 — fallback iterator leaks one candidate's modelId into every subsequent provider lookup; produces doubled-prefix errors fleet-wide

3 participants