fix: include configured fallback chain when running non-primary model by Taskle · Pull Request #25922 · openclaw/openclaw

Taskle · 2026-02-25T00:42:01Z

Summary

When a session runs a non-primary model (e.g. after failover from Claude to Codex), resolveFallbackCandidates() skips the configured fallback chain and adds only the configured primary as a fallback. If that primary's provider is still in cooldown and sits at candidate index >0 (so it is not eligible for probing), every candidate is exhausted, leaving a dead end with no recovery path.

Fixes #25912

Problem

In src/agents/model-fallback.ts, this guard discards the entire fallback chain for non-primary models:

if (!sameModelCandidate(normalizedPrimary, configuredPrimary)) {
  return []; // Override model failed → go straight to configured default
}

This was intended for explicit --model overrides, but it also fires when the session is running a failover model. The result:

1. Claude (primary) hits rate limit → session fails over to Codex
2. Codex encounters an error → resolveFallbackCandidates() returns only Claude as a fallback
3. Claude is at candidate index 1 (not 0), so shouldProbePrimaryDuringCooldown returns false
4. All candidates exhausted → hard failure, no recovery path
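
The sequence above can be reproduced with a small, self-contained model. This is only a sketch under stated assumptions, not the code in src/agents/model-fallback.ts: legacyCandidates and firstUsable are hypothetical names, the provider strings and the third model (gemini) are placeholders, and the probing rule is reduced to "a provider in cooldown is only tried at candidate index 0".

```typescript
// Hypothetical, simplified model of the pre-fix behavior; not the real source.
type ModelRef = { provider: string; model: string };

const sameModel = (a: ModelRef, b: ModelRef): boolean =>
  a.provider === b.provider && a.model === b.model;

// Pre-fix candidate list: the configured chain is dropped for non-primary models.
function legacyCandidates(
  current: ModelRef,
  primary: ModelRef,
  fallbacks: ModelRef[],
): ModelRef[] {
  const chain = sameModel(current, primary) ? fallbacks : [];
  const out: ModelRef[] = [current, ...chain];
  if (!out.some((c) => sameModel(c, primary))) out.push(primary);
  return out;
}

// Assumption: a provider in cooldown may only be probed at candidate index 0.
function firstUsable(
  candidates: ModelRef[],
  coolingDown: Set<string>,
  failing: Set<string>,
): ModelRef | undefined {
  return candidates.find((c, index) => {
    if (failing.has(c.provider)) return false;
    if (coolingDown.has(c.provider)) return index === 0;
    return true;
  });
}

// Placeholder models; only Claude and Codex are named in the PR.
const claude: ModelRef = { provider: "anthropic", model: "claude" };
const codex: ModelRef = { provider: "openai", model: "codex" };
const gemini: ModelRef = { provider: "google", model: "gemini" };

// Session already failed over to Codex; Claude is cooling down; Codex now errors.
const candidates = legacyCandidates(codex, claude, [codex, gemini]);
const usable = firstUsable(candidates, new Set(["anthropic"]), new Set(["openai"]));
console.log(candidates.map((c) => c.model), usable);
```

In this model the candidate list is [codex, claude] and no candidate is usable, even though a third configured fallback exists and is healthy.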

Fix

Remove the early return and always include the configured fallback chain. The createModelCandidateCollector already deduplicates by provider+model, so there's no risk of duplicate candidates. The fallbacksOverride path (for explicit spawn overrides) is preserved and takes priority.
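
A rough illustration of why always adding the chain is safe when the collector deduplicates. This is a sketch, not the real createModelCandidateCollector: createCollector and resolveCandidates are hypothetical names, the models are placeholders, and the "provider/model" key is an assumption based on the description above (any normalization the real collector does is not modeled).

```typescript
type ModelRef = { provider: string; model: string };

// Hypothetical stand-in for the collector: keeps the first occurrence of each
// provider+model pair and silently ignores repeats.
function createCollector() {
  const seen = new Set<string>();
  const candidates: ModelRef[] = [];
  return {
    candidates,
    add(candidate: ModelRef): void {
      const key = `${candidate.provider}/${candidate.model}`;
      if (seen.has(key)) return;
      seen.add(key);
      candidates.push(candidate);
    },
  };
}

// Order proposed in this PR: current model, configured chain, configured primary.
function resolveCandidates(
  current: ModelRef,
  primary: ModelRef,
  fallbacks: ModelRef[],
): ModelRef[] {
  const collector = createCollector();
  collector.add(current);
  for (const fallback of fallbacks) collector.add(fallback);
  collector.add(primary);
  return collector.candidates;
}

// Placeholder models for illustration only.
const claude: ModelRef = { provider: "anthropic", model: "claude" };
const codex: ModelRef = { provider: "openai", model: "codex" };
const gemini: ModelRef = { provider: "google", model: "gemini" };

const afterFailover = resolveCandidates(codex, claude, [codex, gemini]);
const onPrimary = resolveCandidates(claude, claude, [codex, gemini]);
console.log(afterFailover.map((m) => m.model)); // codex, gemini, claude
console.log(onPrimary.map((m) => m.model)); // claude, codex, gemini
```

Because repeats are dropped at insertion time, the current model appearing again inside the chain, or the primary being appended after it was already the current model, does not produce duplicate entries.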

Before

Non-primary model fails → candidates: [currentModel, configuredPrimary]

After

Non-primary model fails → candidates: [currentModel, ...configuredFallbacks, configuredPrimary]

Changes

src/agents/model-fallback.ts: Remove sameModelCandidate guard and configuredPrimary variable (both now unused). Replace with comment explaining the design decision.
src/agents/model-fallback.test.ts: Update 5 tests to reflect new behavior — override models now fall back through the configured chain instead of jumping straight to primary. Remove createOverrideFailureRun helper (no longer needed). All 30 tests pass.

Testing

npx vitest run src/agents/model-fallback.test.ts
# ✓ 30 tests passed

Edge cases considered

Primary running, primary fails: candidates unchanged — [primary, ...fallbacks, primary(deduped)]
Override model running, override fails: improved — now tries [override, ...fallbacks, primary] instead of [override, primary]
Fallback model running (after failover), fallback fails: fixed — [fallback, ...otherFallbacks, primary] instead of [fallback, primary]
fallbacksOverride set (explicit spawn): unchanged — takes priority before this code path
Allowlist enforcement: unchanged — fallback candidates use enforceAllowlist: true, primary/override use false
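
The allowlist edge case can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: collectCandidates and keyOf are hypothetical names, the models are placeholders, and the allowlist is modeled as an explicit set of "provider/model" keys (how the real code treats an empty or missing allowlist is not covered here).

```typescript
type ModelRef = { provider: string; model: string };

const keyOf = (m: ModelRef): string => `${m.provider}/${m.model}`;

// Hypothetical sketch: fallbacks are added with enforceAllowlist = true,
// while the current (primary or override) model and the primary use false.
function collectCandidates(
  current: ModelRef,
  primary: ModelRef,
  fallbacks: ModelRef[],
  allowlist: Set<string>,
): ModelRef[] {
  const seen = new Set<string>();
  const out: ModelRef[] = [];
  const add = (candidate: ModelRef, enforceAllowlist: boolean): void => {
    const key = keyOf(candidate);
    if (seen.has(key)) return; // dedupe by provider+model
    if (enforceAllowlist && !allowlist.has(key)) return; // filter fallbacks only
    seen.add(key);
    out.push(candidate);
  };
  add(current, false);
  for (const fallback of fallbacks) add(fallback, true);
  add(primary, false);
  return out;
}

// Placeholder models for illustration only.
const claude: ModelRef = { provider: "anthropic", model: "claude" };
const codex: ModelRef = { provider: "openai", model: "codex" };
const gemini: ModelRef = { provider: "google", model: "gemini" };
const adHoc: ModelRef = { provider: "local", model: "experimental" };

const allowed = new Set([keyOf(claude), keyOf(codex)]);
const filtered = collectCandidates(adHoc, claude, [codex, gemini], allowed);
console.log(filtered.map(keyOf));
```

In this sketch the ad-hoc override survives even though it is not allowlisted, the non-allowlisted fallback is dropped, and the primary is still appended last.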

Greptile Summary

Removes early return that skipped the configured fallback chain when running non-primary models, fixing a dead-end scenario where sessions could fail to recover after failover. The fix ensures all models in the configured fallback chain remain reachable even when running override or failover models, while deduplication in createModelCandidateCollector prevents duplicate candidates.

Removed unused sameModelCandidate function and configuredPrimary variable (both became unnecessary after removing the guard)
Replaced removed logic with explanatory comment documenting the design decision
Updated 5 tests to reflect new behavior where override models fall back through the configured chain instead of jumping directly to the primary
Removed createOverrideFailureRun test helper (no longer needed with updated test approach)
All 30 tests reported passing in PR description

Confidence Score: 5/5

This PR is safe to merge with minimal risk
The change is well-reasoned and addresses a clear bug in the fallback logic. The removal of the early return is straightforward, and deduplication ensures no duplicate candidates are added. Test updates properly reflect the new behavior, with all tests passing. The change improves robustness by preventing dead-end scenarios during failover recovery.
No files require special attention

Last reviewed commit: 61b9dcc

When a session runs a non-primary model (e.g. after failover), resolveFallbackCandidates() previously returned an empty fallback list. This meant only the configured primary was available as a fallback. If the primary provider was in cooldown and at candidate index >0 (not probe-eligible), all candidates would be exhausted with no recovery path — creating a dead end after failover. Now the full configured fallback chain is included regardless of whether the current model matches the configured primary, giving non-primary sessions the same resilience as primary sessions. Fixes openclaw#25912

steipete · 2026-02-25T01:46:39Z

Reviewed and landed on main as commit bf5a96ad6 with a scoped reimplementation.

What shipped:

src/agents/model-fallback.ts
- Preserves configured fallback-chain traversal when the current run model is itself one of the configured fallbacks (post-failover path).
- Keeps legacy behavior for ad-hoc override models outside the configured chain (still collapses to primary-only fallback).
- Result: avoids dead-end candidate sets during fallback-on-fallback retries while minimizing behavior expansion.
src/agents/model-fallback.test.ts
- Added regression: when current model is a configured fallback and fails, resolver continues through remaining configured fallbacks.
CHANGELOG.md
- Added user-facing fix note under 2026.2.24 (Unreleased).
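
The scoped behavior described for src/agents/model-fallback.ts above might look roughly like the sketch below. It is not the landed commit: scopedCandidates is a hypothetical name, the models are placeholders, and "remaining configured fallbacks" is interpreted here as every other entry in the chain (the comment does not say whether only later entries are meant).

```typescript
type ModelRef = { provider: string; model: string };

const sameModel = (a: ModelRef, b: ModelRef): boolean =>
  a.provider === b.provider && a.model === b.model;

// Hypothetical sketch of the scoped rule:
// - current model is the primary or a configured fallback: traverse the chain
// - current model is an ad-hoc override outside the chain: primary-only fallback
function scopedCandidates(
  current: ModelRef,
  primary: ModelRef,
  fallbacks: ModelRef[],
): ModelRef[] {
  const out: ModelRef[] = [current];
  const push = (candidate: ModelRef): void => {
    if (!out.some((c) => sameModel(c, candidate))) out.push(candidate);
  };
  const inConfiguredChain =
    sameModel(current, primary) || fallbacks.some((f) => sameModel(f, current));
  if (inConfiguredChain) {
    for (const fallback of fallbacks) push(fallback);
  }
  push(primary);
  return out;
}

// Placeholder models for illustration only.
const claude: ModelRef = { provider: "anthropic", model: "claude" };
const codex: ModelRef = { provider: "openai", model: "codex" };
const gemini: ModelRef = { provider: "google", model: "gemini" };
const adHoc: ModelRef = { provider: "local", model: "experimental" };

const chain = [codex, gemini];
const fromFallback = scopedCandidates(codex, claude, chain);
const fromOverride = scopedCandidates(adHoc, claude, chain);
const fromPrimary = scopedCandidates(claude, claude, chain);
console.log(fromFallback.map((m) => m.model)); // codex, gemini, claude
console.log(fromOverride.map((m) => m.model)); // experimental, claude
console.log(fromPrimary.map((m) => m.model)); // claude, codex, gemini
```

Compared with removing the guard outright, this keeps the post-failover path recoverable while leaving ad-hoc overrides on the legacy primary-only fallback.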

Validation:

Full gate passed: pnpm lint && pnpm build && pnpm test.
Focused tests passed: pnpm test src/agents/model-fallback.test.ts src/agents/model-fallback.probe.test.ts.

Thanks for the issue analysis and patch direction, @Taskle.

steipete · 2026-02-25T01:46:43Z

Landed on main in bf5a96a with scoped fallback-chain fix + regression coverage; closing in favor of landed commit.

openclaw-barnacle bot added the agents (Agent runtime and tooling) and size: S labels Feb 25, 2026

gumadeiras self-assigned this Feb 25, 2026

steipete added a commit that referenced this pull request Feb 25, 2026

fix(agents): keep fallback chain reachable on configured fallback models (#25922)

bf5a96a

steipete closed this Feb 25, 2026

steipete mentioned this pull request Feb 25, 2026

Fallback chain empty when session runs non-primary model (dead end after failover) #25912

Closed

github-actions bot mentioned this pull request Feb 25, 2026

📡 Upstream Digest — 2026-02-25 04:12 UTC curtismercier/openclaw-mods#121

Open

joshavant pushed a commit that referenced this pull request Feb 25, 2026

fix(agents): keep fallback chain reachable on configured fallback models (#25922)

bf3c740

margulans pushed a commit to margulans/Neiron-AI-assistant that referenced this pull request Feb 25, 2026

fix(agents): keep fallback chain reachable on configured fallback models (openclaw#25922)

8e76f17

pennyworth100 mentioned this pull request Feb 25, 2026

Agents/Failover: HTTP 529 (Anthropic overloaded) should trigger fallback chain #26578

Closed

Jackson3195 pushed a commit to Jackson3195/openclaw-with-a-personal-touch that referenced this pull request Feb 25, 2026

fix(agents): keep fallback chain reachable on configured fallback models (openclaw#25922)

601d1ef

brianleach pushed a commit to brianleach/openclaw that referenced this pull request Feb 26, 2026

fix(agents): keep fallback chain reachable on configured fallback models (openclaw#25922)

a5dd3f6

arjunaskykok mentioned this pull request Feb 26, 2026

fix/test a2ui bundle preflight #27345

Open

2 tasks

execute008 pushed a commit to execute008/openclaw that referenced this pull request Feb 27, 2026

fix(agents): keep fallback chain reachable on configured fallback models (openclaw#25922)

d8f25ec

r4jiv007 pushed a commit to r4jiv007/openclaw that referenced this pull request Feb 28, 2026

fix(agents): keep fallback chain reachable on configured fallback models (openclaw#25922)

6934d8e

zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026

fix(agents): keep fallback chain reachable on configured fallback models (openclaw#25922)

db13728

thebenjaminlee pushed a commit to escape-velocity-ventures/openclaw that referenced this pull request Mar 7, 2026

fix(agents): keep fallback chain reachable on configured fallback models (openclaw#25922)

065661e

Taskle wants to merge 1 commit into openclaw:main from Taskle:fix/model-fallback-chain
