Bug
When a provider returns HTTP 200 but with finish_reason: error in the response body, the model-fallback system marks the attempt as candidate_succeeded and stops the fallback chain. The error is only detected later at agent_end, when it's too late to try the next candidate.
Expected behavior
finish_reason: error should be treated as a failed attempt, triggering the next fallback candidate (same as timeout or HTTP error).
Reproduction
From production logs (2026-04-02 07:30 UTC, runId 1eeb0fb6-6d00-49d4-9fde-563bbb97bcd1):
1. Attempt 1: anthropic/claude-opus-4-6 → timeout (408) → candidate_failed ✅ correct
2. Attempt 2: voidai/claude-opus-4-6 → HTTP 200, finish_reason: error → candidate_succeeded ❌ incorrect
3. Fallback chain stops (2/5 attempts used)
4. embedded_run_agent_end fires with isError: true, error: Provider finish_reason: error
5. User sees generic error message instead of a response
Log evidence
// Step 2: fallback considers it success
{"event":"model_fallback_decision", "decision":"candidate_succeeded", "candidateProvider":"voidai", "attempt":2, "total":5}
// Step 4: but agent_end knows it failed
{"event":"embedded_run_agent_end", "isError":true, "error":"Provider finish_reason: error", "provider":"voidai"}
Suggested fix
In the fallback decision logic, check finish_reason from the provider response before recording the decision. If finish_reason === "error", treat the attempt as candidate_failed and continue to the next candidate.
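A minimal sketch of what that check could look like. The type and function names here (ProviderResult, classifyAttempt, firstSuccessfulCandidate) are hypothetical and only illustrate the idea; they are not taken from the OpenClaw codebase, and the real decision logic likely carries more state.

```typescript
// Hypothetical shapes for illustration only; not OpenClaw's actual types.
type FallbackDecision = "candidate_succeeded" | "candidate_failed";

interface ProviderResult {
  httpStatus: number;    // e.g. 200 or 408
  finishReason?: string; // e.g. "stop" or "error", read from the response body
}

// Classify a single attempt. An HTTP 2xx status alone is not enough:
// the body must also not report finish_reason "error".
function classifyAttempt(result: ProviderResult): FallbackDecision {
  const httpOk = result.httpStatus >= 200 && result.httpStatus < 300;
  if (!httpOk) {
    return "candidate_failed";
  }
  if (result.finishReason === "error") {
    return "candidate_failed";
  }
  return "candidate_succeeded";
}

// Walk the attempts in order and return the index of the first one that
// counts as a success, or -1 if every candidate failed.
function firstSuccessfulCandidate(results: ProviderResult[]): number {
  return results.findIndex(
    (r) => classifyAttempt(r) === "candidate_succeeded",
  );
}
```

With this check, the run from the logs above would classify attempt 2 (HTTP 200, finish_reason: error) as candidate_failed, so the chain would move on to the third configured candidate instead of stopping.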
Environment
- OpenClaw v2026.4.1
- Fallback chain: anthropic → voidai → (3 more candidates configured but never reached)