Skip to content

fix(fallback): treat finish_reason:error as candidate_failed (#59524)#59680

Closed
ayeshakhalid192007-dev wants to merge 1 commit intoopenclaw:mainfrom
ayeshakhalid192007-dev:fix/fallback-finish-reason-error-59524
Closed

fix(fallback): treat finish_reason:error as candidate_failed (#59524)#59680
ayeshakhalid192007-dev wants to merge 1 commit intoopenclaw:mainfrom
ayeshakhalid192007-dev:fix/fallback-finish-reason-error-59524

Conversation

@ayeshakhalid192007-dev
Copy link
Copy Markdown
Contributor

Summary

When a provider returns HTTP 200 with finish_reason: error, the fallback loop
incorrectly logs candidate_succeeded and halts the chain. The remaining
candidates are never tried, and the user sees an error instead of a response.

Root Cause

runWithModelFallback detects failures exclusively via thrown exceptions.
runEmbeddedPiAgent resolves normally on finish_reason: error (HTTP 200 with
an error body), so the fallback loop sees { ok: true } and stops. The
result.meta.stopReason === "error" signal is only visible after the chain
has already terminated.

Trace from the reported production log (runId 1eeb0fb6):

  • Attempt 1: anthropic/claude-opus-4-6 → timeout (408) → candidate_failed
  • Attempt 2: voidai/claude-opus-4-6 → HTTP 200, finish_reason: errorcandidate_succeeded
  • Attempts 3–5: never reached

Changes

  • src/agents/agent-command.ts — check result.meta.stopReason === "error" after runAgentAttempt resolves; throw FailoverError(reason: "overloaded") to re-enter the fallback loop
  • src/auto-reply/reply/agent-runner-execution.ts — same check in the embedded Pi run path
  • src/cron/isolated-agent/run.ts — same check in the cron isolated-agent run path
  • src/agents/model-fallback.test.ts — regression test confirming the chain continues past a FailoverError thrown for finish_reason: error

Why This Is Safe

  • Only triggers on stopReason === "error", which is an unambiguous provider-side failure signal
  • Uses reason: "overloaded" — retry-eligible, no auth/billing cooldown side-effects
  • Mirrors the existing pattern for LiveSessionModelSwitchError (wrapped as FailoverError(reason: "overloaded") in model-fallback.ts)
  • No changes to runEmbeddedPiAgent or model-fallback.ts internals

Testing

Regression test in model-fallback.test.ts:

  • Mocks first candidate throwing FailoverError("Provider finish_reason: error", { reason: "overloaded" })
  • Asserts second candidate is tried and succeeds
  • Asserts attempts[0] records the correct provider, model, reason, and error message

Fixes #59524


AI-assisted PR — changes reviewed and understood. Lightly tested via unit test.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Apr 2, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

This PR fixes a fallback-chain regression where a provider returning HTTP 200 with finish_reason: error was incorrectly treated as success, causing the fallback loop to halt and return an error to the user. The fix is applied consistently across three execution paths (agent-command.ts, agent-runner-execution.ts, cron/isolated-agent/run.ts) by checking result.meta.stopReason === "error" and throwing a FailoverError to re-enter the fallback loop. A regression test for runWithModelFallback validates the end-to-end behavior.

Confidence Score: 5/5

  • Safe to merge — the fix is minimal, well-targeted, and consistent across all three execution paths.
  • EmbeddedPiRunResult.meta is a required (non-optional) field per the type definition in pi-embedded-runner/types.ts, so the direct result.meta.stopReason access is safe. stopReason is string | undefined, and comparing it to "error" correctly evaluates to false when it is absent. The surrounding optional-chaining usage (result.meta?.systemPromptReport) is purely defensive against stale callers, not evidence of a nullable type. The fallback loop already handles FailoverError correctly (it is caught in runFallbackCandidate, recorded, and the chain continues), so the new throw integrates cleanly with no changes to model-fallback.ts internals. No P0 or P1 findings.
  • No files require special attention.

Reviews (1): Last reviewed commit: "fix(fallback): treat finish_reason:error..." | Re-trigger Greptile

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 31b36d3015

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +596 to +598
if (result.meta.stopReason === "error") {
throw new FailoverError("Provider finish_reason: error", {
reason: "overloaded",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve provider error context when converting stopReason errors

This branch always throws FailoverError("Provider finish_reason: error", { reason: "overloaded" }), which discards the original assistant/provider error details from runEmbeddedPiAgent. In single-candidate runs (or when all candidates fail), downstream handlers only see this synthetic overload error, so billing/rate-limit/role-ordering classification and user-facing recovery messaging regress to a generic failure instead of the real cause. Wrap using the underlying error text/reason (when present) rather than hardcoding overloaded.

Useful? React with 👍 / 👎.

@ayeshakhalid192007-dev ayeshakhalid192007-dev force-pushed the fix/fallback-finish-reason-error-59524 branch from 31b36d3 to 57250d6 Compare April 4, 2026 01:14
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 57250d6124

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/agent-command.ts Outdated
Comment on lines +814 to +816
throw new FailoverError("Provider finish_reason: error", {
reason: "overloaded",
provider: providerOverride,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Classify finish_reason errors before forcing overloaded retries

This branch converts every result.meta.stopReason === "error" into FailoverError(... reason: "overloaded"), which misclassifies non-overload failures as transient retryable overloads. For example, openai-transport-stream.ts maps finish_reason: "content_filter" to stopReason: "error"; with this change, that policy block now gets retried on fallback candidates instead of being surfaced, and downstream logic records the wrong failure reason. Please derive the failover reason from the provider error payload/message (or only map known transient reasons) before triggering fallback.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 329a48c2ba

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/cron/isolated-agent/run.ts Outdated
Comment on lines +594 to +597
if (result.meta.stopReason === "error") {
throw new FailoverError("Provider finish_reason: error", {
reason: "overloaded",
provider: providerOverride,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Derive failover reason from stopReason error payload

This branch converts every result.meta.stopReason === "error" into FailoverError(..., { reason: "overloaded" }), but stopReason: "error" is also used for non-transient failures (for example content_filter in src/agents/openai-transport-stream.ts:1304-1306). In cron isolated runs this misclassifies policy/validation errors as transient overloads, which can trigger inappropriate fallback retries and surfaces the wrong failure reason when candidates are exhausted. Please preserve the provider error context (or only map known transient cases) before forcing overloaded.

Useful? React with 👍 / 👎.

@ayeshakhalid192007-dev ayeshakhalid192007-dev force-pushed the fix/fallback-finish-reason-error-59524 branch from 329a48c to fd5f637 Compare April 4, 2026 01:36
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fd5f637997

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/agent-command.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: baf3848057

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +824 to +830
if (result.meta.stopReason === "error") {
const errorText =
result.meta.error?.message ??
result.payloads?.find((p) => p.isError)?.text ??
"";
const derivedReason = classifyFailoverReason(errorText) ?? "overloaded";
throw new FailoverError(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Skip fallback retry after streaming reply content

This unconditional stopReason === "error" conversion runs after the attempt may already have emitted user-visible output through earlier callbacks in this same run path (onPartialReply/onToolResult). When a provider streams partial text and then terminates with finish_reason:error (for example, a late content filter), this throw triggers runWithModelFallback to retry another candidate, so the user can receive mixed/duplicated replies from multiple attempts and repeated side effects. The retry should be gated on whether the failed attempt already produced outbound content/tool sends.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 43b602924c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

result.meta.error?.message ??
result.payloads?.find((p) => p.isError)?.text ??
"";
const derivedReason = classifyFailoverReason(errorText) ?? "overloaded";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Treat unknown finish_reason errors as non-transient failures

This fallback reason default (classifyFailoverReason(errorText) ?? "overloaded") misclassifies unrecognized stopReason === "error" cases as transient overloads. For example, OpenAI finish_reason: "content_filter" is mapped to "Provider finish_reason: content_filter" in src/agents/openai-transport-stream.ts, which does not match current classifiers, so this path retries other candidates and reports an overload instead of surfacing a moderation/policy failure. That changes both retry behavior and user-facing error semantics for non-transient provider stops.

Useful? React with 👍 / 👎.

@ayeshakhalid192007-dev ayeshakhalid192007-dev force-pushed the fix/fallback-finish-reason-error-59524 branch from 43b6029 to e936d60 Compare April 4, 2026 14:47
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e936d60c0f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +176 to +182
if (result.meta.stopReason === "error") {
const errorText =
result.meta.error?.message ??
result.payloads?.find((p) => p.isError)?.text ??
"";
const derivedReason = classifyFailoverReason(errorText) ?? "overloaded";
throw new FailoverError(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Prevent retry after side-effecting stopReason:error attempt

In the cron isolated-agent fallback path, this unconditional stopReason === "error" conversion throws before the caller can evaluate runResult side-effect flags. runEmbeddedPiAgent can complete tool actions (for example message sends or cron mutations) and still return with an error stop reason from the final assistant step; throwing here makes runWithModelFallback run another candidate after those effects already happened, so scheduled runs can emit duplicate outbound messages or duplicate cron changes. Please gate this retry on whether the failed attempt already produced side effects (for example didSendViaMessagingTool / cron-add metadata) before re-entering fallback.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a038fcc037

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/agent-command.ts Outdated
Comment on lines +814 to +820
if (result.meta.stopReason === "error") {
const errorText =
result.meta.error?.message ??
result.payloads?.find((p) => p.isError)?.text ??
"";
const derivedReason = classifyFailoverReason(errorText) ?? "overloaded";
throw new FailoverError(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Gate finish_reason retries after side-effecting runs

This unconditional stopReason === "error" conversion retries fallback even when the just-finished attempt may already have executed irreversible actions. runAgentAttempt can return results that include side-effect markers (for example didSendViaMessagingTool / successfulCronAdds from embedded runs), and throwing here re-enters runWithModelFallback, which can send duplicate outbound messages or repeat cron mutations on the next candidate. The retry should be skipped (or handled differently) once the failed attempt has already produced side effects.

Useful? React with 👍 / 👎.

@ayeshakhalid192007-dev ayeshakhalid192007-dev force-pushed the fix/fallback-finish-reason-error-59524 branch from a038fcc to d84b2e5 Compare April 4, 2026 19:06
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fe86c6eca7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

result.meta.error?.message ??
result.payloads?.find((p) => p.isError)?.text ??
"";
const derivedReason = classifyFailoverReason(errorText) ?? "overloaded";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Pass provider into failover reason classifier

classifyFailoverReason(errorText) is invoked without the candidate provider, which disables provider-specific classifiers (for example isOpenRouterKeyLimitExceededError in pi-embedded-helpers/errors.ts only matches when provider contains openrouter). In this path, those errors fall through to null and then get forced to "overloaded", so billing/auth-like failures are treated as transient overload retries and logged with the wrong reason. This regresses fallback behavior and cooldown handling for provider-specific error formats.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codex review!!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fe86c6eca7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +819 to +820
throw new FailoverError(
errorText || "Provider finish_reason: error",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid turning terminal finish_reason errors into hard failures

Throwing here converts every stopReason === "error" result into an exception, but agentCommandInternal does not translate fallback exhaustion back into a reply payload. In single-candidate configs (or when all candidates fail), runWithModelFallback rethrows this FailoverError, and ingress paths that call agentCommandFromIngress (for example src/gateway/openai-http.ts and src/gateway/openresponses-http.ts) treat it as an internal failure and return generic 500s. Before this change, the same provider-side error surfaced as a structured error payload from the run result.

Useful? React with 👍 / 👎.

result.meta.error?.message ??
result.payloads?.find((p) => p.isError)?.text ??
"";
const derivedReason = classifyFailoverReason(errorText) ?? "overloaded";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Pass provider context when classifying failover reason

classifyFailoverReason is called without the provider even though providerOverride is available, so provider-specific generic errors are misclassified (or fall back to "overloaded"). That changes recorded attempt reasons and downstream fallback policy/message behavior (cooldown probing and summary text), especially for provider patterns that only resolve with provider context.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e5b4fe25f6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/cron/isolated-agent/run-executor.ts
@ayeshakhalid192007-dev ayeshakhalid192007-dev force-pushed the fix/fallback-finish-reason-error-59524 branch from e5b4fe2 to 866dd09 Compare April 16, 2026 19:57
@ayeshakhalid192007-dev
Copy link
Copy Markdown
Contributor Author

Rebased on current main (8c11210) to resolve CI failures.

Changes after rebase:

  • src/agents/agent-command.ts — adapted fix to use upstream's refactored attemptExecutionRuntime.runAgentAttempt; same stopReason === "error" check preserved
  • src/cron/isolated-agent/run-executor.ts — upstream extracted fallback logic into this new file; applied identical fix here (was missing in original PR since file didn't exist at branch point)
  • src/auto-reply/reply/agent-runner-execution.ts — no conflict; fix carried forward cleanly
  • src/cron/isolated-agent/run.ts — fallback logic was moved upstream; dropped our now-redundant block

Fixes #59524finish_reason:error from provider now throws FailoverError in all three fallback paths so the chain continues to the next candidate.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 866dd09d7d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

result.meta.error?.message ??
result.payloads?.find((p) => p.isError)?.text ??
"";
const derivedReason = classifyFailoverReason(errorText) ?? "overloaded";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Pass provider into finish_reason failover classification

classifyFailoverReason is called with only errorText, so provider-specific classifiers are skipped in the auto-reply fallback path. For errors whose detection depends on provider context (for example OpenRouter-specific key/credit patterns), this falls through to null and then gets coerced to "overloaded", which records the wrong failure reason and can drive incorrect retry/cooldown behavior when a run ends with stopReason === "error".

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@rajamukerji rajamukerji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good — bug-fix evidence bar met:

  • Symptom evidence: production logs in #59524
  • Root cause: runWithModelFallback detects failures only via thrown exceptions; runEmbeddedPiAgent resolves normally on finish_reason: error, so the chain halts despite the failure signal in result.meta.stopReason
  • Fix touches implicated path: three call sites (src/agents/agent-command.ts, src/auto-reply/reply/agent-runner-execution.ts, src/cron/isolated-agent/run-executor.ts)
  • Regression tests in src/agents/model-fallback.test.ts cover both the default overloaded path and a rate_limit-classified case

CI all green, rebased onto current main, mirrors the existing LiveSessionModelSwitchError → FailoverError(reason: "overloaded") pattern.

Optional follow-ups (not blocking this PR):

  1. The if (result.meta.stopReason === "error") { ... throw new FailoverError(...) } block is duplicated verbatim across the three call sites. A small helper (for example throwIfProviderStopReasonError(result, { provider, model }) exported from failover-error.ts) would keep the three paths in lockstep.
  2. classifyFailoverReason(errorText) is called without a provider argument. Per src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts, provider context matters for some classifications (for example Anthropic-specific billing error text). Falling back to overloaded is still safe, but passing { provider } would tighten the classification for provider-specific error surfaces.
  3. src/hooks/llm-slug-generator.ts also calls runEmbeddedPiAgent directly and does not get the same treatment. Slug generation is non-critical, so leaving it is defensible — but worth a follow-up note if you want the pattern applied uniformly.

…w#59524)

When a provider returns HTTP 200 with finish_reason:error, runWithModelFallback
incorrectly logged candidate_succeeded and halted the fallback chain. The runner
resolves normally in this case, so the fallback loop never saw a thrown error.

Fix: in all three runWithModelFallback call sites that use runEmbeddedPiAgent,
check result.meta.stopReason === "error" after the run resolves and throw a
FailoverError(reason: "overloaded") so the fallback loop continues to the next
candidate as expected.

Affected call sites:
- src/agents/agent-command.ts
- src/auto-reply/reply/agent-runner-execution.ts
- src/cron/isolated-agent/run.ts

Regression test added in src/agents/model-fallback.test.ts.
@steipete
Copy link
Copy Markdown
Contributor

Closing this as implemented after Codex review.

Current main has superseding provider-aware failover classification for embedded/runtime assistant errors, and the related hardening shipped in v2026.4.24 via #70743. This PR's direct call-site conversion is no longer the best patch to land.

What I checked:

  • Current main classifies provider/runtime failover signals centrally: classifyFailoverSignal derives failover classification from status, code, message, and provider context instead of blindly treating every stopReason: "error" as overloaded. (src/agents/pi-embedded-helpers/errors.ts:3756, 7bbd47349ec3)
  • Provider-specific error context is preserved: Current main includes provider-aware cases such as OpenRouter Provider returned error mapping to a timeout failover reason, addressing the review concern that the PR's earlier implementation could lose provider context or force the wrong reason. (src/agents/pi-embedded-helpers/errors.ts:3567, 7bbd47349ec3)
  • Fallback hardening shipped after this PR: The v2026.4.24 release notes list Codex/GPT-5.4: harden fallback, auth-profile, tool-schema, and replay edge cases across native and embedded runtime paths. (#70743), and [codex] Harden GPT-5.4 runtime paths #70743 cross-referenced this PR in the timeline. (cbcfdf62c729)
  • PR discussion identified risks that main avoids: Review comments on fix(fallback): treat finish_reason:error as candidate_failed (#59524) #59680 repeatedly flagged hardcoding or defaulting finish_reason:error to overloaded, missing provider-specific classification, and retrying after side effects. Current main's centralized classifier is the safer direction than landing the PR's call-site patch.

So I’m closing this as already implemented rather than keeping a duplicate issue open.

Codex Review notes: reviewed against 7bbd47349ec3; fix evidence: release v2026.4.24.

@steipete steipete closed this Apr 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Model fallback treats finish_reason:error as success, stops fallback chain

3 participants