fix: prevent FailoverError (rate_limit/billing) from being misreported as context overflow #10601
DukeDeSouth wants to merge 2 commits into openclaw:main
…d as context overflow

The broad regex in isLikelyContextOverflowError matched rate-limit messages (e.g. "request … limit"), causing users to see "Context overflow" instead of the actual billing/rate-limit error.

Two-layer fix:
1. isLikelyContextOverflowError now early-returns false for messages that match rate_limit, billing, or auth patterns
2. agent-runner-execution catch block checks FailoverError.reason before falling through to heuristic context-overflow detection

Closes openclaw#10368

Co-authored-by: Cursor <cursoragent@cursor.com>
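The first layer of this commit can be illustrated with a minimal sketch. The overflow pattern is the one quoted in the PR description; the `i` flag, the three guard regexes, and both function names are assumptions made for illustration and are not the project's actual matchers.

```typescript
// Pattern quoted in the PR description; the case-insensitive flag is an assumption.
const CONTEXT_OVERFLOW_HINT_RE =
  /(?:prompt|request|input).*(too (?:large|long)|exceed|over|limit|max(?:imum)?)/i;

// Hypothetical stand-ins for the project's rate-limit, billing, and auth matchers.
const RATE_LIMIT_RE = /rate.?limit|usage limits?|too many requests/i;
const BILLING_RE = /billing|insufficient credits?|payment required/i;
const AUTH_RE = /unauthorized|invalid api key|authentication failed/i;

// Behaviour before the fix: the broad regex alone decides.
function heuristicOnly(message: string): boolean {
  return CONTEXT_OVERFLOW_HINT_RE.test(message);
}

// Behaviour after the fix (sketch): already-classified categories return false
// before the broad regex is consulted.
function isLikelyContextOverflowSketch(message: string): boolean {
  if (!message) return false;
  if (
    RATE_LIMIT_RE.test(message) ||
    BILLING_RE.test(message) ||
    AUTH_RE.test(message)
  ) {
    return false;
  }
  return CONTEXT_OVERFLOW_HINT_RE.test(message);
}

const rateLimitMessage =
  "LLM request rejected: You have reached your specified API usage limits";

console.log(heuristicOnly(rateLimitMessage)); // matches "request" followed by "limit"
console.log(isLikelyContextOverflowSketch(rateLimitMessage)); // guard fires first
```

The guard deliberately runs before the heuristic, so a message that looks like both a rate-limit error and an overflow error is never reported as an overflow.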
defaultRuntime.error(`Embedded agent failed before reply: ${message}`);
const trimmedMessage = message.replace(/\.\s*$/, "");
const fallbackText = isContextOverflow
  ? "⚠️ Context overflow — prompt too large for this model. Try a shorter message or a larger-context model."
  : isRoleOrderingError
    ? "⚠️ Message ordering conflict - please try again. If this persists, use /new to start a fresh session."
    : `⚠️ Agent failed before reply: ${trimmedMessage}.\nLogs: openclaw logs --follow`;

// Handle FailoverError (rate_limit, billing, auth, timeout) with specific
Failover check after overflow flag
In this catch block, isContextOverflow is computed from isLikelyContextOverflowError(message) before the FailoverError / rate-limit / billing / auth handling runs. That means if isLikelyContextOverflowError is ever broadened again (or misses a new provider message), you can still misclassify these errors upstream and potentially trigger other logic that relies on isContextOverflow (e.g. future branches added above). Consider moving the failover/rate_limit/billing/auth classification ahead of the isContextOverflow computation so the heuristic is never consulted for those categories.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 583:586

…er checks

Move the context-overflow heuristic to only fire after FailoverError, rate_limit, billing, and auth checks, so the broad regex is never consulted for already-classified errors.

Addresses Greptile review feedback on openclaw#10601.

Co-authored-by: Cursor <cursoragent@cursor.com>
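The ordering this commit describes, with the structured reason first, message-based categories second, and the broad heuristic last, might look roughly like the sketch below. The `FailoverError` class shown here is a hypothetical stand-in with only a `reason` field, and `classifyError`, `ErrorKind`, and the guard regexes are illustrative names that do not come from the repository.

```typescript
type FailoverReason = "rate_limit" | "billing" | "auth" | "timeout";
type ErrorKind = FailoverReason | "context_overflow" | "unknown";

// Hypothetical stand-in; the project's real FailoverError may carry more fields.
class FailoverError extends Error {
  readonly reason: FailoverReason;
  constructor(message: string, reason: FailoverReason) {
    super(message);
    this.name = "FailoverError";
    this.reason = reason;
    // Keeps instanceof reliable when compiling to older targets.
    Object.setPrototypeOf(this, FailoverError.prototype);
  }
}

// Illustrative patterns only; the overflow pattern is the one quoted in the PR description.
const RATE_LIMIT_RE = /rate.?limit|usage limits?|too many requests/i;
const BILLING_RE = /billing|insufficient credits?/i;
const AUTH_RE = /unauthorized|invalid api key/i;
const CONTEXT_OVERFLOW_HINT_RE =
  /(?:prompt|request|input).*(too (?:large|long)|exceed|over|limit|max(?:imum)?)/i;

function classifyError(err: unknown): ErrorKind {
  // 1. A structured reason always wins over text matching.
  if (err instanceof FailoverError) return err.reason;

  const message = err instanceof Error ? err.message : String(err);

  // 2. Plain-string errors are checked against the known categories next.
  if (RATE_LIMIT_RE.test(message)) return "rate_limit";
  if (BILLING_RE.test(message)) return "billing";
  if (AUTH_RE.test(message)) return "auth";

  // 3. Only unclassified errors reach the broad overflow heuristic.
  if (CONTEXT_OVERFLOW_HINT_RE.test(message)) return "context_overflow";
  return "unknown";
}

console.log(classifyError(new FailoverError("request exceeded limit", "timeout")));
console.log(classifyError(new Error("prompt too large for this model")));
```

With this shape, an overflow flag is never computed for an error that already has a category, which is the concern raised in the review comment.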
Addressing the Greptile review: Failover check ordering: This is already correct …
Fixed in #12988. This will go out in the next OpenClaw release. If you still see this after updating to the first release that includes #12988, please open a new issue with:
Link back here for context.
bfc1ccb to f92900f
Closing: this is fully covered by #12988 (merged). @Takhoffman's approach of scoping … Thanks for the fix and for the clear explanation.
Human View
Summary
Fixes #10368 — when all fallback models fail with a rate-limit or billing error, users see the misleading message "Context overflow: prompt too large for the model" instead of the actual error.
Root Cause
The CONTEXT_OVERFLOW_HINT_RE regex in isLikelyContextOverflowError() includes the pattern (?:prompt|request|input).*(too (?:large|long)|exceed|over|limit|max(?:imum)?), which matches rate-limit messages like "LLM request rejected: You have reached your specified API usage limits" because both "request" and "limit" are present.

Additionally, the catch block in agent-runner-execution.ts checks isLikelyContextOverflowError() before checking whether the error is a FailoverError with a known reason, so the heuristic regex overrides the structured error type.
Fix (two layers)
1. pi-embedded-helpers/errors.ts: isLikelyContextOverflowError() now early-returns false when the message matches rate_limit, billing, or auth patterns, preventing the broad regex from ever firing on these error categories.
2. agent-runner-execution.ts: The outer catch block now checks for FailoverError instances (and their .reason field) before falling through to the heuristic isLikelyContextOverflowError check. Each failover reason (rate_limit, billing, auth, timeout) maps to a clear, actionable user-facing message. As a second layer, plain string messages are also checked against isRateLimitErrorMessage / isBillingErrorMessage / isAuthErrorMessage for non-FailoverError exceptions.
User-facing messages after fix
rate_limit
billing: BILLING_ERROR_USER_MESSAGE
auth
timeout
Tests
isLikelyContextOverflowError covering rate-limit, billing, auth, and genuine overflow messages
Test plan
vitest run src/agents/pi-embedded-helpers.islikelycontextoverflowerror.test.ts (6/6 pass)
vitest run src/agents/failover-error.test.ts (6/6 pass)
vitest run src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts (10/10 pass)
AI View (DCCE Protocol v1.0)
Metadata
AI Contribution Summary
Verification Steps Performed
Human Review Guidance
agent-runner-execution.ts, pi-embedded-helpers/errors.ts
Made with M7 Cursor
Greptile Overview
Greptile Summary
isLikelyContextOverflowError heuristic to avoid matching rate-limit, billing, and auth error text before running the broader overflow regex.
FailoverError reasons (rate_limit/billing/auth/timeout) and provide clearer user-facing messages.
Confidence Score: 4/5