fix: prevent FailoverError (rate_limit/billing) from being misreported as context overflow by DukeDeSouth · Pull Request #10601 · openclaw/openclaw

DukeDeSouth · 2026-02-06T18:24:13Z

Human View

Summary

Fixes #10368 — when all fallback models fail with a rate-limit or billing error, users see the misleading message "Context overflow: prompt too large for the model" instead of the actual error.

Root Cause

The CONTEXT_OVERFLOW_HINT_RE regex in isLikelyContextOverflowError() includes the pattern (?:prompt|request|input).*(too (?:large|long)|exceed|over|limit|max(?:imum)?) which matches rate-limit messages like "LLM request rejected: You have reached your specified API usage limits" because both "request" and "limit" are present.
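The false positive can be reproduced in isolation. The sketch below copies only the sub-pattern quoted above; the case-insensitive flag and the overflow sample string are assumptions, and the full `CONTEXT_OVERFLOW_HINT_RE` in the codebase may contain more alternatives.

```typescript
// Sub-pattern quoted in the PR description. The "i" flag is an assumption.
const BROAD_OVERFLOW_PATTERN =
  /(?:prompt|request|input).*(too (?:large|long)|exceed|over|limit|max(?:imum)?)/i;

// Rate-limit message quoted in the PR description.
const rateLimitMessage =
  "LLM request rejected: You have reached your specified API usage limits";

// Hypothetical example of a genuine overflow message.
const genuineOverflowMessage = "prompt is too long for this model";

// "request" followed anywhere later by "limit" is enough to match,
// so the rate-limit message is indistinguishable from a real overflow.
const rateLimitMatches = BROAD_OVERFLOW_PATTERN.test(rateLimitMessage);
const overflowMatches = BROAD_OVERFLOW_PATTERN.test(genuineOverflowMessage);

console.log({ rateLimitMatches, overflowMatches });
```

Both strings match, which is why the heuristic alone cannot separate the two error categories.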

Additionally, the catch block in agent-runner-execution.ts checks isLikelyContextOverflowError() before checking whether the error is a FailoverError with a known reason, so the heuristic regex overrides the structured error type.

Fix (two layers)

pi-embedded-helpers/errors.ts: isLikelyContextOverflowError() now early-returns false when the message matches rate_limit, billing, or auth patterns — preventing the broad regex from ever firing on these error categories.
agent-runner-execution.ts: The outer catch block now checks for FailoverError instances (and their .reason field) before falling through to the heuristic isLikelyContextOverflowError check. Each failover reason (rate_limit, billing, auth, timeout) maps to a clear, actionable user-facing message. As a second layer, plain string messages are also checked against isRateLimitErrorMessage / isBillingErrorMessage / isAuthErrorMessage for non-FailoverError exceptions.
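The first layer can be sketched as a guard in front of the broad regex. The function names come from this PR, but their bodies here are hypothetical stand-ins: the real rate-limit, billing, and auth patterns live in the codebase and are not shown in this description.

```typescript
// Hypothetical patterns, for illustration only.
const RATE_LIMIT_RE = /rate[ _-]?limit|usage limits?|too many requests|\b429\b/i;
const BILLING_RE = /billing|insufficient credits?|payment required|\b402\b/i;
const AUTH_RE = /unauthorized|invalid api key|authentication|\b401\b/i;

const isRateLimitErrorMessage = (message: string): boolean => RATE_LIMIT_RE.test(message);
const isBillingErrorMessage = (message: string): boolean => BILLING_RE.test(message);
const isAuthErrorMessage = (message: string): boolean => AUTH_RE.test(message);

// Sub-pattern quoted in the Root Cause section; the "i" flag is an assumption.
const BROAD_OVERFLOW_PATTERN =
  /(?:prompt|request|input).*(too (?:large|long)|exceed|over|limit|max(?:imum)?)/i;

function isLikelyContextOverflowError(message: string | undefined): boolean {
  if (!message) return false;
  // Early return: known error categories never reach the broad regex.
  if (
    isRateLimitErrorMessage(message) ||
    isBillingErrorMessage(message) ||
    isAuthErrorMessage(message)
  ) {
    return false;
  }
  return BROAD_OVERFLOW_PATTERN.test(message);
}

const rateLimitResult = isLikelyContextOverflowError(
  "LLM request rejected: You have reached your specified API usage limits",
);
const overflowResult = isLikelyContextOverflowError("prompt is too long for this model");

console.log({ rateLimitResult, overflowResult });
```

With these stand-in patterns the rate-limit message is rejected by the guard, while the genuine overflow message still falls through to the regex and matches.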

User-facing messages after fix

| FailoverError reason | Message |
| --- | --- |
| `rate_limit` | "API rate limit reached. Please wait a moment and try again, or switch to a different API key/provider." |
| `billing` | Existing `BILLING_ERROR_USER_MESSAGE` |
| `auth` | "Authentication failed. Check your API key or credentials and try again." |
| `timeout` | "LLM request timed out. Please try again." |
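The second layer, structured reason first and heuristic last, might look roughly like the sketch below. The `FailoverError` class shape, the `userMessageFor` helper, and the placeholder value of `BILLING_ERROR_USER_MESSAGE` are assumptions made for illustration; only the three quoted messages are taken from the table above. The plain-string fallback checks for non-FailoverError exceptions are omitted to keep the sketch short.

```typescript
type FailoverReason = "rate_limit" | "billing" | "auth" | "timeout";

// Hypothetical shape; the real FailoverError may carry more fields.
class FailoverError extends Error {
  readonly reason?: FailoverReason;
  constructor(message: string, reason?: FailoverReason) {
    super(message);
    this.name = "FailoverError";
    this.reason = reason;
  }
}

// Placeholder: the real constant already exists in the codebase.
const BILLING_ERROR_USER_MESSAGE = "<existing billing message>";

const FAILOVER_MESSAGES: Record<FailoverReason, string> = {
  rate_limit:
    "API rate limit reached. Please wait a moment and try again, or switch to a different API key/provider.",
  billing: BILLING_ERROR_USER_MESSAGE,
  auth: "Authentication failed. Check your API key or credentials and try again.",
  timeout: "LLM request timed out. Please try again.",
};

// Hypothetical helper: the heuristic is passed in so it is only consulted
// after the structured FailoverError reason has been checked.
function userMessageFor(
  err: unknown,
  isLikelyContextOverflowError: (message: string) => boolean,
): string {
  if (err instanceof FailoverError && err.reason) {
    return FAILOVER_MESSAGES[err.reason];
  }
  const message = err instanceof Error ? err.message : String(err);
  if (isLikelyContextOverflowError(message)) {
    return "Context overflow: prompt too large for the model";
  }
  return `Agent failed before reply: ${message}`;
}

// Even a heuristic that always says "overflow" cannot override the reason.
const alwaysOverflow = (): boolean => true;
const fromFailover = userMessageFor(
  new FailoverError("LLM request rejected: usage limits", "rate_limit"),
  alwaysOverflow,
);
const fromPlainError = userMessageFor(new Error("prompt is too long"), alwaysOverflow);

console.log(fromFailover);
console.log(fromPlainError);
```

Checking the structured reason first means the regex is never evaluated for errors that the failover machinery has already classified.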

Tests

Added 4 new test cases to isLikelyContextOverflowError covering rate-limit, billing, auth, and genuine overflow messages
All 34 related error-handling tests pass
All 20 agent-runner heartbeat tests pass

Test plan

vitest run src/agents/pi-embedded-helpers.islikelycontextoverflowerror.test.ts — 6/6 pass
vitest run src/agents/failover-error.test.ts — 6/6 pass
vitest run src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts — 10/10 pass
All agent-runner heartbeat tests — 20/20 pass
Manual: configure two Anthropic models, hit spend limit, verify user sees billing message instead of "Context overflow"

AI View (DCCE Protocol v1.0)

Metadata

Generator: Claude (Anthropic) via Cursor IDE
Methodology: AI-assisted development with human oversight and review

AI Contribution Summary

Root cause analysis through code tracing
Solution design and implementation
Test development (4 new test cases)

Verification Steps Performed

Reproduced the reported issue
Analyzed source code to identify root cause
Implemented and tested the fix
Ran full test suite (6 tests passing)
Verified lint/formatting compliance

Human Review Guidance

Verify the root cause analysis matches your understanding of the codebase
Core changes are in: agent-runner-execution.ts, pi-embedded-helpers/errors.ts

Greptile Overview

Greptile Summary

Tightens the isLikelyContextOverflowError heuristic to avoid matching rate-limit, billing, and auth error text before running the broader overflow regex.
Updates the agent runner’s outer error handler to prioritize structured FailoverError reasons (rate_limit/billing/auth/timeout) and provide clearer user-facing messages.
Adds targeted vitest cases covering rate-limit/billing/auth messages (should be false) and genuine overflow messages (should be true).

Confidence Score: 4/5

This PR is generally safe to merge with low risk and improves error classification/user messaging.
Changes are localized to error classification and a single catch block, with added tests covering the reported misclassification. Remaining concern is ordering: the heuristic overflow classification is still computed before failover classification in the catch block, which can reintroduce confusion if new branches begin to depend on that flag.
src/auto-reply/reply/agent-runner-execution.ts

…d as context overflow

The broad regex in isLikelyContextOverflowError matched rate-limit messages (e.g. "request … limit") causing users to see "Context overflow" instead of the actual billing/rate-limit error.

Two-layer fix:
1. isLikelyContextOverflowError now early-returns false for messages that match rate_limit, billing, or auth patterns
2. agent-runner-execution catch block checks FailoverError.reason before falling through to heuristic context-overflow detection

Closes openclaw#10368

Co-authored-by: Cursor <cursoragent@cursor.com>

greptile-apps

1 file reviewed, 1 comment

greptile-apps · 2026-02-06T18:25:21Z

src/auto-reply/reply/agent-runner-execution.ts

      defaultRuntime.error(`Embedded agent failed before reply: ${message}`);
      const trimmedMessage = message.replace(/\.\s*$/, "");
-      const fallbackText = isContextOverflow
-        ? "⚠️ Context overflow — prompt too large for this model. Try a shorter message or a larger-context model."
-        : isRoleOrderingError
-          ? "⚠️ Message ordering conflict - please try again. If this persists, use /new to start a fresh session."
-          : `⚠️ Agent failed before reply: ${trimmedMessage}.\nLogs: openclaw logs --follow`;
+
+      // Handle FailoverError (rate_limit, billing, auth, timeout) with specific


Failover check after overflow flag

In this catch block, isContextOverflow is computed from isLikelyContextOverflowError(message) before the FailoverError / rate-limit / billing / auth handling runs. That means if isLikelyContextOverflowError is ever broadened again (or misses a new provider message), you can still misclassify these errors upstream and potentially trigger other logic that relies on isContextOverflow (e.g. future branches added above). Consider moving the failover/rate_limit/billing/auth classification ahead of the isContextOverflow computation so the heuristic is never consulted for those categories.

…er checks

Move the context-overflow heuristic to only fire after FailoverError, rate_limit, billing, and auth checks, so the broad regex is never consulted for already-classified errors.

Addresses Greptile review feedback on openclaw#10601.

Co-authored-by: Cursor <cursoragent@cursor.com>

DukeDeSouth · 2026-02-07T04:10:35Z

Addressing the Greptile review:

Failover check ordering: This is already correct — isFailoverError(err) is the first branch in the if/else chain (line 590), before isLikelyContextOverflowError (line 617). The comment on lines 585-588 explicitly documents this ordering. FailoverError/rate_limit/billing/auth are all classified before the heuristic overflow check runs.

…12889 #12309 #3594 #7483 #10094 #10368 #11317 #11359 #11649 #12022 #12432 #12676 #12711; PRs #7567 #10220 #10601 #10620 #10760 #11680 #11685 #12052 #12226 #12433 #12702 #12720 #12726 #12777)

Takhoffman · 2026-02-10T01:53:06Z

Fixed in #12988.

This will go out in the next OpenClaw release.

If you still see this after updating to the first release that includes #12988, please open a new issue with:

your OpenClaw version
channel (Telegram/Slack/etc)
the exact prompt/response that got rewritten
whether Web UI showed the full text vs the channel being rewritten
relevant logs around send/normalize (if available)

Link back here for context.

DukeDeSouth · 2026-02-21T14:07:33Z

Closing — this is fully covered by #12988 (merged). @Takhoffman's approach of scoping sanitizeUserFacingText rewrites behind errorContext is a better solution than reordering the catch-block checks. It addresses the broader class of false positives across all 13 linked issues, not just the failover misclassification.

Thanks for the fix and for the clear explanation.

openclaw-barnacle bot added the agents (Agent runtime and tooling) label on Feb 6, 2026

greptile-apps bot reviewed Feb 6, 2026

Takhoffman self-assigned this Feb 10, 2026

Takhoffman mentioned this pull request Feb 10, 2026

Agents: scope sanitizeUserFacingText rewrites to errorContext #12988

Merged

thewilloftheshadow force-pushed the main branch from bfc1ccb to f92900f on February 15, 2026 18:46

sebslight mentioned this pull request Feb 20, 2026

fix: stop misclassifying rate-limit errors as context overflow #10003

Closed

4 tasks

Takhoffman removed their assignment Feb 21, 2026

DukeDeSouth closed this Feb 21, 2026

marcospgp mentioned this pull request Feb 24, 2026

[Bug]: OpenRouter 402 billing error misclassified as 'Context overflow', auto-compaction retries drain remaining credits #25371

Open

DukeDeSouth wants to merge 2 commits into openclaw:main from DukeDeSouth:fix/failover-error-misclassification
