500/503 errors misclassified as rate_limit, triggering unnecessary cooldowns #22294

@DanRWilloughby

Bug Description

OpenClaw gateway classifies Gemini 500 (InternalServerError) and 503 (ServiceUnavailable) responses as rate_limit errors, which triggers the exponential cooldown mechanism (1min → 5min → 25min → 60min cap). This effectively takes the agent offline even when API usage is well below rate limits.
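The cooldown progression quoted above (1min → 5min → 25min → 60min cap) is consistent with a capped exponential schedule that multiplies by 5 on each consecutive failure. The following is only a sketch of that arithmetic: the multiplier, the constants, and the function name `cooldown_minutes` are assumptions inferred from the sequence, not taken from the OpenClaw source.

```python
# Hypothetical reconstruction of the cooldown schedule described in this issue.
# BASE_MINUTES, MULTIPLIER and CAP_MINUTES are inferred from the reported
# 1 -> 5 -> 25 -> 60 sequence and may not match the gateway's real constants.
BASE_MINUTES = 1
MULTIPLIER = 5
CAP_MINUTES = 60


def cooldown_minutes(consecutive_failures: int) -> int:
    """Return the cooldown length in minutes after N consecutive failures."""
    if consecutive_failures < 1:
        return 0  # no failures recorded, no cooldown
    uncapped = BASE_MINUTES * MULTIPLIER ** (consecutive_failures - 1)
    return min(CAP_MINUTES, uncapped)


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"failure #{n}: cooldown {cooldown_minutes(n)} min")
```

Under these assumptions, the fourth consecutive failure already reaches the 60-minute cap, which is why a short burst of misclassified 500s is enough to take a profile offline for an hour.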

Evidence

  • Gemini API dashboard shows usage at 5/25 RPM and 20/250 RPD (Paid Tier 1) — nowhere near limits
  • The actual errors in Google's dashboard are 500 InternalServerError, NOT 429 TooManyRequests
  • Both auth profiles (google-gemini-cli and anthropic fallback) entered cooldown simultaneously, leaving the agent with no working model
  • auth-profiles.json showed cooldownUntil set with failureCounts incrementing under the rate_limit category

Expected Behavior

  • 429 errors → trigger rate limit cooldown (correct)
  • 500/503 errors → retry with backoff but do NOT enter rate_limit cooldown state
  • Transient server errors should not disable the agent for extended periods
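The expected behavior above can be sketched as a small status-code classifier. This is an illustration of the requested split, not the gateway's actual code: `FailureKind`, `classify_status`, and `should_enter_cooldown` are hypothetical names, and only the three status codes named in this issue are mapped.

```python
from enum import Enum


class FailureKind(Enum):
    """Hypothetical failure categories; only 'rate_limit' appears in the issue."""
    RATE_LIMIT = "rate_limit"  # should trigger the cooldown mechanism
    TRANSIENT = "transient"    # should be retried with backoff, no cooldown
    OTHER = "other"            # anything this issue does not cover


def classify_status(status: int) -> FailureKind:
    """Map an HTTP status code to a failure category."""
    if status == 429:
        return FailureKind.RATE_LIMIT
    if status in (500, 503):
        return FailureKind.TRANSIENT
    return FailureKind.OTHER


def should_enter_cooldown(status: int) -> bool:
    """Only genuine rate limiting should put an auth profile into cooldown."""
    return classify_status(status) is FailureKind.RATE_LIMIT
```

With this split, a 500 or 503 would still be retried, but it would not increment the rate_limit failure count or set cooldownUntil.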

Actual Behavior

  • 500/503 errors → classified as rate_limit → exponential cooldown activated
  • Agent goes offline for up to 60 minutes due to server-side errors it has no control over
  • Both primary and fallback models can be simultaneously disabled

Impact

  • Agent becomes completely unresponsive during Gemini outages
  • Even brief Gemini instability (a few 500s) can trigger multi-minute cooldowns
  • Fallback model (Claude Sonnet) may also be in cooldown, leaving zero working models

Reproduction

  1. Configure gateway with google-gemini-cli as primary model
  2. Wait for Gemini to return a 500 or 503 error (happens periodically)
  3. Observe auth-profiles.json: failureCounts.rate_limit increments and cooldownUntil is set
  4. Agent stops making requests even though rate limits are not exceeded
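To check step 3 without reading the file by hand, a small helper along these lines may be useful. It is a sketch that assumes the same layout the workaround script in this issue relies on (a top-level `profiles` list whose entries carry `cooldownUntil` and `failureCounts`); the function names are hypothetical, and the format of the `cooldownUntil` value is not known here, so it is reported unchanged.

```python
import json


def load_auth_profiles(path: str) -> dict:
    """Read the auth profile state file from disk."""
    with open(path) as f:
        return json.load(f)


def rate_limit_cooldowns(data: dict) -> list:
    """List (index, rate_limit_count, cooldownUntil) for profiles in cooldown.

    Assumes 'profiles' is a list of objects, as in the workaround script.
    """
    rows = []
    for index, profile in enumerate(data.get("profiles", [])):
        if "cooldownUntil" not in profile:
            continue  # this profile is not in cooldown
        count = profile.get("failureCounts", {}).get("rate_limit", 0)
        rows.append((index, count, profile["cooldownUntil"]))
    return rows
```

Calling `rate_limit_cooldowns(load_auth_profiles("auth-profiles.json"))` after a 500 or 503 should return a non-empty list if the misclassification described here has occurred.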

Workaround

Manually clear cooldowns in auth-profiles.json:

python3 -c "
import json

# Load the current auth profile state
with open('auth-profiles.json') as f:
    data = json.load(f)

# Drop the cooldown and failure-tracking fields from every profile
for p in data.get('profiles', []):
    for key in ['cooldownUntil', 'errorCount', 'failureCounts', 'lastFailureAt']:
        p.pop(key, None)

# Write the cleaned state back in place
with open('auth-profiles.json', 'w') as f:
    json.dump(data, f, indent=2)
"

Environment

  • OpenClaw Gateway v2026.2.19
  • Google Gemini Paid Tier 1
  • Model: gemini-3-pro-preview
