Summary
When zhipuai (GLM-5) returns error code 1310: Weekly/Monthly Limit Exhausted, the failover system does not recognize the error and the configured fallback models are never attempted. The agent run ends with the primary model's error even though fallback candidates (e.g. moonshot/kimi-for-coding) are healthy and available.
Steps to reproduce
- Configure zhipuai/glm-5 as the primary model, with fallback models listed in agents.defaults.model.fallbacks
- Exhaust the zhipuai weekly/monthly API quota
- Send a message to the agent
Expected: Fallback to the next configured model (e.g. moonshot)
Actual: The run ends with isError=true, no fallback is triggered, and the user sees the unresolved error
Evidence from logs
2026-03-03T06:15:47Z [agent/embedded] embedded run agent end: runId=930be05d isError=true error=LLM error 1310: Weekly/Monthly Limit Exhausted. Your limit will reset at 2026-03-06 22:19:54 (request_id: 20260303141547610b7f574d1b44cb)
This error repeated 20+ times over ~17 hours (06:15 UTC through 23:15 UTC on 2026-03-03), and a fallback model was never attempted.
Root cause
Two layers prevent fallback from triggering:
1. Pattern not recognized (failover-matches.ts)
ERROR_PATTERNS.rateLimit does not match "Weekly/Monthly Limit Exhausted":
// src/agents/pi-embedded-helpers/failover-matches.ts:4-15
rateLimit: [
/rate[_ ]limit|too many requests|429/,
"exceeded your current quota",
"resource has been exhausted", // "limit exhausted" — no match
"quota exceeded",
"resource_exhausted",
"usage limit", // "Limit Exhausted" — no match
...
]
classifyFailoverReason() returns null.
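To make the miss concrete, here is a minimal sketch that runs the logged error text against the patterns quoted above. The matchesAny helper is hypothetical: it assumes string patterns are compared as case-insensitive substrings and RegExp patterns via test(), which may differ from what failover-matches.ts actually does. The entries elided by "..." in the excerpt are left out.

```typescript
// Hypothetical matcher, NOT the real OpenClaw implementation.
// Assumption: strings are case-insensitive substrings, regexes use test().
type ErrorPattern = string | RegExp;

// Patterns copied from the excerpt above (elided entries omitted).
const currentRateLimitPatterns: ErrorPattern[] = [
  /rate[_ ]limit|too many requests|429/,
  "exceeded your current quota",
  "resource has been exhausted",
  "quota exceeded",
  "resource_exhausted",
  "usage limit",
];

function matchesAny(message: string, patterns: ErrorPattern[]): boolean {
  const lowered = message.toLowerCase();
  return patterns.some((p) =>
    typeof p === "string"
      ? lowered.indexOf(p.toLowerCase()) !== -1
      : p.test(lowered)
  );
}

// Error text taken from the log line in this report (request_id dropped).
const glmError =
  "LLM error 1310: Weekly/Monthly Limit Exhausted. " +
  "Your limit will reset at 2026-03-06 22:19:54";

const matchedByCurrent = matchesAny(glmError, currentRateLimitPatterns);
console.log(matchedByCurrent); // prints: false
```

Under these assumptions none of the listed entries match: the message contains "limit exhausted" and "monthly limit", but neither "resource has been exhausted" nor "usage limit".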
2. Inner runner skips rotation (run.ts)
Because the error is unclassified:
// src/agents/pi-embedded-runner/run.ts:1105
const failoverFailure = isFailoverAssistantError(lastAssistant); // → false
// run.ts:1142
const shouldRotate = (!aborted && failoverFailure) || (timedOut && !timedOutDuringCompaction);
// → false
shouldRotate is false, so the inner runner returns normally with the error payload. The outer runWithModelFallback loop treats this as a success and never tries the next candidate.
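The interaction between the two layers can be modeled roughly as follows. This is a simplified sketch, not the real run.ts or runWithModelFallback code: runInner, runWithFallback and the isRecognized callback are hypothetical names, and it assumes the outer loop only advances to the next candidate when the inner run throws. Only the shouldRotate expression is taken from the excerpt above.

```typescript
// Simplified, hypothetical model of the two layers described above.
interface RunResult {
  model: string;
  isError: boolean;
  text: string;
}

interface RotationInputs {
  aborted: boolean;
  failoverFailure: boolean;
  timedOut: boolean;
  timedOutDuringCompaction: boolean;
}

// Same boolean expression as the shouldRotate line quoted from run.ts.
function shouldRotate(i: RotationInputs): boolean {
  return (
    (!i.aborted && i.failoverFailure) ||
    (i.timedOut && !i.timedOutDuringCompaction)
  );
}

// Hypothetical inner runner: providerError is null when the call succeeds.
function runInner(
  model: string,
  providerError: string | null,
  isRecognized: (msg: string) => boolean
): RunResult {
  if (providerError === null) {
    return { model, isError: false, text: "ok" };
  }
  const rotate = shouldRotate({
    aborted: false,
    failoverFailure: isRecognized(providerError),
    timedOut: false,
    timedOutDuringCompaction: false,
  });
  if (rotate) {
    // Assumption: throwing is what tells the outer loop to move on.
    throw new Error(`failover: ${providerError}`);
  }
  // Unclassified error: returned as a normal result with an error payload.
  return { model, isError: true, text: providerError };
}

// Hypothetical outer loop: the first run that returns (not throws) wins.
function runWithFallback(
  candidates: Array<[string, string | null]>,
  isRecognized: (msg: string) => boolean
): RunResult {
  let lastError: unknown = new Error("no candidates configured");
  for (const [model, providerError] of candidates) {
    try {
      return runInner(model, providerError, isRecognized);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

const candidates: Array<[string, string | null]> = [
  ["zhipuai/glm-5", "LLM error 1310: Weekly/Monthly Limit Exhausted"],
  ["moonshot/kimi-for-coding", null],
];

// Error not classified: the primary's error payload is returned as the result.
const unclassified = runWithFallback(candidates, () => false);
// Error classified: the inner run throws and the next candidate is tried.
const classified = runWithFallback(candidates, (msg) =>
  /limit exhausted/i.test(msg)
);

console.log(unclassified.model, unclassified.isError); // zhipuai/glm-5 true
console.log(classified.model, classified.isError); // moonshot/kimi-for-coding false
```

In this model the only difference between the two runs is whether the error is classified, which is why fixing the pattern list should be enough to restore fallback.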
Suggested fix
Add patterns to ERROR_PATTERNS.rateLimit in src/agents/pi-embedded-helpers/failover-matches.ts:
rateLimit: [
...existing,
"limit exhausted",
/weekly\/monthly limit/i,
]
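A small regression check for the two proposed entries might look like the sketch below. matchesPattern is a hypothetical stand-in for the real matcher (it assumes case-insensitive substring matching for strings), and the two negative sample messages are invented for illustration to guard against obvious false positives.

```typescript
// Hypothetical check for the proposed additions; not the project's test suite.
type ErrorPattern = string | RegExp;

// The two entries proposed above.
const proposedAdditions: ErrorPattern[] = [
  "limit exhausted",
  /weekly\/monthly limit/i,
];

// Assumption: strings match as case-insensitive substrings.
function matchesPattern(message: string, pattern: ErrorPattern): boolean {
  return typeof pattern === "string"
    ? message.toLowerCase().indexOf(pattern.toLowerCase()) !== -1
    : pattern.test(message);
}

const samples: Array<{ message: string; expected: boolean }> = [
  {
    // Error text from the log line in this report.
    message:
      "LLM error 1310: Weekly/Monthly Limit Exhausted. " +
      "Your limit will reset at 2026-03-06 22:19:54",
    expected: true,
  },
  // Invented messages that should NOT be classified as rate limits.
  { message: "Invalid API key provided", expected: false },
  { message: "Request timed out after 30s", expected: false },
];

const results = samples.map((s) =>
  proposedAdditions.some((p) => matchesPattern(s.message, p))
);

samples.forEach((s, i) => {
  const status = results[i] === s.expected ? "ok" : "MISMATCH";
  console.log(`${status}: ${s.message}`);
});
```

Either entry alone would catch the 1310 message. Whether a bare string such as "limit exhausted" is matched case-insensitively in the real code should be confirmed before relying on it without the regex.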
Environment
- OpenClaw version: 2026.3.2
- Provider: zhipuai (api.z.ai proxy)
- Primary model: zhipuai/glm-5
- Fallback models: moonshot/kimi-for-coding, minimax/MiniMax-M2.5, google/gemini-3.1-pro-preview, anthropic/claude-opus-4-6
- OS: macOS
Related issues
- "subscription quota limit" unrecognized (closed as stale without a fix)
- insufficient_quota unrecognized due to HTTP 400 masking and missing keywords (open)
- stop_reason: "error" unrecognized (fixed)