Empty claude-cli subprocess responses misclassified as billing cooldown #83231
Closed
Labels
- `P1`: High-priority user-facing bug, regression, or broken workflow.
- `clawsweeper:needs-live-repro`: ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
- `clawsweeper:needs-maintainer-review`: ClawSweeper marked this issue as needing maintainer review before automation.
- `clawsweeper:needs-product-decision`: ClawSweeper marked this issue as needing a product or behavior decision.
- `clawsweeper:no-new-fix-pr`: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
- `impact:auth-provider`: Auth, provider routing, model choice, or SecretRef resolution may break.
- `impact:session-state`: Session, memory, transcript, context, or agent state can drift or corrupt.
- `issue-rating: 🐚 platinum hermit`: Good issue quality with a plausible reproduction path needing some confirmation.
Version
OpenClaw 2026.5.12 (build f066dd2), Node claude-cli provider, Linux.
Summary
When the bundled Claude CLI subprocess returns a zero-token, no-text completion (no error, no abort, no timeout), the provider classifier records it as a `billing` failure in `~/.openclaw/agents/main/agent/auth-state.json`. Three such responses trip a cooldown on the profile, after which every subsequent run on that profile aborts in ~300 ms with no model call and no trajectory file.
This is not a real billing/wallet condition: the user's Claude account is funded and other clients on the same account succeed.
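The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not OpenClaw's code: `ProfileState`, `recordFailure`, `canRun`, the `threshold` and `cooldownMs` parameters, and the epoch-millisecond representation of `disabledUntil` are all hypothetical. Only the field names `failureCounts`, `disabledUntil`, and `disabledReason` appear in the report.

```typescript
// Toy model of the reported behaviour. NOT OpenClaw's implementation:
// function names, parameters, and the timestamp format are assumptions.
interface ProfileState {
  failureCounts: Record<string, number>;
  disabledUntil?: number; // epoch milliseconds (assumed representation)
  disabledReason?: string;
}

// Record one failure under `reason`; once the count reaches `threshold`,
// put the profile into cooldown for `cooldownMs` milliseconds.
function recordFailure(
  state: ProfileState,
  reason: string,
  now: number,
  threshold: number,
  cooldownMs: number,
): ProfileState {
  const count = (state.failureCounts[reason] ?? 0) + 1;
  const next: ProfileState = {
    ...state,
    failureCounts: { ...state.failureCounts, [reason]: count },
  };
  if (count >= threshold) {
    next.disabledUntil = now + cooldownMs;
    next.disabledReason = reason;
  }
  return next;
}

// A run may start only if the profile is not currently cooled down.
function canRun(state: ProfileState, now: number): boolean {
  return state.disabledUntil === undefined || now >= state.disabledUntil;
}
```

In this model, if every empty response is recorded with `reason = "billing"` and `threshold = 3`, the third call sets `disabledUntil`, and `canRun` returns `false` until the window elapses. That would match the symptom of runs aborting before any model call.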
Reproduction signature
After a run completes with `status: "success"` but an empty assistant reply, `auth-state.json` shows:
The corresponding trajectory's `model.completed` event has:
After three such events, the profile is cooled down:
Subsequent runs fail in ~300 ms with:
and no `<sessionId>.jsonl` / `.trajectory.jsonl` is created; the run aborts before any model call.
Expected
A zero-token, no-text, no-error response from the Claude CLI subprocess should not be classified as a billing failure. Either:
- … `empty-response`) and apply a separate cooldown policy, or
Actual
The classifier maps the empty response to `billing`. The cooldown trips after 3 consecutive empty responses (which happen organically during normal Discord/cron load), and all dependent jobs fail until the cooldown window elapses or the state file is manually edited.
Workaround attempted
Adding a systemd `ExecStartPre` that clears `disabledUntil` / `disabledReason` / `errorCount` / `failureCounts` from `auth-state.json` on gateway start works at startup, but is not sufficient: during a 16-minute window between a gateway restart and a manual cron trigger, two ordinary Discord/cron sessions re-tripped `failureCounts.billing` to 3.
`ExecStartPre` snippet (for reference):
Suggested fix locations
The classifier path that maps subprocess result → failure category should distinguish between:
- … `billing` (current behaviour, correct)
- `usage.total_tokens == 0` and `assistantTexts == []` → new bucket (or no-op)
Impact
Every cron job and Discord session that depends on `anthropic:claude-cli` becomes unreliable after ~3 empty-response coincidences. Operators see this as "claude-cli is dead" or "billing issue" with no actual billing problem. The cooldown is invisible from `openclaw health` (it shows `Discord: configured` and gateway healthy) and only visible by reading `auth-state.json` directly or seeing the `FallbackSummaryError` in `openclaw cron runs`.
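The distinction proposed under "Suggested fix locations" could be sketched roughly as follows. This is a minimal illustration, not OpenClaw's classifier: the `SubprocessResult` shape, the `errorMessage` field, the `classify` function, and the caller-supplied `looksLikeBilling` predicate are hypothetical. Only `usage.total_tokens`, `assistantTexts`, and the `billing` / `empty-response` names come from the issue text.

```typescript
// Hypothetical sketch; the real subprocess result shape is not shown in the issue.
type FailureCategory = "billing" | "empty-response" | null;

interface SubprocessResult {
  usage: { total_tokens: number };
  assistantTexts: string[];
  errorMessage?: string; // assumed: present only when the CLI reported an error
}

// Zero tokens, no text, and no error: the case this issue is about.
function isEmptyResponse(r: SubprocessResult): boolean {
  return (
    r.errorMessage === undefined &&
    r.usage.total_tokens === 0 &&
    r.assistantTexts.length === 0
  );
}

// The empty-response check runs first, so an empty reply can never be
// counted as a billing failure. `looksLikeBilling` stands in for whatever
// logic recognises genuine billing errors.
function classify(
  r: SubprocessResult,
  looksLikeBilling: (message: string) => boolean,
): FailureCategory {
  if (isEmptyResponse(r)) return "empty-response";
  if (r.errorMessage !== undefined && looksLikeBilling(r.errorMessage)) {
    return "billing";
  }
  return null; // not a failure this sketch knows about
}
```

Whether `empty-response` gets its own cooldown policy or is treated as a no-op is the open product decision. The sketch only shows that the two cases can be separated before any counter is incremented.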