Plugin Type
VSCode Extension
Cline Version
latest (static analysis finding — not version-specific)
What happened?
src/core/api/retry.ts reads retry-after correctly for delay calculation, but all 429s enter the same retry path regardless of what the retry-after value signals.
A retry-after of 30 seconds means transient pressure — retry makes sense.
A retry-after of 3+ hours means quota exhaustion — retrying wastes the attempt budget and delays surfacing the real error to the user.
Current behavior: retry-after sets the delay but never prevents retry. A quota exhaustion 429 will wait 3 hours, retry, hit the same wall, wait 3 hours again, then fail on attempt 3.
Expected behavior: retry-after above a threshold should surface immediately as a hard stop, not consume retry attempts.
Suggested fix:
const STOP_THRESHOLD_MS = 5 * 60 * 1000 // 5 minutes
if (delay > STOP_THRESHOLD_MS) {
throw error // quota exhaustion — don't retry
}
Documented across production systems:
github.com/SirBrenton/pitstop-truth
Steps to reproduce
- Configure any provider with a quota-based rate limit (Anthropic, OpenAI, Gemini)
- Exhaust the quota — trigger a 429 with a long retry-after (e.g., retry-after: 11180000 for ~3 hours)
- Observe that withRetry() honors the delay but still retries up to maxRetries times
- Each retry hits the same quota exhaustion and waits the full delay again before the error surfaces
Provider/Model
- Configure any provider with a quota-based rate limit (Anthropic, OpenAI, Gemini) 2. Exhaust the quota — trigger a 429 with a long retry-after (e.g., retry-after: 11180000 for ~3 hours) 3. Observe that withRetry() honors the delay but still retries up to maxRetries times 4. Each retry hits the same quota exhaustion and waits the full delay again before the error surfaces
System Information
Any provider returning 429 with retry-after > 300 seconds
(confirmed: anthropic/claude-sonnet-4-6, openai)
Plugin Type
VSCode Extension
Cline Version
latest (static analysis finding — not version-specific)
What happened?
src/core/api/retry.ts reads retry-after correctly for delay calculation, but all 429s enter the same retry path regardless of what the retry-after value signals.
A retry-after of 30 seconds means transient pressure — retry makes sense.
A retry-after of 3+ hours means quota exhaustion — retrying wastes the attempt budget and delays surfacing the real error to the user.
Current behavior: retry-after sets the delay but never prevents retry. A quota exhaustion 429 will wait 3 hours, retry, hit the same wall, wait 3 hours again, then fail on attempt 3.
Expected behavior: retry-after above a threshold should surface immediately as a hard stop, not consume retry attempts.
Suggested fix:
const STOP_THRESHOLD_MS = 5 * 60 * 1000 // 5 minutes
if (delay > STOP_THRESHOLD_MS) {
throw error // quota exhaustion — don't retry
}
Documented across production systems:
github.com/SirBrenton/pitstop-truth
Steps to reproduce
Provider/Model
System Information
Any provider returning 429 with retry-after > 300 seconds
(confirmed: anthropic/claude-sonnet-4-6, openai)