Bug Description
When context compression runs with summary_model_override=None (the default) and the Groq API returns a 413 TPM (tokens per minute) error, the code does not retry on the main model (Qwen3 local). It only waits for the cooldown and then returns None.
This happens because in context_compressor.py, line ~939:
if (self.summary_model
        and self.summary_model != self.model):
When summary_model_override=None, self.summary_model becomes an empty string, which is falsy. The condition short-circuits on its first operand, so the retry block for the main model is never entered.
So when Groq hits 413 (TPM exhausted), there's no fallback to Qwen3 local.
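The short-circuit can be reproduced in isolation. The sketch below is not the project's code: `should_retry_on_main` is a hypothetical helper that mirrors the quoted guard, the model names are placeholders, and it assumes a None override is normalized to an empty string before the guard runs, as described above.

```python
def should_retry_on_main(summary_model: str, model: str) -> bool:
    """Hypothetical mirror of the guard: retry only if a distinct summary model is set."""
    return bool(summary_model and summary_model != model)


# Assumption: summary_model_override=None ends up as "" on the instance.
no_override = None or ""

# Placeholder model names, used for illustration only.
print(should_retry_on_main(no_override, "qwen3-local"))   # False: "" is falsy, guard short-circuits
print(should_retry_on_main("groq-model", "qwen3-local"))  # True: distinct, non-empty summary model
```

With no override, the first operand is already falsy, so the comparison against the main model is never evaluated and the retry is skipped.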
Root Cause
The guard assumes self.summary_model holds a truthy model name, but a None override becomes an empty string, which is falsy. The condition should also handle the case where summary_model_override was never set (is None) and retry on the main model in that case.
Evidence (from logs)
ContextCompressor summary failed with HTTP 413: Requested tokens exceed the maximum tokens per minute for this model.
ContextCompressor cooldown: 60s before next retry attempt
ContextCompressor no more retries, returning None (compressed 0 messages)
No Qwen3 fallback appears between the Groq retries, which is consistent with the condition evaluating to False when summary_model="".
Proposed Fix
The fix should treat the "no override set" case (summary_model_override is None, so self.summary_model is empty) separately, instead of letting it skip the retry block:
# Do not gate on self.summary_model being truthy; always evaluate the retry condition.
if (not self.summary_model and self.model) or (self.summary_model != self.model):
    # Retry on main model when no override is set, OR when different models are configured
    ...
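To show how the proposed condition would behave end to end, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: `RateLimited`, `summarize_with_fallback`, and the two callables are hypothetical stand-ins, not names from context_compressor.py, and the real retry and cooldown logic is omitted.

```python
from typing import Callable, Optional


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 413 TPM error from the summary provider."""


def summarize_with_fallback(
    text: str,
    summary_model: str,
    model: str,
    call_summary: Callable[[str], str],
    call_main: Callable[[str], str],
) -> Optional[str]:
    """Try the summary provider first; on a rate limit, fall back to the main model."""
    try:
        return call_summary(text)
    except RateLimited:
        # Proposed guard: also fall back when no override is set (summary_model == "").
        if (not summary_model and model) or (summary_model != model):
            return call_main(text)
        return None


def failing_summary(text: str) -> str:
    # Simulates the provider rejecting the request with a TPM error.
    raise RateLimited("tokens per minute exceeded")


def main_summary(text: str) -> str:
    # Simulates the main (local) model producing a summary.
    return "summary via main model"


# No override set: the fallback to the main model now runs.
result = summarize_with_fallback("long context", "", "qwen3-local", failing_summary, main_summary)
print(result)
```

When the summary model and the main model are the same, the guard still returns None, since retrying on an identical model would not help.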
Impact
- Groq free tier users with summary_model_override=None get zero compression once TPM is exhausted
- Context keeps growing until session restarts