When using a custom OpenAI-compatible provider behind litellm (Open WebUI 0.8.12 with litellm as the backend), context overflow errors are not detected by isContextOverflow(), so overflow-triggered auto-compaction never fires.
Error message:
Error: 503 litellm.ServiceUnavailableError: litellm.MidStreamFallbackError: litellm.APIConnectionError: APIConnectionError: OpenAIException - Requested token count exceeds the model's maximum context length of 131072 tokens.
What's happening:
The error message contains "exceeds the model's maximum context length", but the existing patterns in @earendil-works/pi-ai/utils/overflow.ts appear to expect different phrasings, e.g. "exceeds the context window" or "maximum context length is … tokens". The wording from litellm/OpenAI ("exceeds the model's maximum context length of … tokens") doesn't match any of them. Additionally, the error is wrapped in a litellm ServiceUnavailableError / MidStreamFallbackError / APIConnectionError chain and prefixed with HTTP 503, which could also interfere with matching if any pattern is anchored to a specific status code or error prefix.
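A minimal sketch of a pattern that would catch this wording, for illustration only. The names LITELLM_OVERFLOW_PATTERN and matchesLitellmOverflow are hypothetical and not taken from overflow.ts; how isContextOverflow() actually combines its patterns is not shown here. The regex is deliberately unanchored, so the "503 litellm.ServiceUnavailableError: ..." wrapper prefix does not affect the match.

```typescript
// Hypothetical pattern (not from overflow.ts): matches the litellm/OpenAI wording
// "exceeds the model's maximum context length", optionally followed by "of N tokens".
// Unanchored and case-insensitive, so wrapper prefixes and status codes are ignored.
const LITELLM_OVERFLOW_PATTERN: RegExp =
  /exceeds the model['’]s maximum context length(?: of [\d,]+ tokens)?/i;

// Hypothetical helper: returns true when the message looks like this overflow error.
function matchesLitellmOverflow(message: string): boolean {
  return LITELLM_OVERFLOW_PATTERN.test(message);
}

// The error string reported above, including the full wrapper chain.
const reportedError: string =
  "Error: 503 litellm.ServiceUnavailableError: litellm.MidStreamFallbackError: " +
  "litellm.APIConnectionError: APIConnectionError: OpenAIException - " +
  "Requested token count exceeds the model's maximum context length of 131072 tokens.";

// An unrelated 503 that should not be treated as an overflow.
const unrelatedError: string =
  "Error: 503 litellm.ServiceUnavailableError: upstream connection timed out";

console.log(matchesLitellmOverflow(reportedError));
console.log(matchesLitellmOverflow(unrelatedError));
```

Matching on the message substring rather than on the status code keeps the check independent of how litellm wraps the upstream error, at the cost of depending on the exact English wording.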
Provider setup:
- Open WebUI 0.8.12 (litellm backend), api: "openai-completions"
- Model: glm-5.1-llmlb, contextWindow: 131072, maxTokens: 16384, reasoning: true
- pi 0.74.0
pi config:
{
  "compaction": {
    "enabled": true,
    "reserveTokens": 32000,
    "keepRecentTokens": 12000
  }
}
Model entry in models.json:
{
  "id": "glm-5.1-llmlb",
  "contextWindow": 131072,
  "maxTokens": 16384,
  "reasoning": true
}