Bug Description
When a user switches from a large-context model (e.g., GPT 5.4 at 1.05M tokens or Claude Opus at 200K) to a smaller-context model (e.g., Ollama qwen3.5:122b-a10b at 128K) mid-session, the gateway crashes/locks up if the accumulated session context exceeds the new model's context window.
The user is forced to manually restart the gateway and reset the session, losing all in-flight context.
Steps to Reproduce
- Start a session with a large-context model (e.g., openai-codex/gpt-5.4 with 1.05M context)
- Accumulate significant context (tool calls, file reads, conversation history — typical working session of 30+ minutes)
- Switch to a smaller-context model via /model qwen (128K context) or similar
- Gateway becomes unresponsive — no error message returned to user, no graceful degradation
Expected Behavior
When switching to a model with a smaller context window, OpenClaw should:
- Detect that current session context exceeds the target model's context window before sending the request
- Automatically truncate/compact the context to fit the new model's window (emergency compaction)
- Warn the user that context was truncated to accommodate the smaller model (e.g., "⚠️ Switching to qwen (128K context). Session context was 180K tokens — older messages have been compacted to fit.")
- Continue the session without requiring a manual restart or reset
If truncation is not possible or would lose too much context, the switch should fail gracefully with a clear error message and keep the current model active.
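The expected behavior above amounts to a three-way decision made before any request is sent to the new model. The following is a minimal sketch of that decision, not OpenClaw's actual code: ModelInfo, SwitchPlan, planModelSwitch, and the reserveTokens headroom value are all hypothetical names and numbers chosen for illustration.

```typescript
// Hypothetical types for illustration; not taken from the OpenClaw codebase.
interface ModelInfo {
  id: string;
  contextWindow: number; // size of the model's context window, in tokens
}

type SwitchPlan =
  | { action: "proceed" }
  | { action: "compact"; targetTokens: number; warning: string }
  | { action: "reject"; error: string };

// Decide what a model switch should do before the first request is sent.
// sessionTokens: current size of the accumulated session context.
// fixedTokens:   tokens that cannot be compacted (system prompt, workspace files).
// reserveTokens: headroom left free for the model's reply (assumed value).
function planModelSwitch(
  sessionTokens: number,
  fixedTokens: number,
  target: ModelInfo,
  reserveTokens: number = 8000,
): SwitchPlan {
  const budget = target.contextWindow - reserveTokens;
  const k = (n: number): string => `${Math.round(n / 1000)}K`;

  // Case 1: the session already fits, so the switch is safe as-is.
  if (sessionTokens <= budget) {
    return { action: "proceed" };
  }

  // Case 2: even the non-compactable part does not fit, so keep the current model.
  if (fixedTokens > budget) {
    return {
      action: "reject",
      error:
        `Cannot switch to ${target.id}: system prompt and workspace context alone ` +
        `require ${k(fixedTokens)} tokens, which does not fit in ${target.id}'s ` +
        `${k(target.contextWindow)} context window.`,
    };
  }

  // Case 3: compaction can help; report what will happen to the user.
  return {
    action: "compact",
    targetTokens: budget,
    warning:
      `Switching to ${target.id} (${k(target.contextWindow)} context). ` +
      `Session context was ${k(sessionTokens)} tokens; ` +
      `older messages have been compacted to fit.`,
  };
}
```

With a 180K-token session and a 128K target, this sketch returns the "compact" plan. The caller would apply the model change only after compaction succeeds, so a failure leaves the current model active.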
Actual Behavior
- Gateway locks up entirely (no response to user, Telegram polling stalls)
- Gateway log shows: "Unsubscribed during compaction" and "Gateway is draining for restart; new tasks are not accepted"
- User must manually restart the gateway and reset the session
- All in-flight context is lost
- Any background processes (bots, cron jobs) may be disrupted by the forced restart
Evidence from Logs
2026-03-12T12:28:41.474-04:00 [telegram] Restarting polling after unhandled network error: Unsubscribed during compaction
2026-03-12T12:28:41.539-04:00 Followup agent failed before reply: Gateway is draining for restart; new tasks are not accepted
A prior occurrence (Mar 10) showed a cleaner error but still required manual intervention:
2026-03-10T14:19:50.377-04:00 [agent/embedded] embedded run agent end: isError=true error=Context overflow: prompt too large for the model.
Suggested Implementation
- Pre-flight check on /model switch: Before applying a model change, compare currentContextTokens against newModel.contextWindow. If it exceeds the new window, trigger an emergency compaction pass first.
- Emergency compaction for model switch: A targeted compaction that aggressively summarizes older turns to fit within the new model's budget, preserving the most recent context and system prompt.
- Graceful fallback: If emergency compaction still can't fit (e.g., system prompt + workspace files alone exceed the new model's window), reject the switch with a clear error: "Cannot switch to [model] — system prompt and workspace context alone require [X]K tokens, which exceeds [model]'s [Y]K context window."
- User warning on success: When context is successfully truncated, inform the user what was lost so they can re-establish context if needed.
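One possible shape for the emergency compaction pass is sketched below. It assumes per-turn token counts are already known, that system turns can be moved to the front, and that a summarizer is available. Turn, CompactionResult, emergencyCompact, and the summarize callback are hypothetical names, not part of OpenClaw.

```typescript
// Hypothetical message shape; token counts are assumed to be precomputed.
interface Turn {
  role: "system" | "user" | "assistant" | "tool";
  text: string;
  tokens: number;
}

interface CompactionResult {
  ok: boolean;
  turns: Turn[];        // compacted history on success, original history on failure
  droppedTurns: number; // number of older turns folded into the summary
  error?: string;
}

// Keep the system prompt and as many of the most recent turns as fit, and
// replace everything older with one summary turn of at most summaryBudget tokens.
// `summarize` stands in for whatever summarizer the gateway would use.
function emergencyCompact(
  turns: Turn[],
  budget: number,
  summaryBudget: number,
  summarize: (older: Turn[], maxTokens: number) => Turn,
): CompactionResult {
  const system = turns.filter((t) => t.role === "system");
  const rest = turns.filter((t) => t.role !== "system");
  const fixed = system.reduce((sum, t) => sum + t.tokens, 0);
  const total = fixed + rest.reduce((sum, t) => sum + t.tokens, 0);

  // Nothing to do if the history already fits.
  if (total <= budget) {
    return { ok: true, turns, droppedTurns: 0 };
  }

  // Graceful fallback: the system prompt plus a summary cannot fit at all.
  const available = budget - fixed - summaryBudget;
  if (available < 0) {
    return {
      ok: false,
      turns,
      droppedTurns: 0,
      error: `System prompt requires ${fixed} tokens, which leaves no room within the ${budget}-token budget.`,
    };
  }

  // Walk from the newest turn backwards, keeping turns while they fit.
  let used = 0;
  let kept = 0;
  for (const t of [...rest].reverse()) {
    if (used + t.tokens > available) break;
    used += t.tokens;
    kept++;
  }
  const cut = rest.length - kept;
  const older = rest.slice(0, cut);
  const recent = rest.slice(cut);

  const summary = summarize(older, summaryBudget);
  if (summary.tokens > summaryBudget) {
    return { ok: false, turns, droppedTurns: 0, error: "Summary exceeded its token budget." };
  }
  return { ok: true, turns: [...system, summary, ...recent], droppedTurns: older.length };
}
```

On failure the original history is returned untouched, which lets the caller reject the switch and keep the current model. The droppedTurns count gives the user-facing warning something concrete to report.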
Environment
- OpenClaw v2026.3.11 (29dc654)
- Source model: openai-codex/gpt-5.4 (1.05M context)
- Target model: ollama/qwen3.5:122b-a10b (128K context)
- Channel: Telegram
- Host: macOS (Darwin arm64)
Related Issues
- softThresholdTokens doesn't scale with context window size (same family of context-window-mismatch issues, but focused on compaction thresholds rather than model switching)
- /status should use per-model contextWindow