Switching to a smaller-context model mid-session causes hard crash instead of graceful truncation #44303

## Bug Description

When a user switches mid-session from a large-context model (e.g., GPT 5.4 at 1.05M tokens or Claude Opus at 200K) to a smaller-context model (e.g., Ollama `qwen3.5:122b-a10b` at 128K), the gateway locks up or crashes if the accumulated session context exceeds the new model's context window.

The user is forced to manually restart the gateway and reset the session, losing all in-flight context.

## Steps to Reproduce

1. Start a session with a large-context model (e.g., `openai-codex/gpt-5.4` with 1.05M context)
2. Accumulate significant context (tool calls, file reads, conversation history — typical working session of 30+ minutes)
3. Switch to a smaller-context model via `/model qwen` (128K context) or similar
4. Gateway becomes unresponsive — no error message returned to user, no graceful degradation

## Expected Behavior

When switching to a model with a smaller context window, OpenClaw should:

1. **Detect** that current session context exceeds the target model's context window *before* sending the request
2. **Automatically truncate/compact** the context to fit the new model's window (emergency compaction)
3. **Warn the user** that context was truncated to accommodate the smaller model (e.g., "⚠️ Switching to qwen (128K context). Session context was 180K tokens — older messages have been compacted to fit.")
4. **Continue the session** without requiring a manual restart or reset

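If truncation is not possible (for example, because the system prompt and workspace files alone exceed the new model's window) or would discard too much context, the switch should **fail gracefully** with a clear error message and keep the current model active.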
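
The decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not OpenClaw's actual code: `planModelSwitch`, `SwitchPlan`, and the `fixedTokens` parameter (the system prompt plus workspace files, assumed here to be non-compactable) are hypothetical names, and token counts are assumed to be known before the switch is applied.

```typescript
// Hypothetical sketch: none of these names come from the OpenClaw codebase.
type SwitchPlan =
  | { action: "switch" }                        // context already fits, switch directly
  | { action: "compact"; tokensToFree: number } // compact first, then switch
  | { action: "reject"; reason: string };       // keep the current model active

function planModelSwitch(
  currentContextTokens: number, // tokens currently held by the session
  fixedTokens: number,          // system prompt + workspace files (assumed non-compactable)
  targetContextWindow: number,  // context window of the model being switched to
): SwitchPlan {
  // Nothing to do when the session already fits the new window.
  if (currentContextTokens <= targetContextWindow) {
    return { action: "switch" };
  }
  // Compaction cannot help if the fixed portion alone exceeds the window.
  if (fixedTokens > targetContextWindow) {
    return {
      action: "reject",
      reason:
        `fixed context needs ${fixedTokens} tokens, ` +
        `which exceeds the target window of ${targetContextWindow}`,
    };
  }
  // Otherwise report how many tokens compaction has to free.
  return {
    action: "compact",
    tokensToFree: currentContextTokens - targetContextWindow,
  };
}
```

A real check would probably also reserve headroom for the model's reply, so the effective budget would be somewhat smaller than the raw context window.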

## Actual Behavior

- Gateway locks up entirely (no response to user, Telegram polling stalls)
- Gateway log shows: `Unsubscribed during compaction` and `Gateway is draining for restart; new tasks are not accepted`
- User must manually restart the gateway and reset the session
- All in-flight context is lost
- Any background processes (bots, cron jobs) may be disrupted by the forced restart

## Evidence from Logs

```
2026-03-12T12:28:41.474-04:00 [telegram] Restarting polling after unhandled network error: Unsubscribed during compaction
2026-03-12T12:28:41.539-04:00 Followup agent failed before reply: Gateway is draining for restart; new tasks are not accepted
```

A prior occurrence on Mar 10 produced a cleaner error but still required manual intervention:

```
2026-03-10T14:19:50.377-04:00 [agent/embedded] embedded run agent end: isError=true error=Context overflow: prompt too large for the model.
```

## Related Issues

- #17034 — Memory flush `softThresholdTokens` doesn't scale with context window size (same family of context-window-mismatch issues, but focused on compaction thresholds rather than model switching)
- #9409 — Improve context overflow error message
- #41594 — Per-model context profiles for workspace injection and tool visibility
- #12739 — `/status` should use per-model contextWindow

## Suggested Implementation

1. **Pre-flight check on `/model` switch:** Before applying a model change, compare `currentContextTokens` against `newModel.contextWindow`. If it exceeds the new window, trigger an emergency compaction pass first.
2. **Emergency compaction for model switch:** A targeted compaction that aggressively summarizes older turns to fit within the new model's budget, preserving the most recent context and system prompt.
3. **Graceful fallback:** If emergency compaction still can't fit (e.g., system prompt + workspace files alone exceed the new model's window), reject the switch with a clear error: "Cannot switch to [model] — system prompt and workspace context alone require [X]K tokens, which exceeds [model]'s [Y]K context window."
4. **User warning on success:** When context is successfully truncated, inform the user what was lost so they can re-establish context if needed.
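
As a rough sketch of steps 2 to 4, the function below fits a session into a smaller window and reports what was removed. It is a simplification, not a proposed patch: the `Turn` shape, `emergencyCompact`, and the pre-computed `tokens` field are all hypothetical, and it drops the oldest turns outright rather than summarizing them as step 2 suggests. A summarizing version would replace the dropped prefix with a summary turn and subtract that turn's token cost from the budget.

```typescript
// Hypothetical sketch: types and function names are illustrative, not OpenClaw APIs.
interface Turn {
  role: "system" | "user" | "assistant" | "tool";
  tokens: number; // assumed pre-computed token count for this turn
}

type CompactionResult =
  | { ok: true; kept: Turn[]; droppedTurns: number; droppedTokens: number }
  | { ok: false; error: string };

function emergencyCompact(
  turns: Turn[],
  modelName: string,
  contextWindow: number,
): CompactionResult {
  // System turns are always preserved; everything else is a candidate for dropping.
  const system = turns.filter((t) => t.role === "system");
  const rest = turns.filter((t) => t.role !== "system");
  const fixedTokens = system.reduce((sum, t) => sum + t.tokens, 0);

  // Graceful fallback: the fixed portion alone does not fit, so reject the switch.
  if (fixedTokens > contextWindow) {
    return {
      ok: false,
      error:
        `Cannot switch to ${modelName}: system prompt and workspace context ` +
        `alone require ${fixedTokens} tokens, which exceeds ${modelName}'s ` +
        `${contextWindow}-token context window.`,
    };
  }

  // Walk from newest to oldest, keeping turns while they fit the remaining budget.
  let budget = contextWindow - fixedTokens;
  const keptRecent: Turn[] = [];
  for (const turn of [...rest].reverse()) {
    if (turn.tokens > budget) break;
    budget -= turn.tokens;
    keptRecent.unshift(turn); // restore chronological order
  }

  // Everything older than the kept suffix is dropped and reported to the caller.
  const dropped = rest.slice(0, rest.length - keptRecent.length);
  return {
    ok: true,
    kept: [...system, ...keptRecent],
    droppedTurns: dropped.length,
    droppedTokens: dropped.reduce((sum, t) => sum + t.tokens, 0),
  };
}
```

The `droppedTurns` and `droppedTokens` fields give the caller what it needs to build the user-facing warning in step 4, while the `ok: false` branch carries the rejection message from step 3 so the current model can stay active.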

## Environment

- OpenClaw v2026.3.11 (29dc654)
- Source model: `openai-codex/gpt-5.4` (1.05M context)
- Target model: `ollama/qwen3.5:122b-a10b` (128K context)
- Channel: Telegram
- Host: macOS (Darwin arm64)

