Problem
When compaction.model points to a local or self-hosted model (e.g. vllm/qwen3-coder-next), compaction fails immediately if that model server is unreachable. There is no fallback mechanism: compaction returns ok: false, and the session continues to grow until it hits overflow recovery.
This is especially common when:
- A local model (vLLM, llamacpp, Ollama) is the primary compaction model for cost savings
- The GPU server is on a separate machine that may go down
- Hosted models are configured as fallbacks for regular agent turns but compaction has no equivalent
Proposed Solution
Add a modelFallbacks array to AgentCompactionConfig. When resolveModelAsync fails for the primary compaction model, iterate through modelFallbacks in order until one resolves successfully.
Config example:
compaction.model: vllm/qwen3-coder-next
compaction.modelFallbacks: [anthropic/claude-haiku-4-5]
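A minimal sketch of the proposed resolution order, assuming resolveModelAsync rejects when a model server is unreachable. The AgentCompactionConfig shape and the resolveCompactionModel helper name are illustrative, not existing API:

```typescript
// Illustrative shape; real AgentCompactionConfig may have more fields.
interface AgentCompactionConfig {
  model: string;
  modelFallbacks?: string[];
}

type ResolveFn = (model: string) => Promise<string>;

// Try the primary compaction model, then each fallback in order.
// Rethrows the last error only if every candidate fails to resolve.
async function resolveCompactionModel(
  config: AgentCompactionConfig,
  resolveModelAsync: ResolveFn,
): Promise<string> {
  const candidates = [config.model, ...(config.modelFallbacks ?? [])];
  let lastError: unknown;
  for (const model of candidates) {
    try {
      return await resolveModelAsync(model);
    } catch (err) {
      lastError = err; // server unreachable or model unknown; try next
    }
  }
  throw lastError;
}
```

When modelFallbacks is unset, candidates contains only the primary model, so existing behavior is unchanged.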
Impact
- Prevents compaction failures when local model servers are temporarily unavailable
- Maintains cost optimization (try free local model first, fall back to cheap hosted)
- No behavior change when modelFallbacks is unset (backward compatible)