feat: /model auto — heuristic V4-Flash vs V4-Pro routing per turn

## Problem

DeepSeek-V4 ships in two tiers:

- **V4-Pro** — flagship; better at hard reasoning, costs more per token.
- **V4-Flash** — fast, cheap; great for simple tool calls and follow-ups.

Today users pick one at session start (`/model deepseek-v4-pro` or `/model deepseek-v4-flash`) and stay there. That's wasteful: easy turns (a one-line file read, a `git status`) burn Pro tokens unnecessarily; hard turns (multi-file refactor) on Flash get worse-quality output.

## Proposed solution

Add `auto` as a **first-class model option** alongside the explicit IDs. Same UX surface as today — users pick from the model menu — with a third entry:

```
/model deepseek-v4-pro      → always Pro
/model deepseek-v4-flash    → always Flash
/model auto                 → router picks per turn        ← new
```

The model picker (`/model` with no argument) lists `auto` first.

## Routing heuristic (when `auto` is selected)

The dispatcher classifies each turn before sending and picks a tier:

| Signal | Lean toward Flash | Lean toward Pro |
| --- | --- | --- |
| Recent message length | short | long |
| Tool result count in context | many simple results | few complex results |
| User prompt complexity (keyword density: "design", "architect", "refactor", "audit", "analyze") | absent | present |
| Last turn's reasoning effort | off / low | high / max |
| User's `/effort` setting | low / off | high / max |

Threshold-tuned at first; later, a small classifier trained on session telemetry.

User can override mid-session with `/model deepseek-v4-pro` (sticks until reset to `/model auto`).

## Footer

The footer model chip shows `auto·flash` or `auto·pro` so the user always knows which tier the current turn ran on.

## Telemetry

`/cost` extends with a per-tier breakdown when `auto` is in use:

```
this session: 12 turns Pro ($0.45), 28 turns Flash ($0.03), saved ~$0.20 vs Pro-only
```

## Why this matters

Cost is the #1 reason users churn off paid CLIs. Smart routing turns "I burned $5 today" into "I burned $0.80 today" without hurting quality on the turns that matter. That's a real differentiator vs Claude Code (no price-tier-down model) and vs Cursor (closed routing).

Folding it into `/model` (vs a separate `/route` command) keeps the UI surface flat — there's exactly one knob for "what model".

## Acceptance criteria

- [ ] `/model auto` registers as a valid model option in the picker.
- [ ] Footer chip shows `auto·flash` or `auto·pro` when auto is active.
- [ ] `/cost` per-tier breakdown when auto is in use.
- [ ] Heuristic threshold table documented; users can override via `~/.deepseek/config.toml`.
- [ ] At least one A/B run on a sample task showing Pro-only vs auto cost difference.
- [ ] Existing `/model deepseek-v4-pro` / `/model deepseek-v4-flash` continue to work (no regression).

## Related

- `crates/tui/src/commands/core.rs:model` — slash command.
- `crates/tui/src/llm_client.rs` — request dispatch.
- `crates/tui/src/config.rs:normalize_model_name` — already accepts model strings; `auto` becomes a recognized value.
- DeepSeek pricing: https://api-docs.deepseek.com/quick_start/pricing
- `/effort` already exists for thinking budget; this is the model-tier analogue.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: /model auto — heuristic V4-Flash vs V4-Pro routing per turn #315

Problem

Proposed solution

Routing heuristic (when `auto` is selected)

Footer

Telemetry

Why this matters

Acceptance criteria

Related

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Signal	Lean toward Flash	Lean toward Pro
Recent message length	short	long
Tool result count in context	many simple results	few complex results
User prompt complexity (keyword density: "design", "architect", "refactor", "audit", "analyze")	absent	present
Last turn's reasoning effort	off / low	high / max
User's `/effort` setting	low / off	high / max

feat: /model auto — heuristic V4-Flash vs V4-Pro routing per turn #315

Description

Problem

Proposed solution

Routing heuristic (when auto is selected)

Footer

Telemetry

Why this matters

Acceptance criteria

Related

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Routing heuristic (when `auto` is selected)