Prompt caching is broken for orchestrator calls, resulting in repeatedly high TTFT #1591

## Summary

- What broke: Prompt caching does not work for orchestrator agent calls. The system prompt and workflow context sections are embedded in the `prompt` string and change between turns, which invalidates the cache prefix on every message and forces the context to be rebuilt on every turn, even when using `resumeSessionId`.
- When it started (if known): Unknown; it has likely always behaved this way.
- Severity: `major` — functional correctness is unaffected; performance degrades significantly from repeated cache misses (higher latency and token cost).

## Steps to Reproduce

1. Start Archon server with at least one registered codebase and one workflow
2. Send a message to a conversation (e.g., "hello")
3. Check Anthropic/llama.cpp usage stats for this first turn and note the cache counters as a baseline
4. Send another message in the same conversation (session is resumed via `resumeSessionId`)
5. Check usage stats again — observe `cache_creation_input_tokens` is high on every turn, `cache_read_input_tokens` is 0 or minimal
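To make steps 3 and 5 comparable across turns, the two counters can be reduced to a single ratio. The sketch below is a minimal helper, not part of Archon: the `CacheUsage` interface and `cacheReadRatio` function are hypothetical names, and it assumes the Anthropic-style convention that `input_tokens` counts only the uncached portion of the prompt.

```typescript
// Hypothetical shape holding the usage fields referenced in the steps above.
interface CacheUsage {
  input_tokens: number; // uncached prompt tokens (assumed convention)
  cache_creation_input_tokens: number; // tokens written to the cache this turn
  cache_read_input_tokens: number; // tokens served from the cache this turn
}

// Fraction of all prompt tokens that were served from cache (0 if no input).
function cacheReadRatio(usage: CacheUsage): number {
  const total =
    usage.input_tokens +
    usage.cache_creation_input_tokens +
    usage.cache_read_input_tokens;
  return total === 0 ? 0 : usage.cache_read_input_tokens / total;
}

// A turn that looks like the reported behavior: everything is re-created.
const missTurn: CacheUsage = {
  input_tokens: 100,
  cache_creation_input_tokens: 900,
  cache_read_input_tokens: 0,
};

// A turn that looks like the expected behavior: most tokens are read from cache.
const hitTurn: CacheUsage = {
  input_tokens: 10,
  cache_creation_input_tokens: 0,
  cache_read_input_tokens: 90,
};

console.log(cacheReadRatio(missTurn), cacheReadRatio(hitTurn));
```

A ratio that stays near zero on the second and later turns of a resumed session matches the behavior described in this issue.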

## Expected vs Actual

- **Expected**: The stable prefix of the prompt (system preset + routing rules + project list) is cached on the first turn. Subsequent turns within the same session reuse that cache — only the new user message portion is processed, with `cache_read_input_tokens >> cache_creation_input_tokens`.
- **Actual**: Every turn produces a fresh prompt string with different content (conditional workflow context, dynamic project list, variable message content before the cache breakpoint), causing full context rebuild on every turn.
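One way to picture the expected behavior is to build the stable parts once, in a deterministic order, and append everything that varies per turn after them. The following is only an illustrative sketch: `PromptParts`, `buildStablePrefix`, and `buildTurnPrompt` are hypothetical names and do not reflect how `buildFullPrompt()` is actually structured.

```typescript
// Hypothetical inputs for the stable part of the prompt.
interface PromptParts {
  systemPreset: string;
  routingRules: string;
  projects: string[];
}

// Builds a prefix that is byte-identical across turns for the same inputs.
// Sorting a copy of the project list removes ordering differences
// (for example, from database query order).
function buildStablePrefix(parts: PromptParts): string {
  const projectList = [...parts.projects].sort().join("\n");
  return [parts.systemPreset, parts.routingRules, projectList].join("\n\n");
}

// Appends per-turn content after the stable prefix. Empty sections are
// dropped, but only in the tail, so they cannot shift the prefix.
function buildTurnPrompt(
  prefix: string,
  volatileSections: string[],
  userMessage: string,
): string {
  const tail = volatileSections.filter((section) => section.length > 0);
  return [prefix, ...tail, userMessage].join("\n\n");
}

const prefix = buildStablePrefix({
  systemPreset: "You are the orchestrator.",
  routingRules: "Route requests to the matching workflow.",
  projects: ["web-app", "api-server"],
});

const turn1 = buildTurnPrompt(prefix, [""], "hello");
const turn2 = buildTurnPrompt(prefix, ["Recent workflow result: ok"], "status?");
console.log(turn1.startsWith(prefix), turn2.startsWith(prefix));
```

With this layout, conditional sections such as the workflow context can appear or disappear without changing any byte of the prefix that precedes them.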

## User Flow

```
User                   Archon                   AI Client (Claude/llama.cpp)
────                   ──────                   ────────────────────────────
sends message ────────▶ buildFullPrompt()
                        ├─ loads codebases from DB
                        ├─ discovers workflows from disk
                        ├─ fetches recent workflow results
                        ├─ assembles single prompt string
                        │   [X] system prompt varies by codebase count
                        │   [X] workflowContextSuffix appears/disappears
                        │   [X] threadContext changes every turn
                        │   [X] issueContext/fileSuffix conditional
                        │
                        sends prompt + resumeSessionId
                                              ──▶ KV cache: partial or no hit
                                              ──▶ full context rebuild
                                              ──▶ full inference
                         streams response ◀─────
sees response ◀────────
```
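To find which of the marked sections breaks the prefix first, two consecutive prompt strings can be compared directly. This is a minimal diagnostic sketch with a hypothetical helper name, `commonPrefixLength`; it compares UTF-16 code units rather than tokens, so it only approximates where the KV cache would diverge.

```typescript
// Returns the number of leading UTF-16 code units shared by two strings.
function commonPrefixLength(a: string, b: string): number {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a.charCodeAt(i) === b.charCodeAt(i)) {
    i++;
  }
  return i;
}

// Illustrative prompts: the second turn inserts a section before the message.
const previousPrompt = "SYSTEM\nprojects: a, b\nUSER: hello";
const currentPrompt = "SYSTEM\nprojects: a, b\nworkflow context: ...\nUSER: hi";

const shared = commonPrefixLength(previousPrompt, currentPrompt);
console.log(`shared prefix: ${shared} of ${currentPrompt.length} characters`);
console.log(`diverges at: ${JSON.stringify(currentPrompt.slice(shared, shared + 20))}`);
```

Logging the divergence point for each turn would show whether the mismatch starts in the system section, the workflow context, or the thread context.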

## Environment

- Platform: All (Slack / Telegram / GitHub / Discord / Web / CLI) — affects orchestrator path only
- Database: SQLite / PostgreSQL (both affected)
- Running in worktree? Yes / No (both affected)
- OS: All

## Logs

```
llama.cpp KV cache behavior showing full context rebuild on every turn:

2026-05-05 15:21:36.138 [Info] slot update_slots: task 14335 | new prompt, n_ctx_slot = 262144, n_keep = 8192, task.n_tokens = 52755
2026-05-05 15:21:36.138 [Info] slot update_slots: task 14335 | n_past = 17792, slot.prompt.tokens.size() = 46384
2026-05-05 15:21:36.139 [Info] slot update_slots: task 14335 | Checking checkpoint with [46333, 46333] against 17792...
2026-05-05 15:21:36.139 [Info] slot update_slots: task 14335 | Checking checkpoint with [45821, 45821] against 17792...
2026-05-05 15:21:36.139 [Info] slot update_slots: task 14335 | Checking checkpoint with [40959, 40959] against 17792...
2026-05-05 15:21:36.139 [Info] slot update_slots: task 14335 | Checking checkpoint with [32767, 32767] against 17792...
2026-05-05 15:21:36.139 [Info] slot update_slots: task 14335 | Checking checkpoint with [24575, 24575] against 17792...
2026-05-05 15:21:36.139 [Info] slot update_slots: task 14335 | Checking checkpoint with [16383, 16383] against 17792...
2026-05-05 15:21:36.149 [Info] slot update_slots: task 14335 | restored context checkpoint (pos_min = 16383, pos_max = 16383, n_tokens = 16384, n_past = 16384, size = 62.813 MiB)
2026-05-05 15:21:36.151 [Info] slot update_slots: task 14335 | erased invalidated context checkpoint (pos_min = 24575, ...)
2026-05-05 15:21:36.153 [Info] slot update_slots: task 14335 | erased invalidated context checkpoint (pos_min = 32767, ...)
2026-05-05 15:21:36.154 [Info] slot update_slots: task 14335 | erased invalidated context checkpoint (pos_min = 40959, ...)
2026-05-05 15:21:36.155 [Info] slot update_slots: task 14335 | erased invalidated context checkpoint (pos_min = 45821, ...)
2026-05-05 15:21:36.156 [Info] slot update_slots: task 14335 | erased invalidated context checkpoint (pos_min = 46333, ...)
2026-05-05 15:21:36.157 [Info] slot update_slots: task 14335 | n_tokens = 16384, memory_seq_rm [16384, end)

Breakdown:
  - task.n_tokens = 52,755  → total size of the new prompt
  - slot.prompt.tokens.size() = 46,384  → tokens held in the slot's KV cache from the previous turn
  - n_past = 17,792         → length of the prefix shared by the new prompt and the cached tokens
  - Only the checkpoint at [16383, 16383] is usable (16383 < 17792)
  - All larger checkpoints are erased → 36,371 tokens (52,755 - 16,384) are reprocessed
  - Cache reuse: ~35% of the cached tokens (16,384 / 46,384), ~31% of the new prompt (16,384 / 52,755)
```
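The arithmetic implied by the log lines can be checked with a few subtractions. The sketch below takes `task.n_tokens`, `n_past`, and the restored checkpoint's `n_tokens` directly from the log; the function name `summarizeReuse` and its field names are hypothetical, and the reading of `n_past` as the matched prefix length is an assumption about llama.cpp's logging.

```typescript
// Summarizes how much of a prompt is reused after a checkpoint restore.
function summarizeReuse(taskTokens: number, nPast: number, restored: number) {
  return {
    // Tokens that must be processed again after the restore point.
    reprocessed: taskTokens - restored,
    // Tokens that matched the previous prompt but sit past the usable checkpoint.
    matchedButDiscarded: nPast - restored,
    // Share of the new prompt that is served from the restored state.
    reusedFraction: restored / taskTokens,
  };
}

// Values from task 14335 in the log above.
const summary = summarizeReuse(52755, 17792, 16384);
console.log(summary.reprocessed); // 36371
console.log(summary.matchedButDiscarded); // 1408
console.log(summary.reusedFraction.toFixed(3)); // "0.311"
```

Under these assumptions, roughly two thirds of the prompt is recomputed on this turn even though the session is resumed.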

## Impact

- Affected workflows/commands: Orchestrator agent path (`handleMessage` → `aiClient.sendQuery`) — all AI-powered conversations. CLI workflow DAG nodes are not affected (each node gets a fresh session).
- Reproduction rate: Always
- Workaround available? No — users pay full token cost on every turn with no cache reuse.
- Data loss risk? No

## Scope

- Package(s) likely involved: `core`
- Module: `orchestrator:prompt-builder`, `orchestrator:orchestrator-agent`
