Source
arXiv:2508.21141 — "Adaptive LLM Routing under Budget Constraints" (EMNLP 2025)
Summary
Frames LLM routing as a contextual bandit problem, using a shared query-model embedding space and the LinUCB-based PILOT algorithm. The router adapts online to observed quality feedback without requiring pre-labelled model-query pairs. The paper also includes a multi-choice knapsack cost policy for per-request token budget enforcement.
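To make the bandit framing concrete, here is a minimal sketch of generic disjoint LinUCB with one linear model per candidate LLM. It is not the paper's PILOT algorithm and omits the shared embedding space. The names `Arm`, `select`, `alpha`, and the context vector `x` (standing in for a query embedding) are assumptions made for illustration.

```rust
/// One arm (candidate model) of a disjoint LinUCB bandit.
/// Stores A^{-1} directly and maintains it with the Sherman-Morrison formula,
/// so no matrix inversion is needed at selection time.
struct Arm {
    a_inv: Vec<Vec<f64>>, // inverse of A = I + sum(x x^T)
    b: Vec<f64>,          // sum(reward * x)
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(p, q)| p * q).sum()
}

impl Arm {
    fn new(dim: usize) -> Self {
        let mut a_inv = vec![vec![0.0; dim]; dim];
        for (i, row) in a_inv.iter_mut().enumerate() {
            row[i] = 1.0; // A starts as the identity matrix
        }
        Arm { a_inv, b: vec![0.0; dim] }
    }

    fn mat_vec(&self, x: &[f64]) -> Vec<f64> {
        self.a_inv.iter().map(|row| dot(row, x)).collect()
    }

    /// Upper confidence bound: theta^T x + alpha * sqrt(x^T A^{-1} x),
    /// where theta = A^{-1} b is the ridge-regression estimate.
    fn score(&self, x: &[f64], alpha: f64) -> f64 {
        let theta = self.mat_vec(&self.b);
        let a_inv_x = self.mat_vec(x);
        dot(&theta, x) + alpha * dot(x, &a_inv_x).max(0.0).sqrt()
    }

    /// Rank-one update after observing `reward` for context `x`.
    fn update(&mut self, x: &[f64], reward: f64) {
        let u = self.mat_vec(x); // A^{-1} x (A^{-1} is symmetric)
        let denom = 1.0 + dot(x, &u);
        for i in 0..u.len() {
            for j in 0..u.len() {
                self.a_inv[i][j] -= u[i] * u[j] / denom;
            }
        }
        for (bi, xi) in self.b.iter_mut().zip(x) {
            *bi += reward * xi;
        }
    }
}

/// Index of the arm with the highest UCB score (first arm wins ties).
fn select(arms: &[Arm], x: &[f64], alpha: f64) -> usize {
    let mut best = 0;
    let mut best_score = f64::NEG_INFINITY;
    for (i, arm) in arms.iter().enumerate() {
        let s = arm.score(x, alpha);
        if s > best_score {
            best = i;
            best_score = s;
        }
    }
    best
}
```

A caller would invoke `select` once per request and then `update` on the chosen arm once a quality signal arrives. A larger `alpha` favours exploring models that have been tried less often.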
Applicability to Zeph
HIGH — zeph-llm router (triage/thompson/cascade strategies).
Current Zeph routing is static (triage = rule-based) or heuristic (thompson = reputation sampling). PILOT's online bandit formulation would:
- Adapt routing based on actual quality outcomes observed during use
- Handle budget constraints natively (maps to [cost] max_daily_cents)
- Require no retraining, since it updates after each inference call
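As a rough illustration of the budget point, the sketch below picks the highest-scoring candidate whose estimated cost still fits within the remaining daily budget. This is a simple greedy filter, not the paper's multi-choice knapsack policy. `Candidate`, `pick_within_budget`, and `est_cost_cents` are hypothetical names, and only `max_daily_cents` comes from the existing config.

```rust
/// A routing candidate with a bandit score and an estimated request cost.
/// All fields are hypothetical and exist only for this sketch.
struct Candidate {
    name: &'static str,
    score: f64,
    est_cost_cents: f64,
}

/// Returns the best-scoring candidate that fits the remaining daily budget,
/// or `None` when every candidate would exceed it.
fn pick_within_budget(
    cands: &[Candidate],
    spent_cents: f64,
    max_daily_cents: f64,
) -> Option<&Candidate> {
    let remaining = max_daily_cents - spent_cents;
    cands
        .iter()
        .filter(|c| c.est_cost_cents <= remaining)
        .max_by(|a, b| a.score.total_cmp(&b.score))
}
```

Returning `None` leaves the fallback decision (reject, queue, or use a free local model) to the caller, which keeps the budget check separate from the routing policy.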
Implementation Direction
- New LlmRoutingStrategy::Bandit variant alongside existing Thompson, Triage, Cascade
- Shared query-model embedding for provider selection
- Online updates via [llm.router.reputation] infrastructure (already exists)
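One possible shape for the new variant is sketched below. The actual definition of LlmRoutingStrategy in zeph-llm is not shown in this issue, so the enum layout, the `alpha` field, and the `parse_strategy` helper are all assumptions. Only the variant names are taken from the text above.

```rust
/// Hypothetical mirror of the router's strategy enum, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum LlmRoutingStrategy {
    Thompson,
    Triage,
    Cascade,
    /// Proposed variant; `alpha` (exploration weight) is an assumed field.
    Bandit { alpha: f64 },
}

/// Maps a lowercase strategy name to a variant.
/// `alpha` is only used when the name is "bandit".
fn parse_strategy(name: &str, alpha: f64) -> Option<LlmRoutingStrategy> {
    match name {
        "thompson" => Some(LlmRoutingStrategy::Thompson),
        "triage" => Some(LlmRoutingStrategy::Triage),
        "cascade" => Some(LlmRoutingStrategy::Cascade),
        "bandit" => Some(LlmRoutingStrategy::Bandit { alpha }),
        _ => None,
    }
}
```

Keeping the bandit's tunables inside the variant would let the existing strategies stay untouched, though the real code may prefer a separate config struct.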
Priority: P2 — high-impact improvement to routing quality and cost efficiency
Discovered: CI-211 research scan (2026-03-27)