research(llm): utility-guided orchestration — multi-signal scoring for tool/routing decisions (arXiv:2603.19896) #2424

## Source

arXiv:2603.19896 — *Utility-Guided Agent Orchestration for Efficient LLM Tool Use* (2026-03-20)

## Technique

An orchestration policy that scores each candidate action (respond, retrieve, tool call, verify, stop) using a utility function weighting:
- **Estimated gain** — expected quality improvement from the action
- **Step cost** — token/latency cost
- **Uncertainty** — how unsure the model is about the current state (the inverse of its confidence)
- **Redundancy** — overlap with already-retrieved context

The policy selects the action with the highest utility rather than letting the LLM freely choose.
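
The selection rule above can be sketched in Rust as follows. This is a minimal illustration, not the paper's formulation: the linear form, the `Signals` and `Weights` types, the `[0, 1]` normalization, and the sign given to each term are all assumptions made for the example.

```rust
/// Candidate actions listed in the issue text.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Respond,
    Retrieve,
    ToolCall,
    Verify,
    Stop,
}

/// Per-action signal estimates (hypothetical type), assumed normalized to [0, 1].
#[derive(Debug, Clone, Copy)]
pub struct Signals {
    pub gain: f64,
    pub cost: f64,
    pub uncertainty: f64,
    pub redundancy: f64,
}

/// Hypothetical weights. The uncertainty weight may be negative if
/// uncertainty should count against an action instead of for it.
#[derive(Debug, Clone, Copy)]
pub struct Weights {
    pub gain: f64,
    pub cost: f64,
    pub uncertainty: f64,
    pub redundancy: f64,
}

/// Assumed linear utility: gain and uncertainty add, cost and redundancy subtract.
pub fn utility(s: &Signals, w: &Weights) -> f64 {
    w.gain * s.gain - w.cost * s.cost + w.uncertainty * s.uncertainty
        - w.redundancy * s.redundancy
}

/// Returns the candidate with the highest utility, or `None` if there are none.
pub fn select_action(candidates: &[(Action, Signals)], w: &Weights) -> Option<Action> {
    candidates
        .iter()
        .map(|(a, s)| (*a, utility(s, w)))
        .max_by(|x, y| x.1.total_cmp(&y.1))
        .map(|(a, _)| a)
}
```

Under these assumptions, a retrieval whose content largely overlaps the existing context scores low even when its raw gain estimate is high, which is the behavior the redundancy term is meant to capture.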

## Applicability to Zeph

**Bandit routing layer (#2415 BaRP)**: The utility formulation naturally extends the BaRP cost-weight dial. Instead of a single `cost_weight` scalar, the bandit reward signal could combine all four components: gain (quality), cost, uncertainty, and redundancy. This would give LinUCB a richer reward than a cost/quality tradeoff alone.
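
A rough sketch of what a four-component reward could look like when fed into a LinUCB arm update. The `Outcome`, `RewardWeights`, and `ArmStats` types, the feature dimension `D`, and the clamp to `[0, 1]` are hypothetical and do not reflect the existing BaRP code; only the update rule (`A += x xᵀ`, `b += r x`) is the standard disjoint LinUCB update. Uncertainty is treated here as residual uncertainty after the action and therefore as a penalty, which is also an assumption.

```rust
/// Hypothetical context feature dimension.
pub const D: usize = 3;

/// Observed outcome of a routed call (hypothetical type), fields assumed in [0, 1].
pub struct Outcome {
    pub quality: f64,
    pub cost: f64,
    pub residual_uncertainty: f64,
    pub redundancy: f64,
}

/// Hypothetical weights; setting `uncertainty` and `redundancy` to zero
/// recovers a plain quality-minus-weighted-cost reward.
pub struct RewardWeights {
    pub gain: f64,
    pub cost: f64,
    pub uncertainty: f64,
    pub redundancy: f64,
}

/// Combines the four signals into one scalar reward, clamped to [0, 1]
/// so that rewards stay bounded.
pub fn composite_reward(o: &Outcome, w: &RewardWeights) -> f64 {
    let r = w.gain * o.quality
        - w.cost * o.cost
        - w.uncertainty * o.residual_uncertainty
        - w.redundancy * o.redundancy;
    r.clamp(0.0, 1.0)
}

/// Per-arm sufficient statistics for disjoint LinUCB.
pub struct ArmStats {
    pub a: [[f64; D]; D],
    pub b: [f64; D],
}

impl ArmStats {
    /// Starts from A = I and b = 0.
    pub fn new() -> Self {
        let mut a = [[0.0; D]; D];
        for i in 0..D {
            a[i][i] = 1.0;
        }
        ArmStats { a, b: [0.0; D] }
    }

    /// Standard update: A += x xᵀ, b += reward * x.
    pub fn update(&mut self, x: &[f64; D], reward: f64) {
        for i in 0..D {
            for j in 0..D {
                self.a[i][j] += x[i] * x[j];
            }
            self.b[i] += reward * x[i];
        }
    }
}
```

Keeping the reward a single scalar means the bandit machinery itself would not need to change; only the reward computation does.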

**ToolExecutor step sequencing**: The `summarize_output` flag and the tool overflow threshold are ad hoc; utility scoring could replace them with a principled "is this tool call worth running?" gate.

**`context_strategy = "adaptive"`**: The adaptive context strategy already attempts a similar multi-signal tradeoff — this paper provides formal grounding.

## Implementation sketch

- Add utility scoring as an optional gate in `zeph-tools/src/composite.rs` before dispatching a tool
- Feed utility signal as an additional feature to the LinUCB bandit reward model
- Config: `[tools] utility_scoring = true`, `utility_gain_weight`, `utility_cost_weight`
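
The optional gate from the list above might look roughly like this. The three config field names mirror the keys proposed in this issue; everything else (`UtilityConfig`, `GateDecision`, `gate_tool_call`, and the `min_utility` threshold) is hypothetical and not part of the existing `zeph-tools` API.

```rust
/// Mirrors the proposed `[tools]` keys. `min_utility` is a hypothetical
/// threshold that the issue does not specify.
pub struct UtilityConfig {
    pub utility_scoring: bool,
    pub utility_gain_weight: f64,
    pub utility_cost_weight: f64,
    pub min_utility: f64,
}

/// Outcome of the gate: run the tool, or skip it and report the score.
#[derive(Debug, PartialEq)]
pub enum GateDecision {
    Dispatch,
    Skip { utility: f64 },
}

/// Decides whether a tool call is worth running, given estimated gain and
/// cost (both assumed normalized to [0, 1]). With scoring disabled the gate
/// is a no-op and always dispatches, preserving current behavior.
pub fn gate_tool_call(cfg: &UtilityConfig, est_gain: f64, est_cost: f64) -> GateDecision {
    if !cfg.utility_scoring {
        return GateDecision::Dispatch;
    }
    let utility = cfg.utility_gain_weight * est_gain - cfg.utility_cost_weight * est_cost;
    if utility >= cfg.min_utility {
        GateDecision::Dispatch
    } else {
        GateDecision::Skip { utility }
    }
}
```

Returning the score on `Skip` would let the caller log why a call was suppressed, which helps when tuning the weights.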

## Related

- #2415 — BaRP cost-weight bandit routing (direct extension)
- #2409 — PASTE speculative tool execution (latency angle)
