This was referenced Apr 6, 2026
…ing (#1841, #2165)

Add two research-driven routing improvements to zeph-llm:

1. Agent Stability Index (ASI): per-provider sliding window of response embeddings. Coherence score (cosine similarity vs window mean) penalizes Thompson beta and EMA score when responses drift. Fires tracing::warn on ASI drift below threshold. Config: [llm.routing.asi] enabled/window/coherence_threshold/penalty_weight.
2. Quality-gate cascading: after Thompson/EMA selection, optionally compute cosine similarity between query and response embeddings. Below threshold, escalate to the next provider; on pool exhaustion, return the best response seen. Fail-open on embed errors. Config: [llm.routing] quality_gate = 0.75 (optional).

Bounds validation added for quality_gate (0.0, 1.0] and ASI params, with a warn log on invalid values. Quality fallback logs at info level (not warn).
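The quality-gate cascade described in item 2 can be sketched roughly as follows. This is an illustration only, not the code in `router/mod.rs`: the `Candidate` struct and `quality_gate_cascade` function are hypothetical names, and the candidates are passed in precomputed, whereas a real router would presumably call the next provider only after the previous response fails the gate.

```rust
/// Cosine similarity between two equal-length vectors; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical candidate: one provider's response plus its embedding.
/// `embedding` is `None` when the embed call failed.
struct Candidate {
    provider: &'static str,
    response: &'static str,
    embedding: Option<Vec<f32>>,
}

/// Walk candidates in router order and return the first one whose
/// query/response similarity reaches `gate`. If none does, return the
/// best-scoring candidate seen. An embed failure accepts the candidate
/// as-is (fail-open).
fn quality_gate_cascade<'a>(
    query: &[f32],
    candidates: &'a [Candidate],
    gate: f32,
) -> Option<&'a Candidate> {
    let mut best: Option<(&Candidate, f32)> = None;
    for candidate in candidates {
        let Some(embedding) = candidate.embedding.as_ref() else {
            // Fail-open: without an embedding the gate cannot be evaluated.
            return Some(candidate);
        };
        let score = cosine_similarity(query, embedding);
        if score >= gate {
            return Some(candidate);
        }
        // Below the gate: remember the best response so far and escalate.
        if best.map_or(true, |(_, best_score)| score > best_score) {
            best = Some((candidate, score));
        }
    }
    // Pool exhausted: fall back to the best response seen (None if the pool was empty).
    best.map(|(candidate, _)| candidate)
}
```

With `gate = 0.75`, a first response orthogonal to the query is skipped and a closely aligned second response is returned; if every response scores below the gate, the highest-scoring one is returned.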
1b4cecd to 1feb267
Summary
- Agent Stability Index (`crates/zeph-llm/src/router/asi.rs`, new): per-provider sliding window of response embeddings; coherence score (cosine similarity vs window mean) penalizes Thompson beta and EMA score when responses drift; emits `tracing::warn!` on ASI drift below configurable threshold
- Quality-gate cascading (`router/mod.rs`): after Thompson/EMA selection, optionally compute cosine similarity between query and response embeddings; below threshold escalates to next provider in pool; on exhaustion returns best response seen; fail-open on embed errors
- Bounds validation: `quality_gate` (must be in `(0.0, 1.0]`) and ASI `coherence_threshold`/`penalty_weight` (clamped with warn log)

Config
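The bounds rules for these config values can be sketched as below. This is a minimal illustration under stated assumptions, not the PR's actual code: the function names are hypothetical, an out-of-range `quality_gate` is assumed to disable the gate, the clamp range for the ASI parameters is assumed to be `[0.0, 1.0]`, and `eprintln!` stands in for the `tracing::warn!` call so the block has no dependencies.

```rust
/// Validate an optional `quality_gate` value against the half-open range (0.0, 1.0].
/// Assumption: an out-of-range (or NaN) value disables the gate rather than aborting startup.
fn validate_quality_gate(raw: Option<f32>) -> Option<f32> {
    match raw {
        Some(value) if value > 0.0 && value <= 1.0 => Some(value),
        Some(value) => {
            // Stand-in for the warn-level log mentioned in the PR.
            eprintln!("warn: quality_gate={value} is outside (0.0, 1.0]; gate disabled");
            None
        }
        None => None,
    }
}

/// Clamp an ASI parameter such as `coherence_threshold` or `penalty_weight`.
/// Assumption: the valid range is [0.0, 1.0]; the PR only says the values are clamped.
fn clamp_asi_param(name: &str, value: f32) -> f32 {
    let clamped = value.clamp(0.0, 1.0);
    if clamped != value {
        // Stand-in for the warn-level log mentioned in the PR.
        eprintln!("warn: {name}={value} clamped to {clamped}");
    }
    clamped
}
```

Note that `0.0` is rejected for `quality_gate` because the lower bound is exclusive, while `1.0` is accepted.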
Test plan
- `cargo nextest run --workspace --lib --bins`: 7690/7690 pass
- `cargo clippy --workspace -- -D warnings`: clean
- `cargo +nightly fmt --check`: clean
- `quality_gate` enabled: verify escalation logged at `info`
- `window=3`: verify drift detection fires

Closes #1841, #2165
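For the `window=3` drift check in the test plan, the ASI coherence computation can be sketched as follows. This is an assumption-laden illustration rather than the contents of `asi.rs`: `AsiWindow` and `drift_penalty` are hypothetical names, the new response is assumed to be scored against the mean of the embeddings already in the window, and the linear penalty formula is invented for the example, since the PR does not state how `penalty_weight` is applied to the Thompson beta or EMA score.

```rust
use std::collections::VecDeque;

/// Cosine similarity between two equal-length vectors; 0.0 if either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical per-provider sliding window of response embeddings.
struct AsiWindow {
    capacity: usize,
    embeddings: VecDeque<Vec<f32>>,
}

impl AsiWindow {
    /// Create a window holding at most `capacity` embeddings (minimum 1).
    fn new(capacity: usize) -> Self {
        let capacity = capacity.max(1);
        Self { capacity, embeddings: VecDeque::with_capacity(capacity) }
    }

    /// Record a response embedding, evicting the oldest one when the window is full.
    fn push(&mut self, embedding: Vec<f32>) {
        if self.embeddings.len() == self.capacity {
            self.embeddings.pop_front();
        }
        self.embeddings.push_back(embedding);
    }

    /// Cosine similarity of `embedding` against the element-wise mean of the window.
    /// Returns `None` while the window is empty.
    fn coherence(&self, embedding: &[f32]) -> Option<f32> {
        if self.embeddings.is_empty() {
            return None;
        }
        let mut mean = vec![0.0_f32; embedding.len()];
        for stored in &self.embeddings {
            for (m, x) in mean.iter_mut().zip(stored) {
                *m += x;
            }
        }
        let count = self.embeddings.len() as f32;
        for m in &mut mean {
            *m /= count;
        }
        Some(cosine_similarity(embedding, &mean))
    }
}

/// Illustrative penalty (an assumption): grows linearly with the shortfall
/// below `threshold`, scaled by `penalty_weight`; zero when coherence is at
/// or above the threshold.
fn drift_penalty(coherence: f32, threshold: f32, penalty_weight: f32) -> f32 {
    if coherence < threshold {
        penalty_weight * (threshold - coherence)
    } else {
        0.0
    }
}
```

With a window of three identical embeddings, a response pointing in the same direction scores a coherence near 1.0 and incurs no penalty, while an orthogonal response scores 0.0 and is penalized.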