feat(orchestration,llm): AdaptOrch topology advisor + CoE entropy routing#3099
Merged
…router

AdaptOrch (#2434): A 16-arm Thompson Beta-bandit classifies goals into four TaskClass variants and samples a TopologyHint (Sequential/Parallel/Cascade/Adaptive) that is injected into the planner prompt. Outcomes are recorded synchronously; the arm table persists to disk on graceful shutdown. Enabled via the [orchestration.adaptorch] config block.

CoE (#2505): chat_with_extras() returns a per-call ChatExtras { entropy } for the OpenAI, Compatible, and Ollama providers via logprobs. The router applies an intra-entropy threshold and an inter-divergence of (1-cosine)/2 to escalate uncertain primary responses to a secondary provider. Gated to the Ema and Thompson routing strategies. Configured via the [llm.coe] block.
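The AdaptOrch arm selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in `adaptorch.rs`: the `Bandit`, `Arm`, `XorShift`, `sample_beta`, `sample_hint`, and `record` names are hypothetical, the 4 x 4 layout (four task classes times four hints) is inferred from the "16-arm" description, and the Beta sampler uses an order-statistic trick that only works for integer parameters so that the sketch needs no external crates.

```rust
// Hypothetical sketch of a 16-arm Thompson sampler (4 task classes x 4 hints).
const CLASSES: usize = 4;
const HINTS: usize = 4;

/// Tiny xorshift64* generator, used here only to avoid external crates.
struct XorShift(u64);

impl XorShift {
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D);
        // Map the top 53 bits to a float in [0, 1).
        (bits >> 11) as f64 / (1u64 << 53) as f64
    }
}

#[derive(Clone, Copy)]
struct Arm {
    successes: u32,
    failures: u32,
}

/// Samples Beta(a, b) for integers a, b >= 1: the a-th smallest of
/// a + b - 1 uniform draws follows that distribution.
/// Cost grows with the counts, so this is for illustration only.
fn sample_beta(rng: &mut XorShift, a: u32, b: u32) -> f64 {
    let mut draws: Vec<f64> = (0..a + b - 1).map(|_| rng.next_f64()).collect();
    draws.sort_by(|x, y| x.partial_cmp(y).expect("uniform draws are never NaN"));
    draws[(a - 1) as usize]
}

struct Bandit {
    arms: [[Arm; HINTS]; CLASSES],
    rng: XorShift,
}

impl Bandit {
    fn new(seed: u64) -> Self {
        Bandit {
            arms: [[Arm { successes: 0, failures: 0 }; HINTS]; CLASSES],
            rng: XorShift(seed.max(1)), // xorshift state must be non-zero
        }
    }

    /// Draws one Beta sample per hint for the given class and returns
    /// the index of the hint with the largest draw.
    fn sample_hint(&mut self, class: usize) -> usize {
        let mut best = (0, f64::MIN);
        for (hint, arm) in self.arms[class].iter().enumerate() {
            // Beta(successes + 1, failures + 1): uniform prior plus outcomes.
            let theta = sample_beta(&mut self.rng, arm.successes + 1, arm.failures + 1);
            if theta > best.1 {
                best = (hint, theta);
            }
        }
        best.0
    }

    /// Records the outcome of a plan that used `hint` for `class`.
    fn record(&mut self, class: usize, hint: usize, success: bool) {
        let arm = &mut self.arms[class][hint];
        if success {
            arm.successes += 1;
        } else {
            arm.failures += 1;
        }
    }
}
```

After enough recorded outcomes, an arm with many successes is sampled far more often than its siblings, which is the behaviour the G2 test ("reinforced arm is preferred") checks in the real code.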
H1: fix multibyte panic in adaptorch classify() — use chars().take(400)
H2: eliminate double LLM call in CoE — run_coe now accepts pre-obtained
primary text + ChatExtras; router/mod.rs switches to chat_with_extras()
H3: remove unimplemented max_secondary_calls_per_turn field from CoeConfig
and CoeConfig TOML (pre-v1.0 clean removal)
H4: fix plan cache bypassed for Hybrid hint — use_cache now checks
prompt_sentence().is_none() so Hybrid falls through to cache
S-M-2: replace inner unwrap() in BetaDist::sample with expect() + comment
P1: add 10s timeout on embed calls in inter_divergence()
P2: fix double lock in sample_arm — clone arms under lock, then acquire rng
G1: add async tests for recommend() — valid JSON path and timeout fallback
G2: add sample_arm test verifying reinforced arm is preferred
G5: add three async run_coe tests — keep_primary, secondary failure, intra
escalation
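To illustrate the H1 fix: slicing a `&str` by byte index panics when the index falls inside a multibyte character, whereas `chars().take(n)` counts whole characters. The sketch below assumes the original code truncated the goal by bytes; `truncate_chars` and `byte_prefix` are hypothetical helper names, not functions from the PR.

```rust
/// Returns at most `max_chars` characters of `goal`.
/// Iterating by `char` can never split a UTF-8 code point.
fn truncate_chars(goal: &str, max_chars: usize) -> String {
    goal.chars().take(max_chars).collect()
}

/// Non-panicking byte-based prefix: `str::get` returns `None` when
/// `max_bytes` is out of range or not on a character boundary,
/// where `&goal[..max_bytes]` would panic instead.
fn byte_prefix(goal: &str, max_bytes: usize) -> Option<&str> {
    goal.get(..max_bytes)
}
```

With a goal such as `"é".repeat(500)` (1000 bytes), `truncate_chars(&goal, 400)` yields 400 characters, while `byte_prefix("aé", 2)` returns `None` because byte 2 lies in the middle of `é`.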
…d tokio_test dependency
Summary
- AdaptOrch (`zeph-orchestration`): 16-arm Thompson Beta-bandit classifies goals into `TaskClass` variants and samples a `TopologyHint` (Sequential / Parallel / Cascade / Adaptive) injected into the planner prompt before planning. State persists to disk on graceful shutdown. Zero overhead when `[orchestration.adaptorch] enabled = false`.
- CoE (`zeph-llm/router`): `chat_with_extras()` returns `ChatExtras { entropy }` for providers with logprobs support (OpenAI, Compatible, Ollama). Router uses intra-entropy threshold and inter-divergence `(1-cosine)/2` to escalate uncertain responses to a configured secondary provider. Gated to `Ema`/`Thompson` strategies only. Zero overhead when `[llm.coe] enabled = false`.

Closes #2434, #2505.
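The two CoE signals can be sketched as small pure functions. This is an illustration under assumptions rather than the code in `coe.rs`: the `entropy` field is assumed to be an `Option<f64>`, the embeddings are assumed to be `f32` slices for the two responses being compared, and `intra_escalates`, `cosine`, and the threshold parameter are hypothetical names. How the router combines the two signals is not stated in the summary, so the sketch keeps them separate.

```rust
/// Assumed shape of the per-call extras; the real type may differ.
struct ChatExtras {
    entropy: Option<f64>,
}

/// Intra-entropy check: escalate when the provider reported an entropy
/// above the threshold. A missing entropy never triggers escalation here.
fn intra_escalates(extras: &ChatExtras, threshold: f64) -> bool {
    extras.entropy.map_or(false, |h| h > threshold)
}

/// Cosine similarity of two embeddings; `None` for empty, mismatched,
/// or zero-norm inputs.
fn cosine(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    // Clamp to absorb floating-point drift outside [-1, 1].
    Some((dot / (norm_a * norm_b)).clamp(-1.0, 1.0))
}

/// Inter-divergence (1 - cosine) / 2: 0 for identical directions,
/// 0.5 for orthogonal vectors, 1 for opposite directions.
fn inter_divergence(a: &[f32], b: &[f32]) -> Option<f32> {
    cosine(a, b).map(|c| (1.0 - c) / 2.0)
}
```

Dividing by two maps the cosine range [-1, 1] onto a divergence in [0, 1], which makes a single configurable threshold easier to reason about.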
Changes
- `crates/zeph-orchestration/src/adaptorch.rs`: `TopologyAdvisor`, `TaskClass`, `TopologyHint`, Thompson arm state
- `crates/zeph-llm/src/router/coe.rs`: `CoeRouter`, `ChatExtras`, `run_coe`, inter-divergence computation
- `crates/zeph-llm/src/provider.rs`: `chat_with_extras()` trait method + per-provider impls
- `crates/zeph-orchestration/src/plan.rs`: `plan_with_hint()` planner integration
- `src/runner.rs`: bootstrap wiring for both features
- `[orchestration.adaptorch]` and `[llm.coe]` blocks (both `enabled = false` by default)

Test plan
- `cargo nextest run --workspace --lib --bins`: 8122 tests pass
- `cargo +nightly fmt --check`: clean
- `cargo clippy --workspace -- -D warnings`: clean
- `cargo run --features full -- --config .local/config/testing.toml` and verify at least one multi-turn round-trip completes without 400/422 errors
- `.local/testing/playbooks/orchestration.md`: scenarios for AdaptOrch hint injection and CoE escalation
- `coverage-status.md` (status: Untested, pending live session)