feat(model): match-based model routing via model.routes#12234
Conversation
8426ea2 to
67f1f98
Compare
67f1f98 to
bb4a62a
Compare
|
Supersedes #12227 (per-source model selection) with a unified match-based router. |
2 similar comments
|
Supersedes #12227 (per-source model selection) with a unified match-based router. |
|
Supersedes #12227 (per-source model selection) with a unified match-based router. |
bb4a62a to
3017230
Compare
3017230 to
ff32d49
Compare
…er-platform + per-source drafts)
Unify two complementary-but-separate ideas — per-platform model overrides
and per-source-identity model overrides — into one match-based router.
``model.routes`` is a list of ``{match, model, provider, api_key, ...}``
entries; each ``match`` is a subset predicate against a context dict of
``{platform, source_kind, ...}`` built by the caller. First match wins;
no match leaves the base model untouched.
model:
default: my-fast-model
routes:
- match: { source_kind: owner } # owner everywhere
model: my-strong-model
api_key: sk-owner
- match: { platform: hub } # all hub peers
model: my-fast-model
- match: { source_kind: cron }
model: my-fast-model
- match: { platform: discord, source_kind: stranger } # compound
model: some-other
Backwards-compatible shorthand — both forms synthesize ``routes`` entries
internally:
model.platforms.<name>: # existing per-platform shorthand
model.by_source.<kind>: # per-source shorthand
Explicit ``routes`` always evaluate first; legacy shims run last, so an
explicit route always wins over an equivalent legacy entry.
Hook points:
- ``agent.smart_model_routing.apply_route`` — pure helper, normalizes
legacy shorthand, iterates routes, returns ``(model, runtime_kwargs)``.
- ``GatewayRuntime._classify_source_kind(source)`` — classifies
owner / hub_peer / stranger via home-channel match + platform-value
string. No dependency on a specific ``Platform.HUB`` enum member.
- ``GatewayRuntime._build_routing_context(source)`` — assembles the
context dict consumed by ``apply_route``.
- ``GatewayRuntime._resolve_session_agent_runtime`` calls ``apply_route``
as the final layer, below session ``/model`` overrides and above base
config. Silent fallback on exception — never blocks a turn.
- Cron scheduler applies with context ``{platform: cron, source_kind:
cron}`` after runtime resolution and before AIAgent construction.
- HermesCLI applies with context ``{platform: cli, source_kind: owner}``
in ``_resolve_turn_agent_config``.
This supersedes the earlier ``feat/model-routing`` branch that was cut
before upstream NousResearch#12732 wholesale-removed the separate ``smart_model_routing``
cheap-model router. This rewrite drops the obsolete ``resolve_turn_route`` /
``cheap_model`` integration and lands ``apply_route`` as a standalone
feature on a new ``agent/smart_model_routing.py`` file.
Tests: 14 new unit tests covering the matcher (empty context / no routes
/ platform match / source_kind match / compound match / first-match-wins
/ partial override / empty-value rejection / missing context key / null
config) plus legacy shims (platforms string / platforms dict /
by_source). All pass. No regressions in tests/gateway/, tests/cron/,
or tests/agent/ beyond 12 pre-existing upstream-main failures (dingtalk,
matrix, agent_cache) unrelated to this change.
Supersedes NousResearch#7297 (feat/per-platform-model) and the prior draft at NousResearch#12227
(feat/per-source-model). ``model.platforms.*`` configs keep working via
the legacy shim, so deployments on NousResearch#7297 need no migration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ff32d49 to
eb52caa
Compare
…g), drop NousResearch#7297, add NousResearch#12207 - Add rows for NousResearch#12234 (match-based model routing, supersedes NousResearch#7297 and draft NousResearch#12227) and NousResearch#12207 (compound-background-subshell-leak). - Move NousResearch#7297 into a new 'Closed / superseded' section; note the branches (feat/per-platform-model and feat/per-source-model) are already deleted from origin. - Update rebase workflow: swap the feat/per-platform-model line for feat/model-routing, add fix/compound-background-subshell-leak. - Update the fork-main rebuild section: document that octopus strategy fails on adjacent-region additions and switch the documented command to a sequential-merge loop. Note the recurring conflict site (_classify_source_kind vs _is_owner_source) and the union-resolve strategy for it. - Add PR-specific note for NousResearch#12234 covering the helper, classifier, legacy shim, and rebase-conflict guidance.
|
Closing as stale — the code path this addresses no longer exists on Triage notes (high confidence): If this PR's intent is still relevant against the current code, please rebase or open a fresh PR. (Bulk-closed during a CLI PR triage sweep.) |
Summary
A single match-based router for per-turn model/provider selection based on a caller-supplied context. Replaces two previously-separate ideas (per-platform and per-source-identity overrides) with one generic hook.
Example config:
First match wins. No match → base
model:block used. Partial overrides are supported (missing fields inherit base). Empty fields ("",None,[]) do not overwrite — explicit empty is treated as "no opinion" so base values survive.Backwards compatibility
Two shorthand forms are honored via an internal normalizer that synthesizes
routesentries at match time:model.platforms.<name>: "X"{match: {platform: <name>}, model: "X"}model.platforms.<name>: {model, base_url, api_key, ...}{match: {platform: <name>}, ...}model.by_source.<kind>: {model, ...}{match: {source_kind: <kind>}, ...}Explicit
routesalways evaluate first, so any existingmodel.platforms.*config keeps working unchanged; a newroutes:entry can be added alongside and will take precedence.Why unified
Routing by platform and routing by source identity are orthogonal — same function, different predicates. Shipping them as separate hooks (each with its own config key and function call) means
_resolve_session_agent_runtimegrows a new layer every time a new predicate is added, and the config surface grows with it. One matcher with a structured predicate is cleaner, and lets users compose (platform + source_kind) without any code change.Why
source_kindmatters on top ofplatformPer-platform alone can't distinguish the operator from a random stranger on Telegram; per-source alone can't distinguish Hub from Telegram. Both axes together cover the common needs:
How it works
agent.smart_model_routing.apply_route(model, runtime_kwargs, model_config, context)— pure helper. Normalizes legacy shorthand, iterates routes, returns(model, runtime_kwargs).GatewayRuntime._classify_source_kind(source)— classifiesowner(home-channel DM match),hub_peer(platform value\"hub\"— forward-compatible with any enum definition),stranger(everything else).GatewayRuntime._build_routing_context(source)— assembles{platform, source_kind}._resolve_session_agent_runtimecallsapply_routeas the final layer, below session/modeloverrides and above base config. Silent fallback on exception — never blocks a turn.apply_routewith their own hardcoded context ({platform: cron, source_kind: cron}/{platform: cli, source_kind: owner}) beforeresolve_turn_route.The existing message-text-based router (
resolve_turn_route/smart_model_routing.cheap_model) is untouched — it's a peer hook on a different axis. AIAgent construction signature, session keys, agent caching, and sub-agent inheritance are all unchanged.Tests
14 new unit tests for
apply_route:platformsshim (string shorthand + dict)by_sourceshimAll 6 existing
smart_model_routingtests still pass. 20/20 green.Test plan
pytest tests/agent/test_smart_model_routing.py)routesentry swaps both model and base_url in the outbound API request; default model path unchanged for unmatched auxiliary calls.🤖 Generated with Claude Code