feat(model): match-based model routing via model.routes#12234

Closed
handsdiff wants to merge 1 commit into
NousResearch:mainfrom
handsdiff:feat/model-routing
@handsdiff

Contributor

Summary

A single match-based router for per-turn model/provider selection based on a caller-supplied context. Replaces two previously-separate ideas (per-platform and per-source-identity overrides) with one generic hook.

Example config:

model:
  default: my-fast-model
  routes:
    - match: { source_kind: owner }         # owner everywhere
      model: my-strong-model
      api_key: sk-owner
    - match: { platform: hub }              # all Hub peers
      model: my-fast-model
    - match: { source_kind: cron }
      model: my-fast-model
    - match: { platform: discord, source_kind: stranger }   # compound
      model: some-other

First match wins. If no route matches, the base model: block is used unchanged. Partial overrides are supported: any field a route omits is inherited from the base. Empty values ("", None, []) never overwrite; an explicit empty is treated as "no opinion", so the base value survives.
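The matching rules above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the name apply_route_sketch is hypothetical, the argument order follows the apply_route signature listed under "How it works", and two behaviours are assumptions (a route with an empty match is skipped, and every non-model field is copied into runtime_kwargs).

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    # "", None and [] mean "no opinion" and never overwrite a base value.
    return value is None or value == "" or value == []


def apply_route_sketch(model, runtime_kwargs, model_config, context):
    """Return (model, runtime_kwargs) after applying the first matching route."""
    context = context or {}
    routes = (model_config or {}).get("routes") or []
    for route in routes:
        match = route.get("match") or {}
        if not match:
            continue  # assumption: a route without a predicate never matches
        # Subset predicate: every key in `match` must be present in the
        # context with an equal value; a missing context key means no match.
        if not all(k in context and context[k] == v for k, v in match.items()):
            continue
        new_kwargs = dict(runtime_kwargs)
        for key, value in route.items():
            if key == "match" or _is_empty(value):
                continue  # omitted or empty fields inherit the base value
            if key == "model":
                model = value
            else:
                new_kwargs[key] = value
        return model, new_kwargs  # first match wins
    return model, dict(runtime_kwargs)  # no match: base values unchanged
```

With the example config, a context of {platform: telegram, source_kind: owner} would select my-strong-model with api_key sk-owner, while a context that lacks source_kind falls through the compound Discord route and keeps the base model.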

Backwards compatibility

Two shorthand forms are honored via an internal normalizer that synthesizes routes entries at match time:

| Shorthand | Synthesized route |
| --- | --- |
| model.platforms.<name>: "X" | {match: {platform: <name>}, model: "X"} |
| model.platforms.<name>: {model, base_url, api_key, ...} | {match: {platform: <name>}, ...} |
| model.by_source.<kind>: {model, ...} | {match: {source_kind: <kind>}, ...} |

Explicit routes always evaluate first, so any existing model.platforms.* config keeps working unchanged; a new routes: entry can be added alongside and will take precedence.
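One way such a normalizer could look is sketched below. The function name normalize_routes_sketch is hypothetical, and the relative order of the platforms and by_source shims is an assumption; the description only states that explicit routes come first.

```python
def normalize_routes_sketch(model_config):
    """Return explicit routes followed by routes synthesized from shorthand."""
    cfg = model_config or {}
    # Explicit routes keep their position at the front, so they always win.
    routes = list(cfg.get("routes") or [])

    # model.platforms.<name>: either a bare model string or a dict of overrides.
    for name, value in (cfg.get("platforms") or {}).items():
        overrides = {"model": value} if isinstance(value, str) else dict(value or {})
        routes.append({**overrides, "match": {"platform": name}})

    # model.by_source.<kind>: a dict of overrides keyed by source kind.
    for kind, value in (cfg.get("by_source") or {}).items():
        routes.append({**dict(value or {}), "match": {"source_kind": kind}})

    return routes
```

Because the synthesized entries are appended after the explicit list, a first-match-wins loop over the result gives explicit routes precedence over an equivalent legacy entry without any extra priority logic.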

Why unified

Routing by platform and routing by source identity are orthogonal — same function, different predicates. Shipping them as separate hooks (each with its own config key and function call) means _resolve_session_agent_runtime grows a new layer every time a new predicate is added, and the config surface grows with it. One matcher with a structured predicate is cleaner, and lets users compose (platform + source_kind) without any code change.

Why source_kind matters on top of platform

Per-platform alone can't distinguish the operator from a random stranger on Telegram; per-source alone can't distinguish Hub from Telegram. Both axes together cover the common needs:

  • "Owner on any channel gets the strong model."
  • "Agent-to-agent Hub traffic uses the cheap model."
  • "Strangers on Discord get isolated to a compliance-friendly provider."

How it works

  • New agent.smart_model_routing.apply_route(model, runtime_kwargs, model_config, context) — pure helper. Normalizes legacy shorthand, iterates routes, returns (model, runtime_kwargs).
  • New GatewayRuntime._classify_source_kind(source): classifies owner (home-channel DM match), hub_peer (platform value "hub", compared as a string so it stays forward-compatible with any enum definition), stranger (everything else).
  • New GatewayRuntime._build_routing_context(source) — assembles {platform, source_kind}.
  • Hook point: _resolve_session_agent_runtime calls apply_route as the final layer, below session /model overrides and above base config. Silent fallback on exception — never blocks a turn.
  • Cron scheduler and HermesCLI apply apply_route with their own hardcoded context ({platform: cron, source_kind: cron} / {platform: cli, source_kind: owner}) before resolve_turn_route.
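The classifier, context builder, and silent fallback described in the list above might be sketched like this. Everything here is illustrative: SourceSketch, its fields, the home_chat_id parameter, and the order of the owner and hub checks are assumptions, since the PR text does not show the real source object or method bodies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceSketch:
    # Hypothetical stand-in for the gateway's message source object.
    platform: str
    chat_id: str
    is_dm: bool = False


def classify_source_kind_sketch(source, home_chat_id: Optional[str]) -> str:
    # Owner: a direct message arriving on the configured home channel.
    if source.is_dm and home_chat_id is not None and source.chat_id == home_chat_id:
        return "owner"
    # Hub peers are detected by the platform's string value, so no
    # particular enum member has to exist.
    if str(source.platform).lower() == "hub":
        return "hub_peer"
    return "stranger"


def build_routing_context_sketch(source, home_chat_id: Optional[str]) -> dict:
    return {
        "platform": str(source.platform).lower(),
        "source_kind": classify_source_kind_sketch(source, home_chat_id),
    }


def route_with_fallback(apply_route, model, runtime_kwargs, model_config, context):
    # Silent fallback: a routing failure must never block a turn, so any
    # exception leaves the already-resolved model and kwargs untouched.
    try:
        return apply_route(model, runtime_kwargs, model_config, context)
    except Exception:
        return model, runtime_kwargs
```

The cron and CLI call sites would not need the classifier at all under this sketch, since they pass the fixed contexts named in the last bullet.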

The existing message-text-based router (resolve_turn_route / smart_model_routing.cheap_model) is untouched — it's a peer hook on a different axis. AIAgent construction signature, session keys, agent caching, and sub-agent inheritance are all unchanged.

Tests

14 new unit tests for apply_route:

  • empty context, no routes, null config (all no-op)
  • platform-only match, source_kind-only match, compound match
  • first-match-wins ordering
  • partial overrides (base fields preserved)
  • empty fields don't overwrite
  • missing context key → no match
  • legacy platforms shim (string shorthand + dict)
  • legacy by_source shim
  • explicit routes beat legacy shims on the same predicate

All 6 existing smart_model_routing tests still pass. 20/20 green.

Test plan

  • Unit: 20/20 pass (pytest tests/agent/test_smart_model_routing.py)
  • Live: verified end-to-end on a test VM — CLI turn with a matching routes entry swaps both model and base_url in the outbound API request; default model path unchanged for unmatched auxiliary calls.
  • Parse-check on all modified files.

🤖 Generated with Claude Code

@handsdiff handsdiff force-pushed the feat/model-routing branch from 8426ea2 to 67f1f98 Compare April 19, 2026 22:11
@handsdiff handsdiff force-pushed the feat/model-routing branch from 67f1f98 to bb4a62a Compare April 21, 2026 19:06
@alt-glitch added labels on Apr 21, 2026: type/feature (New feature or request), comp/agent (Core agent loop, run_agent.py, prompt builder), comp/cli (CLI entry point, hermes_cli/, setup wizard), comp/gateway (Gateway runner, session dispatch, delivery), comp/cron (Cron scheduler and job management)
@alt-glitch

Collaborator

Supersedes #12227 (per-source model selection) with a unified match-based router.

…er-platform + per-source drafts)

Unify two complementary-but-separate ideas — per-platform model overrides
and per-source-identity model overrides — into one match-based router.
``model.routes`` is a list of ``{match, model, provider, api_key, ...}``
entries; each ``match`` is a subset predicate against a context dict of
``{platform, source_kind, ...}`` built by the caller. First match wins;
no match leaves the base model untouched.

    model:
      default: my-fast-model
      routes:
        - match: { source_kind: owner }        # owner everywhere
          model: my-strong-model
          api_key: sk-owner
        - match: { platform: hub }             # all hub peers
          model: my-fast-model
        - match: { source_kind: cron }
          model: my-fast-model
        - match: { platform: discord, source_kind: stranger }  # compound
          model: some-other

Backwards-compatible shorthand — both forms synthesize ``routes`` entries
internally:

    model.platforms.<name>:       # existing per-platform shorthand
    model.by_source.<kind>:       # per-source shorthand

Explicit ``routes`` always evaluate first; legacy shims run last, so an
explicit route always wins over an equivalent legacy entry.

Hook points:

- ``agent.smart_model_routing.apply_route`` — pure helper, normalizes
  legacy shorthand, iterates routes, returns ``(model, runtime_kwargs)``.
- ``GatewayRuntime._classify_source_kind(source)`` — classifies
  owner / hub_peer / stranger via home-channel match + platform-value
  string. No dependency on a specific ``Platform.HUB`` enum member.
- ``GatewayRuntime._build_routing_context(source)`` — assembles the
  context dict consumed by ``apply_route``.
- ``GatewayRuntime._resolve_session_agent_runtime`` calls ``apply_route``
  as the final layer, below session ``/model`` overrides and above base
  config. Silent fallback on exception — never blocks a turn.
- Cron scheduler applies with context ``{platform: cron, source_kind:
  cron}`` after runtime resolution and before AIAgent construction.
- HermesCLI applies with context ``{platform: cli, source_kind: owner}``
  in ``_resolve_turn_agent_config``.

This supersedes the earlier ``feat/model-routing`` branch that was cut
before upstream NousResearch#12732 wholesale-removed the separate ``smart_model_routing``
cheap-model router. This rewrite drops the obsolete ``resolve_turn_route`` /
``cheap_model`` integration and lands ``apply_route`` as a standalone
feature on a new ``agent/smart_model_routing.py`` file.

Tests: 14 new unit tests covering the matcher (empty context / no routes
/ platform match / source_kind match / compound match / first-match-wins
/ partial override / empty-value rejection / missing context key / null
config) plus legacy shims (platforms string / platforms dict /
by_source). All pass. No regressions in tests/gateway/, tests/cron/,
or tests/agent/ beyond 12 pre-existing upstream-main failures (dingtalk,
matrix, agent_cache) unrelated to this change.

Supersedes NousResearch#7297 (feat/per-platform-model) and the prior draft at NousResearch#12227
(feat/per-source-model). ``model.platforms.*`` configs keep working via
the legacy shim, so deployments on NousResearch#7297 need no migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@handsdiff handsdiff force-pushed the feat/model-routing branch from ff32d49 to eb52caa Compare April 24, 2026 03:26
handsdiff added a commit to handsdiff/hermes-agent that referenced this pull request Apr 24, 2026
…g), drop NousResearch#7297, add NousResearch#12207

- Add rows for NousResearch#12234 (match-based model routing, supersedes NousResearch#7297 and draft NousResearch#12227)
  and NousResearch#12207 (compound-background-subshell-leak).
- Move NousResearch#7297 into a new 'Closed / superseded' section; note the branches
  (feat/per-platform-model and feat/per-source-model) are already deleted from origin.
- Update rebase workflow: swap the feat/per-platform-model line for feat/model-routing,
  add fix/compound-background-subshell-leak.
- Update the fork-main rebuild section: document that octopus strategy fails on
  adjacent-region additions and switch the documented command to a sequential-merge
  loop. Note the recurring conflict site (_classify_source_kind vs _is_owner_source)
  and the union-resolve strategy for it.
- Add PR-specific note for NousResearch#12234 covering the helper, classifier, legacy shim, and
  rebase-conflict guidance.
@teknium1

Contributor

Closing as stale — the code path this addresses no longer exists on main.

Triage notes (high confidence):
PR adds agent/smart_model_routing.py with model.routes/by_source/source_kind wiring. Merged PR #12732 (2026-04-20, 'refactor: remove smart_model_routing feature') explicitly removed this feature from main; reintroducing it contradicts the deliberate refactor — no smart_model_routing references remain on origin/main.

If this PR's intent is still relevant against the current code, please rebase or open a fresh PR.

(Bulk-closed during a CLI PR triage sweep.)

@teknium1 teknium1 closed this May 24, 2026
3 participants