Feat/delegate model parameter#7586
…routing
Lets the parent agent (or the LLM via tool-calling) pick a model per
subagent invocation, using the same resolution pipeline as the /model
slash command: aliases, direct mappings, catalog search, and credential
resolution. Per-task model beats top-level model, which beats
delegation.model config, which falls back to inheriting the parent.
This unlocks cost/speed/capability routing for subagent-driven
development — e.g. dispatch a haiku for a trivial lookup, a sonnet for
a moderate refactor, and glm-4.7 for a bulk research task, all inside
a single delegate_task batch call.
Changes:
- tools/delegate_tool.py
  - New _resolve_model_override() helper that wraps switch_model() and
    returns a credential bundle compatible with _build_child_agent's
    override_* params. Strips --global to ensure per-task overrides
    never persist to config.yaml.
  - delegate_task() gains an optional model= kwarg, threaded through
    task normalization and the child-build loop so each subagent can
    resolve credentials independently.
  - DELEGATE_TASK_SCHEMA advertises the new model field at the top
    level and inside each task object, with descriptions the LLM can
    use to decide when to route to which model.
  - Registry handler forwards args['model'] to delegate_task().
- tests/tools/test_delegate.py
  - TestResolveModelOverride covers bare name, --provider flag, the
    --global strip-but-ignore guarantee, switch_model failures, and
    empty input.
  - TestDelegateTaskModelOverride covers top-level override, the
    per-task > top-level > delegation config precedence, no-override
    falling through to delegation config, bad model names surfacing
    as JSON errors, and the full registry dispatch path.
All 82 delegate tests pass (67 existing + 10 new + 5 toolset scope).
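The --global stripping described above can be illustrated with a minimal sketch. The helper name strip_global_flag is hypothetical and is not taken from tools/delegate_tool.py; it only shows one way a per-task model string could be sanitized before being handed to the resolution pipeline.

```python
def strip_global_flag(spec: str) -> str:
    """Drop every '--global' token from a model spec string.

    Hypothetical helper for illustration only. It assumes the spec is a
    whitespace-separated string such as 'sonnet --global' or
    'stepfun/step-3.5-flash --provider openrouter --global'.
    """
    # Keep every token except the persistence flag, preserving order.
    kept = [token for token in spec.split() if token != "--global"]
    return " ".join(kept)


if __name__ == "__main__":
    print(strip_global_flag("sonnet --global"))
    print(strip_global_flag("stepfun/step-3.5-flash --provider openrouter --global"))
```

Stripping the flag rather than rejecting it means a spec copied from a /model invocation would still resolve, while the override stays scoped to the one subagent.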
The initial schema said only "supports optional --provider flag" without showing a concrete example. When asked to route a subagent through a different provider, the LLM reached for the intuitively natural 'provider:model' colon-prefix syntax (e.g. 'openrouter:stepfun/step-3.5-flash'), but colons in hermes are reserved for OpenRouter variant suffixes (:free, :extended, :thinking, :fast), so the colon-prefix form was passed raw to the parent's provider and rejected as an Unknown Model.
Fix:
- Top-level and per-task 'model' field descriptions now show three concrete syntax forms: bare ID, short alias, and '--provider <slug>' with worked examples (stepfun/step-3.5-flash --provider openrouter, claude-opus-4-6 --provider anthropic, deepseek-chat --provider deepseek).
- Valid provider slugs are enumerated so the LLM doesn't have to guess.
- The colon-prefix anti-pattern is explicitly called out as DO NOT with an example, since LLMs gravitate toward it. This keeps delegate_task consistent with the existing /model slash command, which also uses --provider exclusively (see hermes_cli/model_switch.py:16-18).
- Main description MODEL SELECTION bullet updated with the same examples.
- New TestDelegateRequirements.test_schema_documents_provider_switch_syntax regression guard asserts the concrete --provider example and colon-prefix anti-pattern stay in the schema across future refactors.
Behaviour unchanged; this is a schema-description-only fix.
All 83 delegate tests pass.
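As a rough illustration of why the colon is a poor provider separator, the sketch below compares two naive parsers. Both functions are hypothetical and do not reflect how hermes actually parses model strings; the suffix set is the one listed in the commit message above.

```python
# Variant suffixes named in the commit message; treated here as a fixed set.
VARIANT_SUFFIXES = {"free", "extended", "thinking", "fast"}


def naive_provider_split(spec):
    """Hypothetical parser that treats the first colon as 'provider:model'."""
    provider, sep, model = spec.partition(":")
    return (provider, model) if sep else (None, spec)


def variant_split(spec):
    """Hypothetical parser that treats a trailing ':suffix' as a variant tag."""
    base, sep, suffix = spec.rpartition(":")
    if sep and suffix in VARIANT_SUFFIXES:
        return base, suffix
    return spec, None


if __name__ == "__main__":
    # A provider-prefix parser misreads a variant tag as the model name.
    print(naive_provider_split("anthropic/claude-sonnet-4:thinking"))
    # A variant-only parser leaves the provider prefix inside the model ID.
    print(variant_split("openrouter:stepfun/step-3.5-flash"))
```

Under these assumptions neither reading is safe for both inputs, which is the conflict the --provider flag avoids.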
We independently built the same feature on our fork (oikos homelab deployment) and can confirm the per-task model parameter works well in practice. A few additions from our experience that might be worth considering:
1. Model tiers (small/medium/large)
We added
delegation:
  model_tiers:
    small: gemma4-nothink # fast/cheap: file exploration, summarization
    # medium: inherits parent model
    large: claude-sonnet-4-6 # complex reasoning, peer review, escalation
Tier names resolve to configured model names. The agent doesn't need to know deployment-specific model identifiers.
2. A lightweight tool that returns available models with tier assignments, context lengths, and providers. Lets the agent make informed decisions about which model to use for each delegation.
3. Why this matters more than
We tried upstream's
The model-directed approach (this PR) is fundamentally better: the model has full context and knows when a task is simple enough for a smaller model. We've seen it correctly route file exploration to Gemma 4 27B and keep complex debugging on Qwen 3.5 397B.
+1 for merging this. The per-task model override in batch mode is especially valuable for mixed workloads.
I don't think it should be called model_tiers; it's just an alias, and you could call it whatever you want.
Per-Task Model Parameter for delegate_task
Summary
delegate_task now accepts an optional model parameter at both the top level and per-task inside the tasks array. It routes each subagent through the same resolution pipeline as the /model slash command: aliases, direct mappings, catalog search, credential resolution, and cross-provider routing.
Key benefit: Enables subagent-driven development where the main agent picks model capability per task (haiku for lookups, sonnet for moderate reasoning, opus for the hard stuff), all in parallel, improving cost, speed, and context preservation in the parent.
What Changed
1. Core Implementation (tools/delegate_tool.py)
New Helper: _resolve_model_override()
- Wraps hermes_cli.model_switch.switch_model to convert user-friendly model strings into full credential bundles
- "haiku", "sonnet", "glm-4.7"
- "opus", "grok", "gemini"
- "stepfun/step-3.5-flash --provider openrouter"
- /model pipeline including alias resolution and provider credential lookup
- tool_error messages so the LLM sees them and can retry
Updated delegate_task() Signature
Model Precedence (per-task)
1. model in the task dict (highest priority)
2. model argument to delegate_task(...)
3. delegation.model config from ~/.hermes/config.yaml
Task Normalization Loop
Each task in the batch now resolves its own credentials:
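A minimal sketch of the precedence rule, assuming each task is a plain dict with an optional "model" key. The function name pick_model and the "goal" key are hypothetical and are not taken from delegate_tool.py; None stands for "inherit the parent's model".

```python
from typing import Optional


def pick_model(task: dict,
               top_level_model: Optional[str],
               config_model: Optional[str]) -> Optional[str]:
    """Return the first non-empty model spec in precedence order.

    Order: per-task model, then top-level model, then delegation.model
    from config. None means no override, so the child inherits the parent.
    """
    for candidate in (task.get("model"), top_level_model, config_model):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


if __name__ == "__main__":
    tasks = [
        {"goal": "look up a constant", "model": "haiku"},  # per-task override
        {"goal": "refactor a module"},                      # falls to top level
    ]
    for task in tasks:
        print(task["goal"], "->", pick_model(task, "sonnet", "glm-4.7"))
```

Resolving the winner per task, rather than once per batch, is what would let a single batch mix models.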
DELEGATE_TASK_SCHEMA Updates
- model field at top level with examples
- model field inside tasks[].items schema
- --provider flag syntax to guide the LLM
2. Documentation (website/docs/user-guide/features/delegation.md)
New "Per-Task Model Selection" section covering:
- --provider switch)
Why This Matters
Subagent-Driven Development
Traditionally, all subagents inherit the parent's model. With per-task routing:
Benefits
How to Test
Unit Tests
All 83 tests pass (67 existing + 10 model override + 5 toolset scope + 1 schema regression guard):
New test classes:
- TestResolveModelOverride: model name parsing, provider flag, empty input, switch_model failures
- TestDelegateTaskModelOverride: precedence rules, batch with mixed models, per-task beats top-level, config fallback, bad model returns JSON error, registry dispatch
Manual Smoke Test
This is the actual end-to-end test run with three models across providers:
Then ask:
Live Smoke Test Output
Verification checklist from the smoke test:
- --provider syntax (not the invalid colon-prefix)
Design Decisions
Why --provider <slug> Instead of Colon-Prefix?
The colon is semantically reserved in hermes for OpenRouter's variant suffixes:
- anthropic/claude-sonnet-4:thinking (variant tag for extended thinking)
- meta-llama/llama-3.3-70b:free (variant tag for free tier)
- google/gemini-2.5-pro:fast (variant tag for fast inference)
A colon-prefix provider syntax (openrouter:stepfun/step-3.5-flash) would create ambiguous parses.
The --provider <slug> approach is explicit and matches the /model command behavior already familiar to hermes users.
Anti-Pattern Documentation
The schema explicitly warns against colon-prefix syntax with concrete examples, so the LLM doesn't invent it again without prompting. The test test_schema_documents_provider_switch_syntax ensures this anti-pattern stays documented if the schema is refactored.
Error Handling
If model resolution fails (e.g., model not found, provider not authenticated):
- _resolve_model_override() raises ValueError with a clear message
- tool_error message
This is by design: it forces explicit errors instead of silent fallbacks that hide bugs.
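The fail-loudly behaviour might look roughly like the sketch below. Everything here is a stand-in: fake_resolver replaces the real _resolve_model_override(), and the JSON shape is an assumption, not the actual tool_error format.

```python
import json


def fake_resolver(spec: str) -> dict:
    """Stand-in for the real resolver; knows only two hypothetical models."""
    known = {"haiku", "sonnet"}
    if spec not in known:
        raise ValueError(f"Unknown model: {spec!r}")
    return {"model": spec}


def run_task(task: dict, resolver) -> str:
    """Resolve one task's model; return a JSON string with result or error."""
    try:
        bundle = resolver(task["model"])
    except ValueError as exc:
        # Surface the failure instead of silently using the parent's model.
        return json.dumps({"error": str(exc)})
    return json.dumps({"status": "ok", "model": bundle["model"]})


def run_batch(tasks: list, resolver) -> list:
    """Each task resolves independently, so one bad model does not stop the rest."""
    return [run_task(task, resolver) for task in tasks]


if __name__ == "__main__":
    for line in run_batch([{"model": "haiku"}, {"model": "nope"}], fake_resolver):
        print(line)
```

Returning the error as data keeps it visible to the calling LLM, which can then retry with a corrected model string.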
Files Changed
Commits
- 121c8b7 feat(delegate): add per-task model parameter for on-the-fly subagent routing (_resolve_model_override(), delegate_task signature, per-task credentials)
- 75ddbeb fix(tools): clarify delegate_task model parameter syntax in schema (--provider flag examples)
- 1bafd42 docs(delegation): document per-task model selection feature
Backwards Compatibility
- delegation.model in config.yaml)
Platforms Tested
Future Work (Out of Scope)
Reviewer Notes
The anti-pattern warning in the schema is load-bearing — it prevents the LLM from inventing invalid syntax. If schema is refactored, keep the concrete examples and colon-prefix warning in place.
_resolve_model_override() reuses hermes_cli.model_switch.switch_model, so any future changes to model routing automatically flow through delegation. No duplication.
Error handling is deliberate: if model resolution fails, it surfaces as a JSON error so the LLM sees it and can retry. We don't silently fall back to the parent's model.
The smoke test demonstrates the real-world use case: three tasks on three models across two providers in parallel. If any provider routing fails, the error is clear and non-blocking for the other subagents.