Feat/delegate model parameter by Labhund · Pull Request #7586 · NousResearch/hermes-agent

Labhund · 2026-04-11T07:04:00Z

Per-Task Model Parameter for delegate_task

Summary

delegate_task now accepts an optional model parameter at both the top level and per-task inside the tasks array. It routes each subagent through the same resolution pipeline as the /model slash command — aliases, direct mappings, catalog search, credential resolution, and cross-provider routing.

Key benefit: Enables subagent-driven development where the main agent picks model capability per task — haiku for lookups, sonnet for moderate reasoning, opus for the hard stuff, all in parallel, improving cost, speed, and context preservation in the parent.

What Changed

1. Core Implementation (tools/delegate_tool.py)

New Helper: `_resolve_model_override()`

Wraps hermes_cli.model_switch.switch_model to convert user-friendly model strings into full credential bundles
Handles three syntax forms:
- Bare model ID: "haiku", "sonnet", "glm-4.7"
- Short alias: "opus", "grok", "gemini"
- Provider switch: "stepfun/step-3.5-flash --provider openrouter"
Reuses the exact /model pipeline including alias resolution and provider credential lookup
Surfaces resolution errors as JSON tool_error messages so the LLM sees them and can retry
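
The three syntax forms can be illustrated with a small parser. This is only a sketch: `parse_model_spec` and `ModelSpec` are hypothetical names, and the real helper hands the string to `hermes_cli.model_switch.switch_model` rather than parsing it by hand. The `--global` stripping mirrors the behavior described in the first commit message.

```python
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelSpec:
    """Hypothetical container for a parsed model string."""
    model: str
    provider: Optional[str] = None


def parse_model_spec(raw: str) -> ModelSpec:
    """Split 'model [--provider slug]' into parts; illustrative only."""
    tokens = shlex.split(raw or "")
    # Drop --global so a per-task override can never be persisted.
    tokens = [t for t in tokens if t != "--global"]
    if not tokens:
        raise ValueError("model string is empty")

    provider = None
    if "--provider" in tokens:
        idx = tokens.index("--provider")
        if idx + 1 >= len(tokens):
            raise ValueError("--provider requires a slug")
        provider = tokens[idx + 1]
        del tokens[idx:idx + 2]

    if len(tokens) != 1:
        raise ValueError(f"expected exactly one model name, got {tokens!r}")
    return ModelSpec(model=tokens[0], provider=provider)


print(parse_model_spec("haiku"))
print(parse_model_spec("stepfun/step-3.5-flash --provider openrouter"))
```

Alias lookup, catalog search, and credential resolution are deliberately left out here; in the PR they come from the shared /model pipeline.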

Updated `delegate_task()` Signature

from typing import Optional

def delegate_task(
    goal: Optional[str] = None,
    context: Optional[str] = None,
    model: Optional[str] = None,  # NEW: per-task model override
    tasks: Optional[list] = None,
    # ... other params
) -> dict:
    ...  # body elided

Model Precedence (per-task)

1. Per-task model in the task dict (highest priority)
2. Top-level model argument to delegate_task(...)
3. delegation.model config from ~/.hermes/config.yaml
4. Parent's model (fallback: inherit)

Task Normalization Loop

Each task in the batch now resolves its own credentials:

for task in tasks:
    task_model = task.get("model") or model or delegation_model or parent_model
    task_credentials = _resolve_model_override(task_model, parent_agent)
    # build child with task-specific model
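
A runnable version of this loop, with the resolver stubbed out, shows how the precedence order plays out across a mixed batch. `resolve_stub` and `normalize_tasks` are hypothetical names used for illustration, and the fake credential dict stands in for whatever `_resolve_model_override()` actually returns.

```python
from typing import Optional


def resolve_stub(model: str) -> dict:
    """Stand-in for _resolve_model_override(); returns fake credentials."""
    return {"model": model, "api_key": f"<key for {model}>"}


def normalize_tasks(
    tasks: list,
    model: Optional[str] = None,
    delegation_model: Optional[str] = None,
    parent_model: str = "parent-model",
) -> list:
    """Attach resolved credentials to every task, honoring precedence."""
    normalized = []
    for task in tasks:
        chosen = task.get("model") or model or delegation_model or parent_model
        normalized.append({**task, "credentials": resolve_stub(chosen)})
    return normalized


batch = [
    {"goal": "Find TODOs in src/", "model": "haiku"},  # per-task value wins
    {"goal": "Redesign auth flow"},                    # falls back to top-level
]
for item in normalize_tasks(batch, model="sonnet"):
    print(item["goal"], "->", item["credentials"]["model"])
```

With `model="sonnet"` at the top level, the first task still runs on haiku and the second picks up sonnet.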

DELEGATE_TASK_SCHEMA Updates

Added model field at top level with examples
Added model field inside tasks[].items schema
Explicit anti-pattern documentation: "DO NOT use colon-prefix syntax (e.g. 'openrouter:stepfun/...')"
Concrete examples of the --provider flag syntax to guide the LLM
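
As a rough picture of where the two model fields sit, the schema might be shaped like the following. This is a sketch, not the actual DELEGATE_TASK_SCHEMA: the description wording and the `SCHEMA_SKETCH` name are assumptions, and only the field names (goal, context, model, tasks) come from this PR.

```python
import json

# Assumed wording; the real description in the PR may differ.
MODEL_DESCRIPTION = (
    "Optional model for the subagent. Accepts a bare ID ('glm-4.7'), "
    "a short alias ('opus'), or a provider switch "
    "('stepfun/step-3.5-flash --provider openrouter'). "
    "DO NOT use colon-prefix syntax (e.g. 'openrouter:stepfun/...')."
)

SCHEMA_SKETCH = {
    "name": "delegate_task",
    "parameters": {
        "type": "object",
        "properties": {
            "goal": {"type": "string"},
            "context": {"type": "string"},
            # Top-level model: default for every task in the call.
            "model": {"type": "string", "description": MODEL_DESCRIPTION},
            "tasks": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "goal": {"type": "string"},
                        # Per-task model: overrides the top-level value.
                        "model": {
                            "type": "string",
                            "description": MODEL_DESCRIPTION,
                        },
                    },
                    "required": ["goal"],
                },
            },
        },
    },
}

print(json.dumps(SCHEMA_SKETCH["parameters"]["properties"]["model"], indent=2))
```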

2. Documentation (website/docs/user-guide/features/delegation.md)

New "Per-Task Model Selection" section covering:

Model parameter syntax (bare, alias, --provider switch)
Valid provider slugs list
Batch with mixed models example
Model precedence rules
Configuration fallback
Common patterns (cost optimization, rate-limit relief, OpenRouter :free variants)

Why This Matters

Subagent-Driven Development

Traditionally, all subagents inherit the parent's model. With per-task routing:

# Before: all subagents run on parent's model (expensive if parent is opus)
delegate_task(tasks=[
    {"goal": "Find TODOs in src/"},           # Could use cheap haiku
    {"goal": "Redesign auth flow"},           # Needs sonnet
    {"goal": "Analyze algorithm complexity"}  # Needs opus
])

# After: route each to the right capability level
delegate_task(tasks=[
    {"goal": "Find TODOs in src/", "model": "haiku"},
    {"goal": "Redesign auth flow", "model": "sonnet"},
    {"goal": "Analyze algorithm", "model": "opus"}
])

Benefits

Cost efficiency — cheap fast models for simple lookups, expensive models only where needed
Speed — haiku tasks complete in seconds, not waiting for opus
Context preservation — parent doesn't bloat with subagent conversation history; isolated contexts per model
Rate-limit relief — spread load across multiple providers in parallel
Provider agility — main agent can route to best-in-class models (e.g., grok for reasoning, glm-4.7 for coding)

How to Test

Unit Tests

All 83 tests pass (67 existing + 10 model override + 5 toolset scope + 1 schema regression guard):

pytest tests/tools/test_delegate.py -v

New test classes:

TestResolveModelOverride — model name parsing, provider flag, empty input, switch_model failures
TestDelegateTaskModelOverride — precedence rules, batch with mixed models, per-task beats top-level, config fallback, bad model returns JSON error, registry dispatch

Manual Smoke Test

This is the actual end-to-end test run with three models across providers:

./hermes chat

Then ask:

please in parallel delegate a haiku on three seperate animals to glm-4.7 via zai, 
glm-5.1 via not providing a provider and stepfun/step-3.5-flash via openrouter

Live Smoke Test Output

Initializing agent...
────────────────────────────────────────

┌─ Reasoning ───────────────────────────────────────────────────────────────────────────────────┐
The user wants me to delegate three parallel tasks, each requesting a haiku on a different
animal, to three different models:

glm-4.7 via zai provider
glm-5.1 with no provider specified (will use default)
stepfun/step-3.5-flash via openrouter provider

I need to use the delegate_task function with tasks parameter to run these in parallel.
Each task will be a delegation to a different model as specified.

Let me construct the tasks array with the proper model specifications.
└───────────────────────────────────────────────────────────────────────────────────────────────┘

▣ preparing delegate_task…

✓ [1/3] Write a haiku about a lion. (4.7s)
✓ [2/3] Write a haiku about a dolphin. (9.43s)
✓ [3/3] Write a haiku about a falcon. (12.78s)

┌─ Reasoning ───────────────────────────────────────────────────────────────────────────────────┐
All three tasks completed successfully. Let me present the results clearly.
└───────────────────────────────────────────────────────────────────────────────────────────────┘

╭─ ✿ Sakura ────────────────────────────────────────────────────────────────────────────────────╮

Done. All three haikus completed in parallel (14.08s total):

[GLM-4.7 via ZAI] - Lion:
Golden mane so bright
Roaring loud across the plains
King of all the land

[GLM-5.1] - Dolphin:
Silver leap through waves,
Joyful dance in ocean spray,
Playful sea-born friend.

[StepFun/Step-3.5-Flash via OpenRouter] - Falcon:
A falcon takes flight,
Wings cut the morning's chill air—
Sky's arrow returns.


Verification checklist from the smoke test:

✅ GLM-4.7 via Z.AI — correct model + provider routed
✅ GLM-5.1 (no provider) — correct model routed to default provider (Z.AI)
✅ StepFun via OpenRouter — correct model + provider routed
✅ All 3 subagents ran in parallel (14.08s total, versus roughly 27s for the three tasks run sequentially)
✅ Each subagent got correct credentials and produced output
✅ Schema examples were clear enough that the LLM used correct --provider syntax (not the invalid colon-prefix)
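
The parallelism claim can be sanity-checked against the per-task timings in the log. The sketch below assumes that a sequential run would take about the sum of the three durations, and that a parallel run cannot finish faster than its slowest task.

```python
# Per-task durations in seconds, copied from the smoke test log.
durations = {"lion": 4.7, "dolphin": 9.43, "falcon": 12.78}
reported_total = 14.08  # wall-clock total reported by the agent

sequential_estimate = round(sum(durations.values()), 2)
slowest_task = max(durations.values())
overhead = round(reported_total - slowest_task, 2)

print(f"sequential estimate: {sequential_estimate}s")
print(f"slowest task: {slowest_task}s")
print(f"overhead beyond the slowest task: {overhead}s")
```

The reported total sits just above the slowest task and well below the sum, which is what overlapping execution would look like.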

Design Decisions

Why `--provider <slug>` Instead of Colon-Prefix?

The colon is semantically reserved in hermes for OpenRouter's variant suffixes:

anthropic/claude-sonnet-4:thinking — variant tag for extended thinking
meta-llama/llama-3.3-70b:free — variant tag for free tier
google/gemini-2.5-pro:fast — variant tag for fast inference

A colon-prefix provider syntax (openrouter:stepfun/step-3.5-flash) would create ambiguous parses:

Is the first colon for provider routing or a variant tag?
If a model ID itself contains a colon (custom provider), how is it disambiguated?

The --provider <slug> approach is explicit and matches the /model command behavior already familiar to hermes users.
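
The ambiguity is easy to reproduce. The sketch below uses a hypothetical `naive_colon_split` that treats everything before the first colon as a provider; it is not code from hermes, only an illustration of why that rule cannot coexist with variant suffixes.

```python
def naive_colon_split(spec: str) -> tuple:
    """Hypothetical parser: text before the first colon is the provider."""
    provider, _, model = spec.partition(":")
    return provider, model


for spec in (
    "openrouter:stepfun/step-3.5-flash",         # intended: provider prefix
    "meta-llama/llama-3.3-70b:free",             # intended: variant suffix
    "openrouter:meta-llama/llama-3.3-70b:free",  # both at once
):
    print(spec, "->", naive_colon_split(spec))
```

The second input is misread: the model ID lands in the provider slot and the variant tag is taken as the model name.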

Anti-Pattern Documentation

The schema explicitly warns against colon-prefix syntax with concrete examples, so the LLM doesn't reinvent it on its own. The regression test test_schema_documents_provider_switch_syntax ensures this anti-pattern stays documented if the schema is refactored.
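
A guard of this kind can be as simple as a few substring checks. The following is a sketch of the idea, not the test from the PR: `MODEL_FIELD_DESCRIPTION` is a stand-in string and `check_schema_documents_provider_switch_syntax` is a hypothetical helper name.

```python
# Stand-in for the model field description; the real schema text may differ.
MODEL_FIELD_DESCRIPTION = (
    "Model for this subagent. Examples: 'glm-4.7', 'opus', "
    "'stepfun/step-3.5-flash --provider openrouter'. "
    "DO NOT use colon-prefix syntax (e.g. 'openrouter:stepfun/...')."
)


def check_schema_documents_provider_switch_syntax(description: str) -> None:
    """Fail loudly if the guidance the LLM relies on has been dropped."""
    # Positive example: a worked --provider invocation.
    assert "--provider openrouter" in description
    # Negative example: the colon-prefix warning and its sample.
    assert "DO NOT" in description
    assert "openrouter:" in description


check_schema_documents_provider_switch_syntax(MODEL_FIELD_DESCRIPTION)
print("schema guidance intact")
```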

Error Handling

If model resolution fails (e.g., model not found, provider not authenticated):

_resolve_model_override() raises ValueError with a clear message
The error bubbles up as a JSON tool_error message
The LLM sees the error and can retry with a different model or provider

This is by design — it forces explicit errors instead of silent fallbacks that hide bugs.
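
The flow above can be sketched as follows. The resolver is a stub that only knows two models, and the `{"error": ...}` payload shape is an assumption; the PR states only that failures surface as a JSON tool_error message.

```python
import json


def resolve_or_raise(model: str) -> dict:
    """Stand-in resolver: knows two models, raises on anything else."""
    known = {"haiku", "opus"}
    if model not in known:
        raise ValueError(f"Unknown model: {model!r}")
    return {"model": model}


def delegate_with_errors(model: str) -> str:
    """Return a JSON string; resolution failures become an error payload."""
    try:
        creds = resolve_or_raise(model)
    except ValueError as exc:
        # No silent fallback to the parent's model: report and stop.
        return json.dumps({"error": str(exc)})
    return json.dumps({"status": "ok", "model": creds["model"]})


print(delegate_with_errors("haiku"))
print(delegate_with_errors("openrouter:stepfun/step-3.5-flash"))
```

Because the failure comes back as ordinary tool output, the calling LLM can read the message and retry with a corrected model string.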

Files Changed

tools/delegate_tool.py
  + _resolve_model_override() helper
  + delegate_task() signature: model param
  + _build_child_agent loop: per-task credential resolution
  + DELEGATE_TASK_SCHEMA: model field + examples + anti-pattern warning
  + Registry handler: forward args.get("model")

tests/tools/test_delegate.py
  + TestResolveModelOverride (5 tests)
  + TestDelegateTaskModelOverride (5 tests)
  + test_schema_documents_provider_switch_syntax (regression guard)

website/docs/user-guide/features/delegation.md
  + "Per-Task Model Selection" section (102 lines)
  + Model syntax, precedence, batch examples, patterns
  + Updated existing "Model Override" section

(No breaking changes, no existing tests regressed)

Commits

121c8b7 feat(delegate): add per-task model parameter for on-the-fly subagent routing
- Core plumbing: _resolve_model_override(), delegate_task signature, per-task credentials
- Schema updates with provider syntax examples
- 10 new tests, 67 existing tests all pass
75ddbeb fix(tools): clarify delegate_task model parameter syntax in schema
- Schema anti-pattern documentation (no colon-prefix)
- Concrete --provider flag examples
- Regression-guard test to keep examples stable
1bafd42 docs(delegation): document per-task model selection feature
- Comprehensive user documentation
- Model syntax, precedence, batch patterns, common use cases

Backwards Compatibility

✅ Existing delegation code works unchanged (model param is optional)
✅ Config fallback still works (delegation.model in config.yaml)
✅ No changes to toolsets, max_iterations, depth limit, or interruption behavior
✅ 67 existing tests still pass — no regressions

Platforms Tested

Linux (full end-to-end with GLM-4.7 and GLM-5.1 on Z.AI, and StepFun via OpenRouter)
Three parallel subagents, 14.08s total runtime
Cross-provider credential routing (Z.AI + OpenRouter)
Bash terminal, Python environment

Future Work (Out of Scope)

Automatic model selection based on task complexity (could layer on top of this)
Per-task provider config beyond just model string (low priority — model string is flexible enough)
Caching of model resolution to avoid repeated lookups (could be a micro-optimization)

Reviewer Notes

The anti-pattern warning in the schema is load-bearing: it prevents the LLM from inventing invalid syntax. If the schema is refactored, keep the concrete examples and the colon-prefix warning in place.
_resolve_model_override() reuses hermes_cli.model_switch.switch_model, so any future changes to model routing automatically flow through delegation. No duplication.
Error handling is deliberate: if model resolution fails, it surfaces as a JSON error so the LLM sees it and can retry. We don't silently fall back to the parent's model.
The smoke test demonstrates the real-world use case: three tasks on three models across two providers in parallel. If any provider routing fails, the error is clear and non-blocking for the other subagents.

…routing

Lets the parent agent (or the LLM via tool-calling) pick a model per subagent invocation, using the same resolution pipeline as the /model slash command: aliases, direct mappings, catalog search, and credential resolution. Per-task model beats top-level model, which beats delegation.model config, which falls back to inheriting the parent.

This unlocks cost/speed/capability routing for subagent-driven development, e.g. dispatch a haiku for a trivial lookup, a sonnet for a moderate refactor, and glm-4.7 for a bulk research task, all inside a single delegate_task batch call.

Changes:

- tools/delegate_tool.py
  - New _resolve_model_override() helper that wraps switch_model() and returns a credential bundle compatible with _build_child_agent's override_* params. Strips --global to ensure per-task overrides never persist to config.yaml.
  - delegate_task() gains an optional model= kwarg, threaded through task normalization and the child-build loop so each subagent can resolve credentials independently.
  - DELEGATE_TASK_SCHEMA advertises the new model field at the top level and inside each task object, with descriptions the LLM can use to decide when to route to which model.
  - Registry handler forwards args['model'] to delegate_task().
- tests/tools/test_delegate.py
  - TestResolveModelOverride covers bare name, --provider flag, the --global strip-but-ignore guarantee, switch_model failures, and empty input.
  - TestDelegateTaskModelOverride covers top-level override, per-task > top-level > delegation config precedence, no-override falls through to delegation config, bad model names surface as JSON errors, and the full registry dispatch path.

All 82 delegate tests pass (67 existing + 10 new + 5 toolset scope).

The initial schema said only "supports optional --provider flag" without showing a concrete example. When asked to route a subagent through a different provider, the LLM reached for the intuitively natural 'provider:model' colon-prefix syntax (e.g. 'openrouter:stepfun/step-3.5-flash'), but colons in hermes are reserved for OpenRouter variant suffixes (:free, :extended, :thinking, :fast), so the colon-prefix form was passed raw to the parent's provider and rejected as an Unknown Model.

Fix:

- Top-level and per-task 'model' field descriptions now show three concrete syntax forms: bare ID, short alias, and '--provider <slug>' with worked examples (stepfun/step-3.5-flash --provider openrouter, claude-opus-4-6 --provider anthropic, deepseek-chat --provider deepseek).
- Valid provider slugs are enumerated so the LLM doesn't have to guess.
- The colon-prefix anti-pattern is explicitly called out as DO NOT with an example, since LLMs gravitate toward it. This keeps delegate_task consistent with the existing /model slash command, which also uses --provider exclusively (see hermes_cli/model_switch.py:16-18).
- Main description MODEL SELECTION bullet updated with the same examples.
- New TestDelegateRequirements.test_schema_documents_provider_switch_syntax regression guard asserts the concrete --provider example and colon-prefix anti-pattern stay in the schema across future refactors.

Behaviour unchanged; this is a schema-description-only fix. All 83 delegate tests pass.

malaiwah · 2026-04-11T20:48:27Z

We independently built the same feature on our fork (oikos homelab deployment) and can confirm the per-task model parameter works well in practice.

A few additions from our experience that might be worth considering:

1. Model tiers (small/medium/large)

We added delegation.model_tiers config so the agent can say model="small" instead of needing to know exact model names:

delegation:
  model_tiers:
    small: gemma4-nothink    # fast/cheap — file exploration, summarization
    # medium: inherits parent model
    large: claude-sonnet-4-6  # complex reasoning, peer review, escalation

Tier names resolve to configured model names. The agent doesn't need to know deployment-specific model identifiers.
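
For readers following along, tier resolution along these lines could look like the sketch below. It is an illustration only: `resolve_tier` and `TIER_NAMES` are hypothetical, and the fork's actual lookup code is not shown in this thread. An unconfigured tier inherits the parent model, matching the commented-out `medium` entry above.

```python
TIER_NAMES = ("small", "medium", "large")

# Mirrors the delegation.model_tiers config above; medium is left unset.
MODEL_TIERS = {"small": "gemma4-nothink", "large": "claude-sonnet-4-6"}


def resolve_tier(name: str, tiers: dict, parent_model: str) -> str:
    """Map a tier name to a model; pass other names through unchanged."""
    if name in tiers:
        return tiers[name]
    if name in TIER_NAMES:
        # Known tier with no configured model: inherit from the parent.
        return parent_model
    # Not a tier at all: treat it as an ordinary model string.
    return name


for requested in ("small", "medium", "large", "glm-4.7"):
    print(requested, "->", resolve_tier(requested, MODEL_TIERS, "parent-model"))
```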

2. list_models tool

A lightweight tool that returns available models with tier assignments, context lengths, and providers. Lets the agent make informed decisions about which model to use for each delegation.

3. Why this matters more than smart_model_routing

We tried upstream's smart_model_routing (message-length heuristic for cheap model routing) and disabled it after production testing. Short messages like "yes" or "go ahead" often trigger the most complex operations. Message length is a terrible proxy for task complexity.

The model-directed approach (this PR) is fundamentally better — the model has full context and knows when a task is simple enough for a smaller model. We've seen it correctly route file exploration to Gemma 4 27B and keep complex debugging on Qwen 3.5 397B.

+1 for merging this. The per-task model override in batch mode is especially valuable for mixed workloads.

xlionjuan · 2026-04-19T21:18:52Z

I don't think it should be called model_tiers; it's just an alias, and you could call it whatever you want.

claude added 3 commits April 11, 2026 06:19

docs(delegation): document per-task model selection feature

1bafd42

This was referenced Apr 11, 2026

feat: delegation model tiers + list_models tool (builds on #7586) #7929

Open

is_local_endpoint misses Docker/Podman DNS names — stale timeout fires on local LLM proxies #7905

Closed

feat(delegate): model tiers + list_models tool (builds on #7586) #7957

Open

Hubedge mentioned this pull request Apr 25, 2026

feat(config): introduce reusable model definitions to eliminate repetition across delegation/model/auxiliary #15593

Closed

alt-glitch added the type/feature (New feature or request), P3 (Low: cosmetic, nice to have), and tool/delegate (Subagent delegation) labels Apr 29, 2026

This was referenced Apr 30, 2026

Feature: Allow per-task model selection in delegate_task #6306

Open

feat(todo): add model and provider fields for per-task model override #18880

Closed

feat(delegation): per-task model/provider overrides in batch dispatch #20779

Closed

ozdalva mentioned this pull request May 13, 2026

feat(delegate): per-task model and provider override for delegate_task subagents #25026

Closed

alt-glitch mentioned this pull request May 21, 2026

feat: add model and provider parameters to delegate_task for dynamic model routing #29899

Closed

jmche mentioned this pull request Jun 1, 2026

feat(delegate): add per-task model/provider routing #36790

Open

Labhund wants to merge 3 commits into NousResearch:main from Labhund:feat/delegate-model-parameter

5 participants
