feat(delegate): per-call model/provider override for subagents #3172

Closed
ReqX wants to merge 3 commits into NousResearch:main from ReqX:feature/delegate-per-call-model

@ReqX

ReqX commented Mar 26, 2026

Contributor

Summary

Allow delegate_task() to accept model and provider parameters at call time, overriding config.yaml delegation settings for each invocation.

Resolution priority: task-level > top-level > config.yaml > parent agent inherit
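
The priority chain above can be sketched as a "first value that is set wins" lookup. This is a minimal illustration, not the code in tools/delegate_tool.py: `first_set` and `resolve_model` are hypothetical names, and it assumes that only `None` counts as "not provided" (how the PR treats empty strings is not stated here).

```python
from typing import Optional


def first_set(*candidates: Optional[str]) -> Optional[str]:
    """Return the first candidate that is not None, or None if all are unset."""
    for value in candidates:
        if value is not None:
            return value
    return None


def resolve_model(
    task_model: Optional[str],
    call_model: Optional[str],
    config_model: Optional[str],
    parent_model: str,
) -> str:
    """Apply the chain: task-level > top-level > config.yaml > parent inherit."""
    resolved = first_set(task_model, call_model, config_model)
    return resolved if resolved is not None else parent_model
```

For example, `resolve_model(None, None, "config-model", "parent-model")` falls through the two unset overrides and lands on the config value.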

Motivation

Previously, subagent model/provider could only be set globally via delegation.model / delegation.provider in config.yaml. This PR enables the LLM (or tool callers) to dynamically route specific subagents to different models/providers per invocation — e.g., delegate cheap fast tasks to Gemini Flash while keeping the parent on Claude.

Changes

tools/delegate_tool.py (+74, -15)

  • Add model and provider optional params to delegate_task() signature
  • Add model/provider to top-level and batch task item schema properties
  • Move credential resolution into per-task loop for per-task overrides in batch mode
  • Extend _resolve_delegation_credentials() with override_model / override_provider params
  • Per-call values take precedence over config; config is used as fallback
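
A rough sketch of why credential resolution has to move inside the per-task loop for batch mode: each task item may carry its own override, so resolving once before the loop would apply one model to every task. Everything here is illustrative. `CONFIG`, `resolve_credentials`, and `plan_batch` are hypothetical stand-ins, and the real `_resolve_delegation_credentials()` also handles credentials that this sketch leaves out.

```python
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for the delegation section of config.yaml.
CONFIG = {"model": "config-model", "provider": "config-provider"}


def resolve_credentials(
    override_model: Optional[str] = None,
    override_provider: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Per-call values win; config is the fallback."""
    return {
        "model": override_model if override_model is not None else CONFIG.get("model"),
        "provider": override_provider if override_provider is not None else CONFIG.get("provider"),
    }


def plan_batch(
    tasks: List[Dict[str, Any]],
    model: Optional[str] = None,
    provider: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Resolve inside the loop so each task can carry its own override."""
    plan = []
    for task in tasks:
        # Compare against None explicitly so an override is skipped only when absent.
        task_model = task.get("model")
        task_provider = task.get("provider")
        creds = resolve_credentials(
            override_model=task_model if task_model is not None else model,
            override_provider=task_provider if task_provider is not None else provider,
        )
        plan.append({"goal": task["goal"], **creds})
    return plan
```

With this shape, a batch such as `[{"goal": "a", "model": "glm-5"}, {"goal": "b"}]` resolves the first task to its own model and the second to the top-level or config value.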

run_agent.py (+4)

  • Fix: Pass model and provider through both hardcoded delegate_task dispatch paths
    • _execute_tool_call() — the single tool call dispatch (~line 4957)
    • _execute_tool_calls_concurrent() — the concurrent execution path (~line 5309)
  • These two call sites bypass the tool registry and call _delegate_task() directly.
    Without this fix, per-call model/provider were silently dropped, causing all
    subagents to fall back to the parent model regardless of override values.
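
The failure mode described above is easy to reproduce in miniature: a dispatcher that names its keyword arguments explicitly drops anything it does not name, with no error. The sketch below is illustrative only; `delegate_task_stub` and both dispatch functions are hypothetical and do not mirror the actual code in run_agent.py.

```python
from typing import Any, Dict, Optional


def delegate_task_stub(
    goal: str,
    model: Optional[str] = None,
    provider: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Hypothetical stand-in for the real tool; reports what it received."""
    return {"goal": goal, "model": model, "provider": provider}


def dispatch_before_fix(function_args: Dict[str, Any]) -> Dict[str, Optional[str]]:
    # Only the arguments named here reach the tool; the rest are dropped silently.
    return delegate_task_stub(goal=function_args["goal"])


def dispatch_after_fix(function_args: Dict[str, Any]) -> Dict[str, Optional[str]]:
    # Forward the optional overrides; .get() yields None when the caller omitted them.
    return delegate_task_stub(
        goal=function_args["goal"],
        model=function_args.get("model"),
        provider=function_args.get("provider"),
    )
```

Because the new parameters default to `None`, the "before" path raises nothing, which is why the drop went unnoticed until live testing.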

tests/tools/test_delegate.py (+249)

  • New TestPerCallModelProviderOverride class with 12 tests covering:
    • Schema validation (top-level + batch items)
    • Per-call model/provider pass-through to credential resolution
    • Batch per-task model override
    • Task-level overrides top-level
    • Config fallback when no override
    • None override does not clear config
    • Credential error with per-call override
    • Backward compatibility (no regressions)
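
To give a feel for the kind of cases listed above, here is a small self-contained test sketch. It is not taken from tests/tools/test_delegate.py: the `resolve` helper and the test class are hypothetical, and they exercise only the "override wins" and "None does not clear config" rules.

```python
import unittest
from typing import Optional


def resolve(override: Optional[str], config_value: Optional[str]) -> Optional[str]:
    """Hypothetical resolver: an explicit override wins, None falls back to config."""
    return override if override is not None else config_value


class OverrideResolutionSketch(unittest.TestCase):
    def test_override_beats_config(self):
        self.assertEqual(resolve("glm-5", "config-model"), "glm-5")

    def test_none_override_does_not_clear_config(self):
        self.assertEqual(resolve(None, "config-model"), "config-model")

    def test_no_override_and_no_config_inherits(self):
        # None here signals "inherit from the parent agent".
        self.assertIsNone(resolve(None, None))
```

The suite can be run with `python -m unittest` against the file that contains it.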

skills/autonomous-ai-agents/model-routing-template/ (new)

  • Strategic delegation framework skill — the "when and why" companion to the "how"
  • Includes providers-example.yaml with a worked multi-provider catalog
  • Provider-agnostic: users fill in their own models, costs, and roles

Test Results

61/61 tests pass (49 existing + 12 new, zero regressions)

Live verification: Tested against running Hermes instance with:

  • delegate_task(provider="zai", model="glm-5") — correctly routes to glm-5
  • delegate_task(provider="copilot", model="claude-sonnet-4.6") — correctly routes through GitHub Copilot
  • Batch mode with per-task overrides — both tasks run on correct model/provider in parallel

Backward Compatibility

Fully backward compatible. All new parameters are optional with None defaults. Existing behavior (config-based delegation, parent inheritance) is unchanged when no per-call overrides are provided.

ReqX added 3 commits March 26, 2026 09:28
Allow delegate_task() to accept model and provider parameters at call
time, overriding config.yaml delegation settings for each invocation.

Resolution priority: task-level > top-level > config > parent inherit.

Changes:
- Add model/provider to delegate_task() signature and schema
- Add model/provider to batch task items schema
- Move credential resolution into per-task loop for per-task overrides
- Extend _resolve_delegation_credentials() with override params
- Add 12 tests for per-call override behavior (61 total, all passing)
…patch paths

run_agent.py has two direct call sites for delegate_task that bypass
the tool registry. These were not updated in the original commit and
silently dropped per-call model/provider overrides, causing all
subagents to fall back to the parent model.

Fix: pass model=function_args.get('model') and
provider=function_args.get('provider') at both call sites:
- _execute_tool_call() (single dispatch, ~line 4957)
- _execute_tool_calls_concurrent() (concurrent dispatch, ~line 5309)

Verified: glm-5, copilot/claude-sonnet-4.6, and batch mode all
correctly route to the specified model/provider.
…work

Generic template for routing subagent tasks to the right model based on
cost, capability, and complexity. Includes:

- Decision framework: when to delegate vs handle yourself
- 4-tier escalation model: cheap → standard → expensive → frontier
- Cost-effective patterns: 'many for one', 'free stack', 'scout party'
- Engaged vs autonomous mode: present tradeoffs when user is present,
  follow ladder strictly when not
- providers-example.yaml: worked example catalog showing how to define
  models, costs, roles, user override phrases, and delegation patterns

Designed to be provider-agnostic — users fill in their own catalog.
Complements the per-call model/provider override feature in this PR
(NousResearch#3172) by providing the strategic 'when and why' to the technical
'how'.
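
One way to read the 4-tier escalation model is as an ordered ladder that is climbed only when a lower tier fails. The sketch below assumes exactly that reading, which the commit message does not spell out. The tier names come from the message; the model names and the `escalate` helper are placeholders.

```python
from typing import Callable, List, Optional, Tuple

# Tier names come from the commit message; the model names are placeholders.
LADDER: List[Tuple[str, str]] = [
    ("cheap", "example/cheap-model"),
    ("standard", "example/standard-model"),
    ("expensive", "example/expensive-model"),
    ("frontier", "example/frontier-model"),
]


def escalate(attempt: Callable[[str], Optional[str]]) -> Tuple[str, Optional[str]]:
    """Try each tier in order and stop at the first one that returns a result."""
    for tier, model in LADDER:
        result = attempt(model)
        if result is not None:
            return tier, result
    return "unresolved", None
```

Here `attempt` would wrap a delegated call and return `None` when the result is judged insufficient; how that judgment is made is left to the skill.
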
@1shanra1

I was going to make a PR for this exact feature! Nice to see someone already did it :) This should get merged.

@ReqX

ReqX commented Mar 27, 2026

Contributor Author

Happy someone else would find it useful. It works quite well for the current limited scope.

I was thinking of adding predefined personas to it (something like OMO-slim), but kept the scope small for an easier merge at this point; you can prompt for that anyway or add more instructions to the skill.

@SHL0MS

SHL0MS commented Mar 31, 2026

Collaborator

Another user request for this: "I'd like to change models of subagent by prompting alone — 'Launch subagent to do XYZ with model ABC'." The delegate_task schema (line 713-774) currently has no model parameter. The user can configure a static override in config.yaml under delegation but cannot specify the model per-call in the prompt. See also #3719, #3794, #3984.

@HenkDz

HenkDz commented Apr 1, 2026

Contributor

Another user request for this: "I'd like to change models of subagent by prompting alone — 'Launch subagent to do XYZ with model ABC'." The delegate_task schema (line 713-774) currently has no model parameter. The user can configure a static override in config.yaml under delegation but cannot specify the model per-call in the prompt. See also #3719, #3794, #3984.

#3719 is closed in favor of #3794.
The solution in the latter is better and way simpler:

image

@ReqX

ReqX commented Apr 1, 2026

Contributor Author

I honestly like #3794, but is it exclusive? The strength path is powerful, but giving us the option to invoke a specific model is surely valuable for power users?

I can rebase/check how it could fit in if there is interest.

@HenkDz

HenkDz commented Apr 1, 2026

Contributor

I honestly like #3794, but is it exclusive?

What do you mean by "Exclusive"?

giving us the opp to invoke a specific model is sure valuable for power users?

Most certainly. In my case at least, I would like the coding agent to be super powerful and different from the design one. The default chat one would be different from both, and so on.

@ReqX

ReqX commented Apr 2, 2026

Contributor Author

Overlooked the "Per-call model/provider override" in your PR ;)

That combined with the "strength-routing" is great and makes my PR most likely obsolete (didn't look at the code, on the road).

@iRonin

iRonin commented Apr 3, 2026

Contributor

Great work on this — I opened #4833 which builds on the same model/provider approach and adds two more layers:

  • Skill-level model routing: delegate_task(skill='code-review') reads model: from the skill's SKILL.md frontmatter automatically
  • supervisor_model / execution_model config aliases for clear multi-agent intent in config.yaml

Happy to consolidate if maintainers prefer a single PR.
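
As a rough illustration of skill-level routing, the sketch below pulls `model` and `provider` out of a SKILL.md frontmatter block using only the standard library. It is an assumption-laden toy: `read_frontmatter` and `model_for_skill` are hypothetical names, the parser handles only flat `key: value` lines, and a real implementation would more likely use a YAML library.

```python
from typing import Dict, Optional


def read_frontmatter(text: str) -> Dict[str, str]:
    """Parse flat `key: value` pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return {}  # no closing fence: treat as no frontmatter


def model_for_skill(skill_text: str, explicit_model: Optional[str] = None) -> Optional[str]:
    """An explicit model argument overrides the skill's frontmatter value."""
    if explicit_model is not None:
        return explicit_model
    return read_frontmatter(skill_text).get("model")


SKILL_MD = """---
name: code-review
model: anthropic/claude-opus-4-6
provider: anthropic
---
Review the diff carefully.
"""
```

With the sample above, `model_for_skill(SKILL_MD)` picks the frontmatter model, while passing an explicit model bypasses it.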

iRonin added a commit to iRonin/hermes-agent-nous that referenced this pull request Apr 5, 2026
Three interconnected features for multi-agent model routing:

1. delegate_task() — model/provider params (like PR NousResearch#3172)
   - model, provider: per-call overrides for all subagents in the call
   - Per-task model/provider in batch tasks array
   - Priority: per-task > call-level > delegation config > parent inherit

2. delegate_task() — skill/skills params (new)
   - skill: load a single named skill into the subagent's context
   - skills: load multiple skills
   - Skill SKILL.md content is prepended to the subagent system prompt
   - If SKILL.md frontmatter contains model: or provider: fields, those
     are used as the model/provider for that subagent call (overridable
     by the explicit model param)

3. config.yaml — supervisor_model / execution_model aliases
   - delegation.supervisor_model → model.default (main agent)
   - delegation.execution_model  → delegation.model (all subagents)
   - Resolved during config load; explicit model.default / delegation.model
     always take precedence if both are set
   - Allows clear intent in config:
       delegation:
         supervisor_model: anthropic/claude-opus-4-6
         execution_model: google/gemini-flash-1.5

   Skill frontmatter example:
       ---
       name: code-review
       model: anthropic/claude-opus-4-6
       provider: anthropic
       ---

   Usage example (agent tool call):
       delegate_task(goal='review this PR', skill='code-review')
       delegate_task(goal='summarise docs', model='google/gemini-flash-1.5')
       delegate_task(tasks=[
         {'goal': 'fast research', 'model': 'google/gemini-flash-1.5'},
         {'goal': 'careful code', 'model': 'anthropic/claude-opus-4-6'},
       ])

Builds on / references PR NousResearch#3172 (ReqX) for the model/provider param idea.
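
The alias rule in item 3 of this commit message (aliases fill the canonical keys, but explicit values win when both are set) could look roughly like the following at config-load time. This is a sketch under assumptions: `apply_model_aliases` is a hypothetical name, and the config is assumed to be a nested dict in which `model` is a mapping with a `default` key.

```python
import copy
from typing import Any, Dict


def apply_model_aliases(config: Dict[str, Any]) -> Dict[str, Any]:
    """Copy alias values into their canonical keys unless those are already set."""
    resolved = copy.deepcopy(config)  # leave the caller's dict untouched
    delegation = resolved.setdefault("delegation", {})
    model = resolved.setdefault("model", {})

    supervisor = delegation.get("supervisor_model")
    if supervisor and not model.get("default"):
        model["default"] = supervisor  # delegation.supervisor_model -> model.default

    execution = delegation.get("execution_model")
    if execution and not delegation.get("model"):
        delegation["model"] = execution  # delegation.execution_model -> delegation.model

    return resolved
```

Resolving aliases once during load keeps the rest of the code reading only `model.default` and `delegation.model`.
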
iRonin added a commit to iRonin/hermes-agent-nous that referenced this pull request Apr 9, 2026
iRonin added a commit to iRonin/hermes-agent-nous that referenced this pull request Apr 11, 2026
iRonin added a commit to iRonin/hermes-agent-nous that referenced this pull request Apr 12, 2026
@alt-glitch

Collaborator

Related/competing: #12794 (same feature, rebased onto dispatch refactor). Also see closed #3750 and open issue #5012.

@alt-glitch

Collaborator

Competing PR #12794.

@ReqX

ReqX commented May 24, 2026

Contributor Author

PR #3172 vs #12794

1. Core Feature Parity

Functionally identical resolution chain: per-task > top-level > config > parent inherit. The mechanism is the same: extend _resolve_delegation_credentials() with optional override params and thread them through the schema/signature. No meaningful behavioral difference in the core override itself.

2. Merge-Readiness

Very obvious, right ;)

4. Test Coverage

| Dimension | #3172 | #12794 |
|---|---|---|
| Core override tests | 12 | ~12 equivalent |
| Plugin lifecycle | 0 | 41 |
| Edge cases (auto-router, mismatch) | 0 | Yes |
| E2E verification | No | Yes (live results cited) |

#12794 is substantially better tested, especially for the failure modes that matter in production (router resolution mismatches, TTL stash bleed).

Summary

| | #3172 | #12794 |
|---|---|---|
| Core feature | | |
| Rebased on current main | | |
| Maintainer triage labels | No | P2 + comp tags |
| Observability/verification | No | Yes |
| Mismatch guards | No | Yes |
| Test depth | Adequate | Comprehensive |
| Active maintenance | Stale (March) | Active (May 22) |

Bottom line: #12794 is the better PR on every axis except scope minimalism; the choice is up to the maintainers.

Closing here.

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P3: Low (cosmetic, nice to have)
  • tool/delegate: Subagent delegation
  • type/feature: New feature or request

6 participants