feat(delegate): add per-task model/provider routing #36790

Open
jmche wants to merge 1 commit into NousResearch:main from jmche:feat/delegate-task-model-provider-routing

Conversation

@jmche

@jmche jmche commented Jun 1, 2026

Summary

Adds per-call and per-task model/provider routing to delegate_task.

This lets delegated subagents run on different models in the same batch, while preserving the existing fallback behavior when no override is provided.

Fixes #6306
Fixes #10995
Fixes #5012

Related to #7586, #6771, #12715, #20000, #23266, #34773, and #35033. Also addresses the delegate_task portion discussed in #18880.

Builds on behavior already fixed by merged PRs #17587 and #19623.

Design

Model/provider precedence is:

  1. Per-task tasks[].model / tasks[].provider
  2. Top-level model / provider
  3. delegation.model / delegation.provider in config
  4. Parent agent model/provider
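
The precedence order above can be sketched as a first-non-empty lookup across the four levels. This is a minimal illustration, not the PR's code: `first_set` and `resolve_routing` are hypothetical names, and resolving `model` and `provider` independently of each other is an assumption made for the sketch.

```python
from typing import Optional


# Hypothetical helper; the name and signature are illustrative, not from the PR.
def first_set(*candidates: Optional[str]) -> Optional[str]:
    """Return the first candidate that is a non-empty string, else None."""
    for value in candidates:
        if value:
            return value
    return None


def resolve_routing(task: dict, top_level: dict, config: dict, parent: dict) -> dict:
    """Apply the precedence: per-task > top-level > config > parent."""
    # Assumption: each field is resolved on its own, so a task-level model
    # can be combined with a provider that comes from a lower level.
    return {
        field: first_set(
            task.get(field),
            top_level.get(field),
            config.get(field),
            parent.get(field),
        )
        for field in ("model", "provider")
    }


routing = resolve_routing(
    task={"model": "haiku"},
    top_level={"provider": "anthropic"},
    config={"model": "sonnet"},
    parent={"model": "parent-model", "provider": "parent-provider"},
)
print(routing)  # {'model': 'haiku', 'provider': 'anthropic'}
```

With no override at any of the first three levels, the sketch falls through to the parent values, which mirrors the no-override fallback described in the summary.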

The override resolver reuses the existing /model switch pipeline, so aliases, provider catalogs, custom providers, and --provider syntax stay consistent.

Both forms are supported:

delegate_task(goal="Review logs", model="haiku")

delegate_task(
    goal="Review architecture",
    model="sonnet",
    provider="anthropic",
)

delegate_task(
    goal="Quick OpenRouter check",
    model="llama-3.3-70b:free --provider openrouter",
)

For JSON tool calls, the structured provider field is preferred. If a structured provider conflicts with an inline --provider in model, delegate_task returns a tool error instead of silently choosing one.
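
The conflict rule above could be checked along these lines. This is only a sketch: `split_inline_provider` and `check_provider_conflict` are hypothetical helpers, the real resolver goes through the /model switch pipeline, and a `ValueError` stands in here for the tool error that delegate_task returns.

```python
import shlex
from typing import Optional, Tuple


def split_inline_provider(model: str) -> Tuple[str, Optional[str]]:
    """Split 'name --provider X' into (name, X); provider is None if absent."""
    tokens = shlex.split(model)
    provider = None
    rest = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "--provider" and i + 1 < len(tokens):
            provider = tokens[i + 1]
            i += 2  # skip the flag and its value
        else:
            rest.append(tokens[i])
            i += 1
    return " ".join(rest), provider


def check_provider_conflict(model: str, provider: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (model name, effective provider), refusing to pick a side on conflict."""
    name, inline = split_inline_provider(model)
    if provider and inline and provider != inline:
        # Stand-in for the tool error described in the PR text.
        raise ValueError(
            f"provider conflict: structured {provider!r} vs inline {inline!r}"
        )
    return name, provider or inline


print(check_provider_conflict("llama-3.3-70b:free --provider openrouter", None))
```

Raising instead of choosing one value keeps the failure visible to the caller, which is the behavior the description asks for.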

Relation to Other PRs

There are several overlapping PRs in this feature cluster. This PR intentionally includes both top-level and per-task routing because #6306, #10995, and #5012 request single-call shorthand as well as mixed-model batch delegation.

Compared with the narrower per-task-only implementations, this PR also routes explicit overrides through the existing /model switch pipeline. That keeps alias handling, provider catalog lookup, custom providers, and inline --provider syntax consistent with the rest of Hermes instead of adding a parallel resolver.

This PR does not expose per-task base_url or api_key; provider credentials still come from configured Hermes providers.

Changes

  • Adds optional top-level model and provider parameters to delegate_task.
  • Adds optional per-task model and provider fields in tasks[].
  • Resolves explicit per-task overrides before child construction.
  • Fixes the same-provider model-only case, where a task-level model override must still take effect even when the provider does not change.
  • Forwards model and provider through the registry handler.
  • Documents the supported syntax and precedence.
  • Adds unit coverage for schema shape, resolver behavior, precedence, registry forwarding, error handling, and no-override fallback.
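
For orientation, the schema shape implied by the first two bullets might look roughly like the following. The dictionary is a hypothetical sketch in JSON Schema style; the actual schema in tools/delegate_tool.py may use different descriptions, extra fields, or a different layout.

```python
# Hypothetical parameter schema; not copied from tools/delegate_tool.py.
ROUTING_FIELDS = {
    "model": {"type": "string"},
    "provider": {"type": "string"},
}

DELEGATE_TASK_PARAMETERS = {
    "type": "object",
    "properties": {
        "goal": {"type": "string"},
        # Optional top-level routing fields (single-call shorthand).
        **ROUTING_FIELDS,
        "tasks": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "goal": {"type": "string"},
                    # Optional per-task routing fields (mixed-model batches).
                    **ROUTING_FIELDS,
                },
                "required": ["goal"],
            },
        },
    },
}
```

Neither routing field appears in a `required` list, so existing calls without overrides keep validating unchanged.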

Verification

Automated:

py_compile tools/delegate_tool.py: passed
ruff check tools/delegate_tool.py tests/tools/test_delegate.py --no-cache: passed
pytest -q tests/tools/test_delegate.py: 146 passed, 1 warning
pytest -q tests/tools/test_delegate*.py: 163 passed, 1 warning
git diff --check: passed

Manual Hermes verification with real models:

Parent model:

  • opencode-go/deepseek-v4-pro

Parallel child models:

  • gemini/gemini-3.5-flash -> CHILD_OK_GEMINI
  • opencode-go/mimo-v2.5-pro -> CHILD_OK_MIMO
  • opencode-go/qwen3.6-plus -> CHILD_OK_QWEN

Result: 3/3 completed, total wall time about 6.36s.

@alt-glitch added the type/feature (New feature or request), tool/delegate (Subagent delegation), and P3 (Low: cosmetic, nice to have) labels Jun 1, 2026

@mxnstrexgl mxnstrexgl left a comment

🤖 Automated PR Review

Security Scan

  • ✓ No hardcoded secrets, injection sinks, unsafe deserialization, or dependency red flags found by this automated scan.

Code Quality

  • ✓ No blocking code-quality issues found by this automated scan.

Summary

Status: APPROVE — security findings: 0, quality suggestions: 0.

Automated review; raw diff content intentionally omitted.

@gskeyzer-web

✅ Rebase verified: ready to merge

I cherry-picked this PR several days ago and have been running it in production on my fork. It works perfectly.

Today I verified the rebase onto current main (commit b1b0f4b66):

  • Rebase: no conflicts (0 conflicts across 3 modified files)
  • py_compile: OK
  • Stats: +565 lines, -12 lines across 3 files
    • tools/delegate_tool.py: handler + resolver
    • tests/tools/test_delegate.py: unit tests
    • website/docs/user-guide/features/delegation.md: documentation

The rebased branch is available here: pr36790-rebased on my fork (commit 04eb63e30)

The PR addresses a real need (see issues #5012, #6306, and #10995, all marked as fixed by this PR). The code is approved by @mxnstrexgl, the tests pass, and the rebase is clean.

What is still blocking the merge?

@impiyush

impiyush commented Jun 5, 2026

Great work on this — the approach of routing through the /model switch pipeline is architecturally the right call. It means aliases, custom providers, and --provider inline syntax all work without parallel resolution logic. That's cleaner than both the synthetic config-dict approach in #34773 and a custom resolver I was sketching out.

I run a production Hermes instance that uses delegate_task daily (deepseek-v4-flash as the daily driver, pro-tier models on explicit request). I'd be happy to pull your branch and test it against real workloads — Matrix-based single-task delegation, mixed-model batch delegation, and the model-only-on-same-provider case.

A few questions as I review:

  1. The single goal path injects top-level model/provider into the task dict — does that mean top-level overrides flow through the same _resolve_model_provider_override() codepath as per-task?
  2. Have you tested the "provider only" case where the provider is resolved but uses its own default model (not an explicit model override)?
  3. Anything specific you'd like help with to get this merged?

Happy to dig into the branch and report back.

@jmche

jmche commented Jun 9, 2026

Author

@impiyush Thanks for the thoughtful review and for offering to test this against real workloads.

  1. Yes. The single-goal path normalizes the top-level model / provider into the task payload, so it goes through the same _resolve_model_provider_override() path as per-task overrides. That keeps the single-task shorthand and mixed batch delegation behavior consistent.

  2. Good question. The intended behavior for provider-only is that the provider is resolved through the existing model/provider switch pipeline, and the provider’s default model is used when no explicit model override is supplied. I’ll double-check whether there is already a focused unit test for that exact case; if not, I agree it is worth adding one.

  3. The most useful help would be real-world smoke testing on:

  • Matrix-based single-task delegation
  • mixed-model batch delegation
  • model-only override on the same provider
  • provider-only override using the provider default model
  • conflict handling when structured provider disagrees with inline --provider

If you hit anything unexpected, a minimal delegate_task call plus the relevant provider config shape would be very helpful.

Also thanks @gskeyzer-web for rebasing and validating this on current main. From my side the branch is still available, and I’m happy to refresh/rebase it if maintainers prefer the original PR branch to be updated directly.
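
The single-goal normalization described in answer 1 can be pictured as follows. This is a sketch under assumptions: `normalize_tasks` is a hypothetical name, and the real handler may merge fields differently before calling `_resolve_model_provider_override()`.

```python
from typing import Optional


def normalize_tasks(
    goal: Optional[str] = None,
    tasks: Optional[list] = None,
    model: Optional[str] = None,
    provider: Optional[str] = None,
) -> list:
    """Fold the single-goal shorthand and top-level routing into a task list."""
    if tasks is None:
        if not goal:
            raise ValueError("either goal or tasks is required")
        tasks = [{"goal": goal}]

    normalized = []
    for task in tasks:
        merged = dict(task)  # copy so the caller's dict is not mutated
        # Per-task values win; top-level values only fill the gaps.
        if model and not merged.get("model"):
            merged["model"] = model
        if provider and not merged.get("provider"):
            merged["provider"] = provider
        normalized.append(merged)
    return normalized


print(normalize_tasks(goal="Review logs", model="haiku"))
# [{'goal': 'Review logs', 'model': 'haiku'}]
```

After this step every task carries its own routing fields, so one resolver path can serve both the shorthand and the batch form.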

agt-user pushed a commit to agt-user/hermes-agent that referenced this pull request Jun 9, 2026
Allow individual tasks in the  array to specify their own
 and/or , overriding the global delegation config
for that task only.

When a per-task override is present,
is called with a task-scoped config so that base_url, api_key, and
api_mode are derived correctly from the per-task provider — not the
global delegation config. Tasks without overrides fall back to the
pre-resolved  dict as before (no regression for existing usage).

Changes:
- : add  and  fields to the
  per-task object inside  array
-  loop: resolve per-task credentials when override
  is present; otherwise reuse global  (zero overhead)

Closes NousResearch#35437
Related: NousResearch#34489, NousResearch#31537, NousResearch#36790, NousResearch#30388, NousResearch#37966
@impiyush

impiyush commented Jun 9, 2026

@jmche Thanks for working on this. I smoke-tested the patch on my self-hosted Hermes instance (v0.16.0). Assuming I patched it correctly, the batch path resolves model overrides (although the calls then fail with a 401 auth error), but the single-goal path does not apply model overrides.

Environment:

  • Hermes v0.16.0 (commit 3c231eb3)
  • Applied via curl -L https://github.com/NousResearch/hermes-agent/pull/36790.diff | git apply onto current main (clean, no conflicts)
  • Provider: ollama-cloud (default model: deepseek-v4-flash)
  • Platform: Matrix gateway
  • Docker Sandbox ON

Test Results:

| # | Test | Call | Model resolved | Auth | Verdict |
|---|---|---|---|---|---|
| 1 | Single-goal, model only | delegate_task(goal="...", model="kimi-k2.6") | deepseek-v4-flash | N/A (default) | Override not applied |
| 2 | Single-goal, model + provider | delegate_task(goal="...", model="deepseek-v4-pro", provider="ollama-cloud") | deepseek-v4-flash | N/A (default) | Override not applied |
| 3 | Batch, per-task model | delegate_task(tasks=[{"goal": "...", "model": "kimi-k2.6"}]) | kimi-k2.6 | 401 (credential scope) | Model resolved correctly |
| 4 | Batch, per-task model | delegate_task(tasks=[{"goal": "...", "model": "deepseek-v4-pro"}]) | deepseek-v4-pro | 401 (credential scope) | Model resolved correctly |

The pattern is consistent across three different test models (deepseek-v4-pro, gemma4:31b, kimi-k2.6). Every single-goal call falls through to the default model, while the batch path resolves overrides correctly every time.

This suggests the normalization at the single-goal entry point (where goal, model, provider, ... is packaged into a single-element tasks list) correctly includes the model field in the task dict — but somewhere between that normalization and the spawned subagent, the override is dropped or overwritten.

Also worth noting: the batch path had auth failures (401) for all models including deepseek-v4-flash, even though the same model works fine in standalone delegate_task() calls. That could be a separate credential-resolution issue in the batch path.

Let me know how I can help you with this, or if I am doing something incorrectly on my side.

@gskeyzer-web

🔄 Rebase + extension verified in production

I can confirm @impiyush's observations. The top-level model/provider bug is on the LLM side (the LLM omits the top-level parameters from the JSON tool call; it is not a bug in the handler). The tasks=[{...}] syntax works perfectly. This is documented in the pitfall section of the subagent-templates skill.

Real-world tests (production, 5+ days)

| Test | Model | Result |
|---|---|---|
| Simple task | deepseek-v4-flash | |
| Batch A | deepseek-v4-flash | |
| Batch B | deepseek-v4-pro | |
| Per-task reasoning | reasoning_effort: none | |

Extension: per-task reasoning_effort

I extended the PR with an optional reasoning_effort parameter (top-level + per-task):

delegate_task(tasks=[{
    "goal": "...",
    "model": "deepseek-v4-flash",
    "reasoning_effort": "none"
}, {
    "goal": "...",
    "model": "deepseek-v4-pro",
    "reasoning_effort": "high"
}])

Supported values: none, minimal, low, medium, high, xhigh.
Propagation: delegate_task -> _build_child_agent(override_reasoning_effort=) -> AIAgent(reasoning_config=), unit-tested ✅
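
A validation step for the values listed above might look like this. It is a hedged sketch: `validate_reasoning_effort` is a hypothetical helper written for illustration and is not the `parse_reasoning_effort` function mentioned in the extension commit.

```python
from typing import Optional

# Values taken from the comment above.
SUPPORTED_EFFORTS = ("none", "minimal", "low", "medium", "high", "xhigh")


def validate_reasoning_effort(value: Optional[str]) -> Optional[str]:
    """Return a normalized effort level, or None when no override is given."""
    if value is None:
        return None  # no override: leave config/parent inheritance in place
    normalized = str(value).strip().lower()
    if normalized not in SUPPORTED_EFFORTS:
        raise ValueError(
            f"unsupported reasoning_effort {value!r}; "
            f"expected one of {', '.join(SUPPORTED_EFFORTS)}"
        )
    return normalized


print(validate_reasoning_effort(" High "))  # high
```

Returning None for a missing value keeps "not set" distinct from the explicit level "none", which matters because the two select different behavior.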

Kanban integration

kanban_create now accepts --model and --reasoning-effort (CLI + tool). The dispatcher passes -m model and injects HERMES_REASONING_EFFORT into the worker's environment.

Help with the merge

The PR rebases cleanly onto current main (commit a5d05cf; about 300 upstream commits, 0 conflicts). One reviewer is still missing (1/2) and the P3 label is blocking. What can I do to help unblock it? Rebase the PR? Find a second reviewer?

@impiyush

impiyush commented Jun 10, 2026

Thank you @gskeyzer-web for confirming my results and pointing me in the right direction. Here are my updated smoke test results for this PR.

Retested after correcting my call pattern. The handler code is correct — the earlier "single-goal override not applied" was a red herring on my end.

Environment (same as before):

Results:

| Test | Call | Model resolved | Verdict |
|---|---|---|---|
| Single-task via tasks=[{...}] | model: "nemotron-3-super" (≠ default flash) | nemotron-3-super | Override applies correctly |
| Mixed-model batch | Task 1: deepseek-v4-flash, Task 2: nemotron-3-super | Both resolved ✅ | Override works per-task |
| Model + structured provider | model: "nemotron-3-super", provider: "ollama-cloud" | nemotron-3-super | Override applies correctly |

Root cause of my earlier false report: The calling LLM drops optional top-level params like model/provider from the JSON tool call when they're at the top level of delegate_task(). The batch shape delegate_task(tasks=[{...}]) works because the same model reliably includes model inside dict elements. Not a code bug — a serialization behavior difference.

Overall: the code is good and does the job. tasks=[{...}] works reliably for single-task and mixed-batch delegation. The other lesson is to always set three properties explicitly in delegate_task: goal, model, and provider. I ran into an issue when the provider was not specified for the Kimi-K2.6 model: Hermes defaulted to the Kimi-Coding or Moonshot AI API keys, which were not set.

This issue has blocked me from running subagents correctly with different models, and I am glad this PR will fix it. Thanks again @jmche & @gskeyzer-web. LGTM!!

@impiyush impiyush left a comment

Refer to my latest comment on this PR's conversation.

#36790 (comment)

gskeyzer-web pushed a commit to gskeyzer-web/hermes-agent that referenced this pull request Jun 10, 2026
…ousResearch#36790)

Cherry-pick of commit 7d9432d from jmche:
- Adds model/provider parameters to delegate_task()
- Adds model/provider fields in tasks[]
- Resolution by priority: tasks[] > top-level > config
- Unit tests pass

Source: NousResearch#36790
gskeyzer-web pushed a commit to gskeyzer-web/hermes-agent that referenced this pull request Jun 10, 2026
Add reasoning_effort parameter to delegate_task() — top-level and
per-task in tasks[] array. Supported values: none, minimal, low,
medium, high, xhigh (matching parse_reasoning_effort).

Bypasses delegation.reasoning_effort config and parent inheritance
when explicitly set.

Extends PR NousResearch#36790 cherry-pick (6d6988e75 + 9583b8079).
@jmche

jmche commented Jun 11, 2026

Author

Thanks @impiyush and @gskeyzer-web for the production validation and for narrowing down the call-shape issue.

I agree with the practical conclusion: for JSON tool calls, the most reliable shape is to put routing fields inside tasks[], even for a single delegated task:

delegate_task(tasks=[{
    "goal": "...",
    "provider": "...",
    "model": "..."
}])

To keep this shape from becoming too loose, the docs should recommend structured fields: provider as the configured provider id, and model as the provider/model-switch-resolved model name. Inline --provider can remain supported for compatibility, but JSON tool calls should prefer the structured provider field.

The handler supports top-level model / provider, but if calling models sometimes omit optional top-level fields, the docs should steer users toward the explicit tasks=[{...}] form for model/provider routing. I’m happy to add a short docs note for that if maintainers want it.
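
One way to enforce the explicit tasks=[{...}] shape on the calling side is a small payload builder that rejects tasks with missing routing fields. The helper below is a hypothetical sketch and not part of the PR; requiring all three fields is the practical rule that came out of the smoke testing in this thread, not something the handler itself demands.

```python
REQUIRED_FIELDS = ("goal", "provider", "model")


def build_delegate_payload(tasks: list) -> dict:
    """Build a delegate_task payload with routing fields inside tasks[]."""
    payload = []
    for index, task in enumerate(tasks):
        missing = [key for key in REQUIRED_FIELDS if not task.get(key)]
        if missing:
            raise ValueError(f"task {index} is missing: {', '.join(missing)}")
        payload.append(dict(task))  # copy; extra fields are passed through
    return {"tasks": payload}


payload = build_delegate_payload(
    [{"goal": "Review logs", "provider": "ollama-cloud", "model": "kimi-k2.6"}]
)
print(payload)
```

Failing early on a missing provider avoids the case reported above, where an unspecified provider led to a lookup of API keys that were not configured.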

On reasoning_effort: I agree this is an important adjacent problem, especially for cross-provider delegation. A child that overrides provider / model should not blindly inherit provider-specific reasoning request shape from the parent, since that can create incompatibilities across transports. I think explicit per-task reasoning_effort is useful, but I’d prefer to keep it as a follow-up PR so this one can stay focused on model/provider routing and land cleanly.

Given the additional smoke testing and approval, I think this PR is ready from the implementation side. Maintainers: would you like me to rebase/update this branch against current main, or is the current branch sufficient once the remaining review/label requirements are cleared?

Labels

P3 (Low: cosmetic, nice to have), tool/delegate (Subagent delegation), type/feature (New feature or request)

Projects

None yet

5 participants