fix(delegation): accept per-call model override on delegate_task #23769

Open
Tranquil-Flow wants to merge 2 commits into NousResearch:main from Tranquil-Flow:fix/23467-delegate-task-model-arg

@Tranquil-Flow Tranquil-Flow commented May 11, 2026

Contributor

What does this PR do?

Fixes a silent-drop bug in subagent delegation: the delegate_task tool accepted no model kwarg and the schema did not expose a model field, so a model emitting delegate_task(model="claude-sonnet-4.6", goal="...") had its override silently discarded at the dispatch boundary — the child inherited the parent's model regardless.

delegation.model from config.yaml was already plumbed through _resolve_delegation_credentials (the reporter's "silently ineffective" claim for that path was stale); the gap was only at the per-call surface.

Adds a top-level model parameter plus a per-task model field with the precedence:

per-task model > top-level model > delegation.model (from config)

None at any layer falls through to the next, so existing config-only deployments are unaffected.
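The precedence rule can be sketched roughly as below. This is an illustration only: `pick_task_model` is a hypothetical helper name, not a function in `tools/delegate_tool.py`, and the real code may express the same fall-through inline.

```python
from typing import Optional


def pick_task_model(
    task: dict,
    top_level_model: Optional[str],
    config_model: Optional[str],
) -> Optional[str]:
    """Return the first non-None model: per-task, then top-level, then config.

    Hypothetical helper for illustration; the names are assumptions.
    """
    for candidate in (task.get("model"), top_level_model, config_model):
        if candidate is not None:
            return candidate
    # Nothing set at any layer: the caller falls back to the parent's model.
    return None
```

With no per-call value at either layer the function returns the config value unchanged, which is why config-only deployments see no difference.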

For providers whose transport selection depends on target_model (Azure Foundry chat-completions-vs-responses by model lineage, openai-codex codex_responses-vs-chat_completions per slug, OpenCode routes), the task loop re-resolves credentials per task when task_model differs from the model that produced the outer creds. The common single-model path stays single-call.
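A minimal sketch of the re-resolution idea, under stated assumptions: `creds_for_task` and `fake_resolve` are hypothetical stand-ins, and the "contains `codex`" rule only imitates a provider whose transport depends on the target model. It is not the resolver the project ships.

```python
from typing import Callable, Optional

Creds = dict


def creds_for_task(
    task_model: Optional[str],
    outer_creds: Creds,
    resolve: Callable[[Optional[str]], Creds],
) -> Creds:
    """Reuse the outer creds unless this task targets a different model."""
    if task_model is None or task_model == outer_creds["model"]:
        return outer_creds
    return resolve(task_model)


calls = []  # records every resolver invocation


def fake_resolve(override_model: Optional[str]) -> Creds:
    """Stand-in resolver: picks api_mode from the model name (illustrative rule)."""
    calls.append(override_model)
    model = override_model or "gpt-5"  # "gpt-5" plays the role of delegation.model
    api_mode = "responses" if "codex" in model else "chat_completions"
    return {"model": model, "api_mode": api_mode}


outer = fake_resolve(None)                                   # one resolution before the loop
same = creds_for_task("gpt-5", outer, fake_resolve)          # reuses the outer creds
other = creds_for_task("gpt-5-codex", outer, fake_resolve)   # triggers a second resolution
```

In this sketch the resolver runs only twice: once before the loop and once for the task whose model differs, so a batch that uses a single model never pays for extra resolutions.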

Related Issue

Closes #23467

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

tools/delegate_tool.py:

  • delegate_task() signature accepts model: Optional[str] = None
  • Single-task path embeds model in the synthesized task dict so per-task precedence applies whether the caller used goal= or tasks=[...]
  • Task loop computes task_model with per-task > top-level > delegation.model precedence and re-resolves credentials when task_model != creds["model"]
  • _resolve_delegation_credentials gains an override_model parameter that wins over cfg["model"] for target_model computation
  • Top-level schema advertises model; per-task schema advertises model
  • Registry handler lambda forwards args.get("model")

_build_child_agent already accepted a model parameter and resolved it via model or parent_agent.model, so no changes needed there.
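To illustrate why the handler change matters, here is a small sketch of the failure mode. Everything in it is a simplified assumption: this `delegate_task` is a stub that echoes its arguments, and the two lambdas only imitate the shape of a registry handler.

```python
from typing import Any, Dict, Optional


def delegate_task(goal: str, model: Optional[str] = None) -> Dict[str, Any]:
    """Stub for illustration: reports what it received instead of spawning a child."""
    return {"goal": goal, "model": model}


# Handler that predates the change: `model` never leaves `args`.
dropping_handler = lambda args: delegate_task(goal=args["goal"])

# Handler that forwards the new field; a missing key becomes None.
forwarding_handler = lambda args: delegate_task(
    goal=args["goal"], model=args.get("model")
)

call_args = {"goal": "summarize the repo", "model": "claude-sonnet-4.6"}
dropped = dropping_handler(call_args)      # override is lost without any error
forwarded = forwarding_handler(call_args)  # override reaches the tool function
```

Because the dropping handler raises nothing, advertising `model` in the schema alone would not surface the bug, which is the reason for a test that dispatches through the registry.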

How to Test

scripts/run_tests.sh tests/tools/test_delegate.py -q

134/134 pass, including 7 new tests under TestDelegateModelArg:

  • Schema audits for the top-level and per-task model fields (one test each)
  • Registry dispatch forwards args.get("model") (catches schema-without-handler drift, the original failure mode)
  • Top-level model= reaches _build_child_agent for every task
  • Per-task model wins over top-level for that task only
  • Per-call model drives runtime resolution (Azure-Foundry-style transport selection by target_model)
  • delegation.model config path still flows through when no per-call value is given (regression guard)

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 15.6

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — done (DELEGATE_TASK_SCHEMA)

Screenshots / Logs

scripts/run_tests.sh tests/tools/test_delegate.py -q
134 passed

@alt-glitch added labels on May 11, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), tool/delegate (Subagent delegation), duplicate (This issue or pull request already exists)
@alt-glitch

Collaborator

Duplicate of #23649 — same per-call model override fix for delegate_task, filed earlier today. Part of a saturated feature cluster (5+ competing PRs for this feature).

Soju06 pushed a commit to Soju06/hermes-agent that referenced this pull request May 22, 2026
Remove delegate-per-task-model from the active runtime patch manifest.
The live configuration does not route subagents with per-call model overrides,
and upstream PR NousResearch#23769 remains open with failing checks and duplicate overlap.

Origin: local-author
Upstream-PR: none
Patch-State: local-only

The `delegate_task` tool schema and Python signature did not accept a
`model` parameter, so a model emitting
`delegate_task(model="...", goal="...")` had the override silently
dropped at the dispatch boundary — the child inherited the parent's
model regardless. `delegation.model` from `config.yaml` was already
plumbed through via `_resolve_delegation_credentials` (creds["model"]);
the gap was only at the per-call surface.

Adds a top-level `model` parameter plus a per-task `model` field with
the precedence:

  per-task model > top-level model > delegation.model (from config)

`None` at any layer falls through to the next, so existing config-only
deployments are unaffected.

Six surgical changes:

  * `delegate_task()` signature accepts `model: Optional[str] = None`
  * Single-task path embeds `model` in the synthesized task dict so the
    same precedence rule applies whether the caller used `goal=` or
    `tasks=[...]`
  * `_build_child_agent` call site replaces the bare `creds["model"]`
    with the layered `task_model` resolution
  * Top-level schema advertises `model`
  * Per-task schema advertises `model`
  * Registry handler lambda forwards `args.get("model")`

`_build_child_agent` already accepted a `model` parameter and resolved
it via `model or parent_agent.model`, so no changes needed there.

Tests
-----

Six new tests under `TestDelegateModelArg`:

  * `test_schema_advertises_top_level_model` / `_per_task_model` —
    schema audits the new fields exist with `type: string`
  * `test_registry_handler_forwards_model_arg` — dispatching via the
    registry passes `model` through (catches schema-without-handler
    drift, the original failure mode)
  * `test_top_level_model_reaches_child_build` — top-level `model=`
    arrives at `_build_child_agent` (fails on current main)
  * `test_per_task_model_wins_over_top_level` — per-task override
    beats top-level for that task only (fails on current main)
  * `test_delegation_config_model_used_when_no_per_call_override` —
    regression guard: existing `delegation.model` config path still
    flows through when no per-call value is given

All 133 tests in `tests/tools/test_delegate.py` pass.

Round-1 reviewer caught: ``_resolve_delegation_credentials`` was called
once before the task loop using ``delegation.model`` from config, so
``api_mode`` / ``base_url`` were resolved against the configured default
model. For providers whose transport depends on ``target_model`` —
Azure Foundry (chat-completions vs responses by model lineage),
openai-codex (codex_responses vs chat_completions per slug), and
OpenCode — a call like ``delegate_task(model="gpt-5.3-codex")`` would
inherit the wrong transport and the child would run the requested model
over the wrong api_mode.

Adds an ``override_model`` parameter to ``_resolve_delegation_credentials``
that takes precedence over ``cfg["model"]`` when computing the
``target_model`` passed into ``resolve_runtime_provider``. The task loop
re-resolves credentials per task when ``task_model`` differs from the
model that produced the outer ``creds``; otherwise the outer ``creds``
are reused so the common single-model path stays single-call.

Regression test
---------------

``test_per_call_model_drives_runtime_resolution`` simulates an
Azure-Foundry-style resolver that flips ``api_mode`` based on whether
the target_model contains ``codex``. With ``delegation.model=gpt-5`` in
config and ``delegate_task(model="gpt-5-codex")`` at the call site, the
child must receive ``override_api_mode="responses"`` (resolved against
the per-call model), not ``chat_completions`` (resolved against the
config model). Test verified to fail on the round-1 code path
(``chat_completions != responses``) and pass on the round-2 fix.

All 134 tests in ``tests/tools/test_delegate.py`` pass.

Development

Successfully merging this pull request may close these issues.

delegate_task model parameter silently discarded — subagents always inherit parent model
