Skip to content

feat(delegate_task): per-call model/provider override (revives #3719)#25530

Open
thestark77 wants to merge 2 commits into
NousResearch:mainfrom
thestark77:feat/delegate-task-per-call-model-override
Open

feat(delegate_task): per-call model/provider override (revives #3719)#25530
thestark77 wants to merge 2 commits into
NousResearch:mainfrom
thestark77:feat/delegate-task-per-call-model-override

Conversation

@thestark77

Copy link
Copy Markdown

Summary

Adds optional model and provider parameters to delegate_task, both at the top level and per-task inside the tasks array. This unblocks model-routing plugins (per-phase models for SDD-style workflows) without forcing them to monkey-patch delegate_tool.py.

This is a fresh take on #3719 / #3794. #3794 bundled a delegation pool with per-call override; that PR was closed without merge. This PR keeps only the per-call override — the smaller, lower-risk half — so it can land and unblock downstream work. The pool can land as a follow-up.

Why

SDD-style workflows route different phases to different models for cost/quality balance: cheap models for scout/explore, mid-tier for apply, reasoning models for verify/design. delegate_task exposes no way to do that today — delegation.model/delegation.provider are global per-config.

The plumbing already exists: _build_child_agent accepts model / override_provider / override_base_url / override_api_key / override_api_mode natively. This PR wires the public schema → handler → _build_child_agent path.

Evidence

I maintain cobalt-agent, a Hermes plugin that does this routing via a source patch (apply_routing_patch.py injects _routed_* fields into task dicts before _build_child_agent is called). It has been running on two production VPS instances for ~4 weeks. 12 measurement runs, last 6 at 95% routing accuracy (logs in cobalt-agent CHANGELOG.md).

The patch works, but with Hermes shipping weekly the source-level approach is fragile — that's the whole reason for the upstream request. Once this PR lands I can retire the patch in favour of the official schema.

Diff overview

Change Where
model / provider params on delegate_task() tools/delegate_tool.py (signature + handler lambda)
model / provider properties on DELEGATE_TASK_SCHEMA top-level + tasks[].properties
_resolve_per_task_creds() helper tools/delegate_tool.py
Loop uses per-task creds instead of batch creds tools/delegate_tool.py
6 unit tests covering precedence + fallback tests/tools/test_delegate.py

Total: +192 / -7 lines.

API

Top-level (whole batch uses the same model)

{
  "name": "delegate_task",
  "arguments": {
    "goal": "Implement the auth middleware",
    "model": "glm-5.1",
    "provider": "zai"
  }
}

Per-task (different model per task in one call)

{
  "name": "delegate_task",
  "arguments": {
    "tasks": [
      {"goal": "Research competitors", "model": "glm-5",   "provider": "zai"},
      {"goal": "Write integration tests", "model": "glm-5.1"}
    ]
  }
}

Precedence

per-task model/provider > top-level model/provider > delegation.{model,provider} from config

Provider re-resolution

When provider is overridden, _resolve_per_task_creds re-runs _resolve_delegation_credentials with the new provider so base_url / api_key / api_mode match the new provider (instead of dragging the original provider's credentials). The original base_url / api_key are stripped from the override config before resolution — they belong to the old provider.

When only model is overridden, the surrounding credentials are preserved (model and provider often share the same endpoint).

Failure handling

If credential resolution fails for the overridden provider (missing API key, unknown provider, etc.), the task falls back to the configured delegation credentials rather than aborting the batch. The failure is logged at WARN level.

Backwards compatibility

The new params are optional and default to None. Behavior is identical to today when no overrides are supplied — only difference is the schema now advertises the two new properties.

Tests

TestPerTaskCredentialOverride (6 tests):

  • test_no_override_returns_default_creds
  • test_model_only_top_level_override_swaps_model_keeps_provider
  • test_per_task_model_beats_top_level
  • test_provider_override_re_resolves_full_bundle (verifies stale base_url/api_key are stripped)
  • test_per_task_provider_beats_top_level
  • test_provider_resolution_failure_falls_back_to_default

Existing TestDelegationCredentialResolution and TestDelegationProviderIntegration should keep passing unchanged.

Follow-ups (separate PRs)

Closes NousResearch#3719 (closed-without-merge; this is a fresh take with a smaller
diff than NousResearch#3794 — no delegation pool, just the per-call override that
unblocks model-routing plugins).

Adds optional ``model`` and ``provider`` parameters to ``delegate_task``:
both at the top level and per-task inside the ``tasks`` array. The pre-
existing ``_build_child_agent`` plumbing already accepted these overrides;
this PR just wires the schema → handler → ``_build_child_agent`` path.

Precedence: per-task > top-level > delegation.{model,provider} from config.

When ``provider`` is overridden, the credential bundle (base_url, api_key,
api_mode) is re-resolved through ``_resolve_delegation_credentials`` so a
different provider's endpoint and key are used. When only ``model`` is
overridden, the surrounding credentials are preserved. If resolution
fails, the batch falls back to the configured delegation credentials
rather than aborting.

Motivation: SDD-style workflows route different phases (explore, design,
apply, verify) to different models for cost/quality balance. cobalt-agent
has been monkey-patching this for ~4 weeks across two production VPS
instances (12 measurement runs, 95% routing-accuracy on the last 6) by
reading ``_routed_model``/``_routed_provider`` fields injected into task
dicts. With this PR, the patch can be retired in favour of the official
schema.

Tests: 6 new unit tests for ``_resolve_per_task_creds`` covering
no-override, model-only, per-task vs top-level precedence, provider
re-resolution, and resolution-failure fallback.

@Bartok9 Bartok9 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a clean, minimal implementation of the per-call model/provider override. The scoping decision (override only, no delegation pool) is the right call — the previous PR #3794 stalled precisely because it tried to do both at once.

A few observations on the implementation:

Precedence model is clear and well-tested. The per-task > top-level > default fallback chain is explicit and covered by tests. The test_provider_resolution_failure_falls_back_to_default test is especially valuable — failure modes in credential resolution are the kind of thing that bites production users.

One edge case worth considering: When a top-level model override is provided but no provider override, _resolve_per_task_creds() swaps the model name but keeps provider from default_creds. If the top-level model name belongs to a different provider (e.g. gpt-4o with provider=anthropic in config), the task will be routed to the wrong provider. This is intentional by design (user must also specify provider to re-resolve), but might be worth a doc note in the schema description for model to avoid confusion.

Schema completeness: The tool description for model and provider in DELEGATE_TASK_SCHEMA is minimal. Would be helpful to note that provider triggers full credential re-resolution (including base_url / api_key), while model-only swaps just the model name within the existing provider bundle.

Verified on current main (0f0e20ef8): the plumbing paths targeted here (_build_child_agent accepting model/provider overrides) are present and intact. The PR applies cleanly.

Bot review on the PR flagged that model-only overrides keep the existing
provider's credential bundle. If the new model belongs to a different
provider, requests will hit the wrong endpoint. Behavior is intentional
(model and provider often share endpoints, and re-resolving on every
model swap would be wasteful), but the schema descriptions did not make
this contract visible.

Schema description for `model` now spells out: model-only swaps the
model name within the current provider; pair with `provider` to
re-resolve. Schema description for `provider` now spells out: provider
override triggers full credential re-resolution and discards the
original `base_url` / `api_key`.

Applied to both top-level and per-task variants.

No behavior change.
@thestark77

Copy link
Copy Markdown
Author

Thanks for the careful review — both points addressed in 9597c2f:

Edge case (model-only override with cross-provider model): added an explicit note to the model schema descriptions (top-level and per-task) calling out that model-only keeps the surrounding provider bundle, and that callers must also pass provider if the new model belongs to a different provider — otherwise the request hits the wrong endpoint.

Schema completeness: the model description now says "MODEL-ONLY: keeps the existing provider's credential bundle" and the provider description says "PROVIDER OVERRIDE: triggers full credential re-resolution… and discards the original base_url / api_key." Both phrasings match the actual _resolve_per_task_creds behavior so the model reading the schema knows exactly what each does.

No code or test changes — pure docs.

@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have tool/delegate Subagent delegation comp/tools Tool registry, model_tools, toolsets labels May 14, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Duplicate of #16163, #17756, #3172, #23649 — all implement per-call model/provider override for delegate_task. Tracking issue: #14974 (supersedes #3719). At least 6 open PRs compete for this feature; maintainer should pick one to merge and close the rest.

@discolotus

Copy link
Copy Markdown

Tracked follow-up technical debt from this PR:

@Bartok9

Bartok9 commented May 14, 2026

Copy link
Copy Markdown
Contributor

Verified 9597c2fe5. Both points addressed cleanly:

  • Edge case docs: The top-level and per-task model descriptions now explicitly call out that model-only keeps the surrounding provider bundle, and direct callers to also pass provider when the new model belongs to a different provider. Reads exactly as it should from the schema alone — no need to read _resolve_per_task_creds to know what each override does.
  • Schema completeness: "MODEL-ONLY: keeps the existing provider's credential bundle" and "PROVIDER OVERRIDE: triggers full credential re-resolution... discards the original base_url / api_key" are the right phrasings and match implementation behavior precisely.

Diff is purely descriptions; no code or test changes — confirmed.

LGTM on the docs follow-up. Thanks for the quick turnaround. 🎻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/tools Tool registry, model_tools, toolsets P3 Low — cosmetic, nice to have tool/delegate Subagent delegation type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants