Summary
Extends the per-task model parameter from #7586 with named tiers and a model discovery tool, so the agent can make informed delegation decisions without knowing deployment-specific model identifiers.
Motivation
With #7586, the agent can pass model="google/gemini-flash-2.0" to delegate_task. But:
- The agent needs to know exact model names for the deployment
- Different users have different models available
- The agent has no way to discover what's available at runtime
This is the complementary piece: tiers abstract model names, and list_models lets the agent discover what's available.
Proposed Changes
1. Model Tiers (delegation.model_tiers)
delegation:
  model_tiers:
    small: google/gemini-flash-2.0    # fast/cheap
    # medium: omit to inherit parent model
    large: anthropic/claude-opus-4-6  # complex reasoning
The agent can then say:
delegate_task(goal="List all .py files", model="small") # → gemini flash
delegate_task(goal="Review this PR for security", model="large") # → opus
delegate_task(goal="Debug the auth flow") # → parent model (medium)
Tier names resolve to configured model names. Unknown tiers fall back to the parent model.
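The resolution rule above can be sketched roughly as follows. This is an illustration only, not the code from our fork: resolve_model, parent_model, and known_models are hypothetical names, and passing literal model names through when they appear in known_models is an assumption about how tiers would coexist with the explicit names from #7586.

```python
from typing import Collection, Mapping, Optional


def resolve_model(
    requested: Optional[str],
    tiers: Mapping[str, str],
    parent_model: str,
    known_models: Collection[str] = (),
) -> str:
    """Map a delegate_task `model` argument to a concrete model name."""
    if requested is None:
        # No model given: the subagent inherits the parent model.
        return parent_model
    if requested in tiers:
        # Configured tier name: resolve to the model it points at.
        return tiers[requested]
    if requested in known_models:
        # Assumption: explicit model names (as in #7586) still pass through.
        return requested
    # Unknown tier (or unknown name): fall back to the parent model.
    return parent_model


# Mirrors the delegation.model_tiers example; "medium" is deliberately absent.
tiers = {
    "small": "google/gemini-flash-2.0",
    "large": "anthropic/claude-opus-4-6",
}
parent = "parent-model"  # placeholder for whatever the parent agent runs on

print(resolve_model("small", tiers, parent))   # google/gemini-flash-2.0
print(resolve_model("medium", tiers, parent))  # parent-model
print(resolve_model(None, tiers, parent))      # parent-model
```

Falling back silently keeps delegation working when a tier is not configured, at the cost of the agent not being told that its tier choice was ignored.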
2. list_models Tool
Returns available models with tier assignments, context lengths, and providers:
{
  "models": [
    {"name": "qwen35-397b", "provider": "litellm", "context_length": 524288, "is_default": true},
    {"name": "gemma4-nothink", "provider": "litellm", "context_length": 262144, "tier": "small"},
    {"name": "claude-sonnet-4-6", "provider": "litellm", "context_length": 1000000, "tier": "large"}
  ],
  "tiers": {"small": "gemma4-nothink", "large": "claude-sonnet-4-6"},
  "default_model": "qwen35-397b"
}
Registered in the delegation toolset alongside delegate_task.
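For illustration, a response of this shape could be assembled from a model catalog plus the tier config along these lines. build_list_models_response and the catalog structure are hypothetical. How the real tool obtains provider and context-length data is not specified here, so the sketch takes them as input.

```python
from typing import Any, Dict, List, Mapping


def build_list_models_response(
    catalog: List[Mapping[str, Any]],
    tiers: Mapping[str, str],
    default_model: str,
) -> Dict[str, Any]:
    """Annotate catalog entries with tier and default markers."""
    # Invert tier -> model so each model can be labeled with its tier.
    # If two tiers point at the same model, the later one wins in this sketch.
    tier_by_model = {model: tier for tier, model in tiers.items()}

    models = []
    for entry in catalog:
        item = dict(entry)  # copy so the caller's catalog is not mutated
        if item["name"] == default_model:
            item["is_default"] = True
        if item["name"] in tier_by_model:
            item["tier"] = tier_by_model[item["name"]]
        models.append(item)

    return {"models": models, "tiers": dict(tiers), "default_model": default_model}


# Catalog values copied from the example response above.
catalog = [
    {"name": "qwen35-397b", "provider": "litellm", "context_length": 524288},
    {"name": "gemma4-nothink", "provider": "litellm", "context_length": 262144},
    {"name": "claude-sonnet-4-6", "provider": "litellm", "context_length": 1000000},
]
response = build_list_models_response(
    catalog,
    tiers={"small": "gemma4-nothink", "large": "claude-sonnet-4-6"},
    default_model="qwen35-397b",
)
```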
Use Cases
- Explorer subagent: delegate_task(goal="Find all files related to auth", model="small") for fast file discovery on a cheap model
- Peer review: delegate_task(goal="Review my implementation", model="large") to escalate to a stronger model
- Mixed batch: a different model per task, based on complexity
- Self-awareness: the agent calls list_models to discover what's available before deciding
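The self-awareness and mixed-batch cases can be combined: check the list_models response first, then request a tier only if the deployment configures it. The sketch below assumes the response shape shown earlier. delegation_kwargs is a hypothetical helper, not part of the proposed API, and it only builds the arguments that would be passed to delegate_task.

```python
from typing import Any, Dict, Mapping


def delegation_kwargs(
    goal: str,
    preferred_tier: str,
    models_response: Mapping[str, Any],
) -> Dict[str, str]:
    """Build delegate_task arguments, requesting a tier only if it exists."""
    kwargs = {"goal": goal}
    if preferred_tier in models_response.get("tiers", {}):
        kwargs["model"] = preferred_tier
    # Otherwise omit `model` so the subagent inherits the parent model.
    return kwargs


# Trimmed copy of the example list_models response.
models_response = {
    "tiers": {"small": "gemma4-nothink", "large": "claude-sonnet-4-6"},
    "default_model": "qwen35-397b",
}

# A mixed batch: each task is paired with the tier the agent judged suitable.
batch = [
    ("Find all files related to auth", "small"),
    ("Review my implementation", "large"),
    ("Debug the auth flow", "medium"),  # not configured, so no model argument
]
calls = [delegation_kwargs(goal, tier, models_response) for goal, tier in batch]
```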
Why This Is Better Than smart_model_routing
We deployed and tested smart_model_routing (message-length heuristic) in production and disabled it (see #7905). Short messages like "yes" or "go ahead" can trigger the most complex operations — message length is a terrible proxy for task complexity.
Model-directed delegation is fundamentally better: the model has full context, knows the task complexity, and chooses the right model for each subagent. No heuristics, no false positives.
Relationship to Other PRs
- is_local_endpoint + smart routing concerns.
Implementation
We have a working implementation on our fork. Happy to submit a PR once #7586 lands, or rebase on top of it.
Credit: Per-task model parameter design from @Labhund (#7586). Tier + list_models additions from our production deployment.