feat(delegate): add model_hint parameter for per-task model routing by huu951008-gif · Pull Request #37966 · NousResearch/hermes-agent

huu951008-gif · 2026-06-03T06:36:16Z

Summary

Three OMO-inspired improvements bundled for atomic review:

delegate_task model_hint parameter (OMO 11.3, P0): lets the parent route a child to a different model, using short names like haiku / opus or full model IDs.
hermes_orchestrator boulder + ultragoal (OMO 11.2, P1): a persistent evidence chain for every delegated task.
Task-type keyword routing (OMO 11.1, P1): auto-injects system-prompt templates based on goal keywords such as /research, /critic, etc.

Why

Borrowed from Sisyphus Labs' OMO tool (lazyclaudecode / oh-my-openagent). The changes address three real gaps in delegate_task:

No model layering — all subagents inherit parent's model; can't cheaply route "explore this dir" to haiku
No persistence — long-task intermediate state is lost; no way to resume or audit
No orchestration templates — every task needs a hand-written system prompt; no reusable patterns

Changes

tools/delegate_tool.py — add model_hint parameter + task-type keyword detection (~+200 lines)
hermes_orchestrator.py (new) — OrchestratorContext + boulder + ultragoal + audit (590 lines)
hermes_task_templates.py (new) — keyword regex + 6 task-type templates (320 lines)
tests/test_hermes_orchestrator.py (new) — 22 unit tests
tests/test_hermes_task_templates.py (new) — 27 unit tests
tests/tools/test_delegate_model_hint.py (new) — 20 unit tests

Test Results

$ pytest tests/test_hermes_task_templates.py \
        tests/test_hermes_orchestrator.py \
        tests/tools/test_delegate_model_hint.py \
        tests/tools/test_delegate.py
204 passed, 1 warning in 18.68s

Plus full delegate regression: 177 passed (was 155). The 204 total above is the 155 delegate tests plus 27 new template tests and 22 new orchestrator tests.

Backward Compatibility

✅ All new features are opt-in:

model_hint=None → identical behavior to before
hermes_orchestrator.py is a separate module (not auto-imported in delegate_task)
hermes_task_templates.py is wrapped in try/except ImportError — if missing, delegate_task works as before
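
The optional-import fallback described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `load_optional` and `prepare_goal` are hypothetical names, and only `detect_task_type` and `inject_template` are taken from the PR's stated public API.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def prepare_goal(goal, templates=None):
    """Hypothetical wrapper: apply a template only if the module is available.

    `templates` is expected to expose detect_task_type() and inject_template()
    as described in this PR; with templates=None the goal passes through
    unchanged, which is the backward-compatible path.
    """
    if templates is None:
        return goal
    detected = templates.detect_task_type(goal)
    if detected is None:
        return goal
    template_key = detected[0]
    return templates.inject_template(goal, task_type=template_key)


# Typical wiring: the module may or may not be present at runtime.
templates_module = load_optional("hermes_task_templates")
```

With the module absent, `prepare_goal(goal, templates_module)` returns the goal untouched, so callers never need to know whether templates are installed.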

Usage Examples

1. model_hint (OMO 11.3)

# Parent on opus, child on haiku (1/20 cost)
delegate_task(
    goal="Scan ~/.hermes/skills/ for contract",
    model_hint="haiku",
    toolsets=["file", "terminal"],
)

# Batch with mixed models
delegate_task(tasks=[
    {"goal": "Read the README and summarize it", "model_hint": "haiku"},
    {"goal": "Compare the options and give a primary recommendation", "model_hint": "opus"},
])
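
One way the hint resolution could work is sketched below, assuming a simple alias table. The function names and the alias table are hypothetical; only "claude-haiku-4-5" appears in this PR, and the other two ids are placeholders. The precedence (per-task hint over top-level hint, `None` meaning "inherit the parent's model") follows the PR description.

```python
from typing import Optional

# Hypothetical alias table. Only "claude-haiku-4-5" is taken from this PR;
# the sonnet/opus ids are placeholders, not real model ids.
SHORT_NAMES = {
    "haiku": "claude-haiku-4-5",
    "sonnet": "example-sonnet-id",
    "opus": "example-opus-id",
}


def is_full_model_id(hint: str) -> bool:
    """Assumption: anything that is not a known short name is a full id."""
    return hint.lower() not in SHORT_NAMES


def resolve_model_hint(hint: Optional[str]) -> Optional[str]:
    """Map a short name to a full id; pass full ids through unchanged."""
    if hint is None or not hint.strip():
        return None  # no hint: the child keeps the parent's model
    hint = hint.strip()
    if is_full_model_id(hint):
        return hint
    return SHORT_NAMES[hint.lower()]


def model_for_task(task: dict, top_level_hint: Optional[str] = None) -> Optional[str]:
    """A per-task model_hint beats the top-level model_hint."""
    return resolve_model_hint(task.get("model_hint") or top_level_hint)
```

In a batch call, each task dict would be passed through `model_for_task` together with the top-level hint, so tasks without their own hint fall back to the shared one.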

2. boulder + ultragoal (OMO 11.2)

from hermes_orchestrator import OrchestratorContext

with OrchestratorContext(
    "Scan the directory for contract",
    model="claude-haiku-4-5",
    success_criteria=["Find every contract-related skill path"],
    anti_criteria=["Do not modify files"],
) as ctx:
    ctx.append_step(tool="terminal", args_summary="ls ~/.hermes/skills")
    ctx.complete_step(result_summary="Found 2 files")
    ctx.record_evidence(
        source="工具输出",  # source tag meaning "tool output"
        content="ls returned contract-review/, contract-review-intent/",
        citation="terminal: ls ~/.hermes/skills/",
    )
# audit.md is generated automatically at ~/.hermes/orchestrator/ultragoal/{task_id}/audit.md

The source field (一手/二手/推断/工具输出/用户输入, i.e. first-hand / second-hand / inferred / tool output / user input) addresses the Mavis paper's "second-hand source passed off as first-hand" failure mode: different source types get distinct emoji markers in audit.md so reviewers can spot goal drift at a glance.
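
To make the source-tag idea concrete, here is a small sketch of rendering one audit line per evidence record. The five source values come from the PR; the specific emoji, the line layout, and the `audit_line` helper are assumptions for illustration and may differ from what audit.md actually contains.

```python
# The five source tags are from the PR; the emoji choices are hypothetical.
SOURCE_MARKERS = {
    "一手": "🟢",      # first-hand
    "二手": "🟡",      # second-hand
    "推断": "🟠",      # inferred
    "工具输出": "🔧",  # tool output
    "用户输入": "👤",  # user input
}


def audit_line(source, content, citation=""):
    """Render one evidence record as a markdown bullet with a source marker."""
    if source not in SOURCE_MARKERS:
        # Rejecting unknown tags keeps every record classifiable by a reviewer.
        raise ValueError(f"unknown evidence source: {source!r}")
    line = f"- {SOURCE_MARKERS[source]} [{source}] {content}"
    if citation:
        line += f" ({citation})"
    return line
```

Because the marker leads each line, a reviewer scanning the report can tell second-hand or inferred evidence apart from tool output without reading the content.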

3. Task-type keyword routing (OMO 11.1)

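# Trigger words are detected automatically → template injected + model recommended
delegate_task(goal="/research the current AI agent ecosystem")  # → sonnet + research template
delegate_task(goal="/critic review this plan")                   # → opus + Momus 7-dimension critique template
delegate_task(goal="扫一下 skills/ for contract")                 # Chinese trigger "扫一下" (quick scan) → haiku + read-only reconnaissance template
delegate_task(goal="/implement build a CLI tool")                # → opus + 5-step implementation template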

Caller-supplied model_hint always wins:

delegate_task(goal="/critic the plan", model_hint="haiku")  # haiku takes precedence

Supported Keywords

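English	Chinese	Template	Default model
/research	调研	research (4-step multi-source research)	sonnet
/implement	实施	implement (5-step plan + regression)	opus
/review	审查	review (multiple perspectives + mandatory conclusion)	opus
/critic	挑刺	critic (Momus 7-dimension critique)	opus
/workflow	编排	workflow (multi-agent orchestration)	sonnet
/explore	扫一下	explore (read-only quick reconnaissance)	haiku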
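
The table above can be turned into a small detector. This is a sketch under assumptions, not the PR's implementation: the function names are hypothetical, the return value is simplified to `(template_key, model)`, and the boundary handling (ASCII-only word boundaries for English triggers, plain substring match for Chinese triggers, longest trigger first) is one plausible reading of the described strategy.

```python
import re
from typing import Optional, Tuple

# (English trigger, Chinese trigger, template key, default model) from the table.
RULES = [
    ("/research", "调研", "research", "sonnet"),
    ("/implement", "实施", "implement", "opus"),
    ("/review", "审查", "review", "opus"),
    ("/critic", "挑刺", "critic", "opus"),
    ("/workflow", "编排", "workflow", "sonnet"),
    ("/explore", "扫一下", "explore", "haiku"),
]


def _compile_patterns():
    patterns = []
    for english, chinese, key, model in RULES:
        # English: no ASCII word character directly before or after the trigger.
        english_re = re.compile(
            r"(?<!\w)" + re.escape(english) + r"\b", re.ASCII | re.IGNORECASE
        )
        # Chinese: plain substring match, no boundary assertions.
        chinese_re = re.compile(re.escape(chinese))
        patterns.append((english, english_re, key, model))
        patterns.append((chinese, chinese_re, key, model))
    # Longest trigger first, so a longer trigger wins over a shorter one.
    patterns.sort(key=lambda item: len(item[0]), reverse=True)
    return patterns


_PATTERNS = _compile_patterns()


def detect_task_type_sketch(goal: str) -> Optional[Tuple[str, str]]:
    """Return (template_key, default_model) for the first matching trigger."""
    for _trigger, pattern, key, model in _PATTERNS:
        if pattern.search(goal):
            return key, model
    return None


def effective_model_hint(goal: str, caller_hint: Optional[str] = None) -> Optional[str]:
    """A caller-supplied hint always wins over the keyword recommendation."""
    if caller_hint is not None:
        return caller_hint
    detected = detect_task_type_sketch(goal)
    return detected[1] if detected else None
```

In this sketch a goal such as "/researcher notes" matches nothing, because the trailing word boundary fails, while "/critic 审查这个方案" resolves to critic rather than review because the longer English trigger is tried first.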

Thread Safety

hermes_orchestrator: threading.Lock() around all file ops, atomic writes (.tmp + rename)
hermes_task_templates: pure functions, no I/O
delegate_task: model_hint additions are pure value-passing, no shared state
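
The lock-plus-atomic-write pattern mentioned for hermes_orchestrator can be sketched like this. The helper names are hypothetical; the technique shown is the standard one of writing to a `.tmp` sibling and then renaming it over the target with `os.replace`, all under a single `threading.Lock`.

```python
import json
import os
import threading

# One module-level lock serializes all file operations in this sketch.
_FILE_LOCK = threading.Lock()


def atomic_write_json(path, data):
    """Write JSON to path so readers never observe a half-written file."""
    tmp_path = path + ".tmp"
    with _FILE_LOCK:
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before the rename
        # os.replace swaps the file in one step on the same filesystem.
        os.replace(tmp_path, path)


def read_json(path):
    """Read a JSON file under the same lock used for writes."""
    with _FILE_LOCK:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
```

If the process dies mid-write, only the `.tmp` file is affected and the previous version of the target stays intact.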

Borrowed from Sisyphus Labs' OMO tool (lazyclaudecode). Model layering pattern: opus for decisions, haiku for reconnaissance. Lets parents route children to a different model than their own, with short names like "haiku"/"opus"/"sonnet" or full model ids.

- New `_resolve_model_hint()` helper + `_is_full_model_id()` detector
- `delegate_task()` signature: add `model_hint: Optional[str] = None`
- Top-level + per-task resolution (per-task model_hint beats top-level model_hint)
- Backward compatible: defaults to None, 135 existing delegate tests pass

Tests: 20 new (TestIsFullModelId / TestResolveModelHint / TestDelegateTaskSignature / TestDelegateTaskModelResolution) + 135 original delegate tests = 155 passed.

See multi-agent-harness/SKILL.md section 11.3 (OMO P0 roadmap).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…11.2)

Borrowed from Sisyphus Labs' OMO tool ultragoal + start-work-continuation mechanism. Solves the problem of intermediate results being lost after delegate_task completes; supports long-task resume and evidence audit.

Key design:
- ~/.hermes/orchestrator/boulder/{task_id}.json: task progress (steps, status, breakpoints)
- ~/.hermes/orchestrator/ultragoal/{task_id}/: goal + embedded success/anti criteria
- ultragoal/.../evidence/step_NNN.json: per-step evidence with source tag
- ultragoal/.../audit.md: human-readable audit report

The evidence source field (一手/二手/推断/工具输出/用户输入, i.e. first-hand / second-hand / inferred / tool output / user input) is the key innovation that addresses Mavis' 'goal drift' / 'second-hand source passed off as first-hand' failure mode: different source types get distinct emoji markers in audit.md so reviewers can spot drift at a glance.

Public API:
- OrchestratorContext (context manager, auto-finish on exit/exception)
- make_task_id / create_boulder / load_boulder / save_boulder
- append_step / complete_step / finish_boulder / list_boulders
- set_goal / load_goal / record_evidence / write_audit

Thread-safe (uses a lock around all file ops). Atomic writes (.tmp + rename) to prevent half-written state.

Tests: 22 unit tests covering CRUD, steps, ultragoal, audit, context manager (auto-finish on success and exception), list_boulders filters.

See multi-agent-harness/SKILL.md section 11.2 (OMO P1 roadmap).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
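
The "auto-finish on exit/exception" behavior of a context manager can be illustrated with a stand-in class. `TaskContext` below is hypothetical and keeps everything in memory; the status strings ("running", "completed", "failed") are assumptions, and the real OrchestratorContext persists to disk as described in the commit message.

```python
class TaskContext:
    """Hypothetical in-memory stand-in for OrchestratorContext."""

    def __init__(self, goal):
        self.goal = goal
        self.status = "running"
        self.steps = []
        self.error = None

    def append_step(self, tool, args_summary):
        self.steps.append(
            {"tool": tool, "args_summary": args_summary, "result_summary": None}
        )

    def complete_step(self, result_summary):
        # Completes the most recently appended step.
        self.steps[-1]["result_summary"] = result_summary

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Auto-finish: record the outcome whether or not the body raised.
        if exc_type is None:
            self.status = "completed"
        else:
            self.status = "failed"
            self.error = repr(exc)
        return False  # never swallow the caller's exception
```

Returning `False` from `__exit__` means a failing task is still marked as finished while the original exception continues to propagate to the caller.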

Borrowed from Sisyphus Labs' OMO ultrawork keyword mechanism: detect task-type keywords in the goal string and auto-inject the corresponding system-prompt template + recommended model_hint.

Supported keywords (English + Chinese):
- /research / 调研 → research template + sonnet
- /implement / 实施 → implement template + opus
- /review / 审查 → review template + opus
- /critic / 挑刺 → critic template (Momus) + opus
- /workflow / 编排 → workflow template + sonnet
- /explore / 扫一下 → explore template (read-only) + haiku

Caller-supplied model_hint always wins (overrides keyword recommendation). Templates are opt-in: if no keyword is detected, delegate_task behaves identically to before.

Regex strategy:
- English triggers: \b word boundary
- Chinese triggers: no lookbehind/lookahead (adjacent Chinese characters cause false positives like "调研一" being misread as "调研" + boundary)
- Length-descending matching (longer prefix wins)

Public API:
- detect_task_type(goal) -> Optional[(template_key, model_hint, desc)]
- inject_template(goal, task_type=..., context=...) -> str

Tests: 27 unit tests covering keyword detection (English + Chinese + edge cases), template injection, integration with delegate_task (auto model_hint set, caller override wins).

See multi-agent-harness/SKILL.md section 11.1 (OMO P1 roadmap).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Allow individual tasks in the array to specify their own and/or , overriding the global delegation config for that task only.

When a per-task override is present, is called with a task-scoped config so that base_url, api_key, and api_mode are derived correctly from the per-task provider, not the global delegation config. Tasks without overrides fall back to the pre-resolved dict as before (no regression for existing usage).

Changes:
- : add and fields to the per-task object inside array
- loop: resolve per-task credentials when override is present; otherwise reuse global (zero overhead)

Closes NousResearch#35437
Related: NousResearch#34489, NousResearch#31537, NousResearch#36790, NousResearch#30388, NousResearch#37966

Hermes OMO Contributor and others added 2 commits June 3, 2026 14:32

alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have tool/delegate Subagent delegation labels Jun 3, 2026

huu951008-gif wants to merge 3 commits into NousResearch:main from huu951008-gif:feat/delegate-model-hint
