Feature: Acceptance Criteria & Independent Judge for Sub-agent Delegation (inspired by OpenPlanter)

## Overview

[OpenPlanter](https://github.com/ShinMegamiBoson/OpenPlanter) implements a quality-gating pattern for recursive sub-agent delegation called **IMPLEMENT-THEN-VERIFY**. When a parent agent delegates work via `subtask` or `execute`, it specifies **acceptance criteria**: a clear statement of what "done" looks like. When the sub-agent finishes, a separate, **cheap "judge" model** (lowest tier, e.g. Haiku) evaluates the output against those criteria and returns PASS or FAIL with specific feedback. The key insight is that **implementation and verification must be uncorrelated**: the agent that does the work should not be its sole verifier.

Hermes Agent's `delegate_task` currently has no quality gating. Sub-agent results are returned to the parent as-is, with no verification of whether the output actually meets the original goal. This means the parent agent must manually evaluate every sub-agent result, consuming tokens and adding latency. Adding acceptance criteria with independent verification would significantly improve sub-agent output quality while reducing parent workload.

This is related to but distinct from #344 (Workflow Formulas), which mentions acceptance criteria in workflow step definitions but doesn't detail the independent verification mechanism. This feature is independently valuable — it applies to ALL `delegate_task` calls, not just structured workflows.

---

## Research Findings

### How OpenPlanter's Acceptance Criteria System Works

**In `engine.py` (~1011 lines), the core pattern:**

1. When dispatching a `subtask` or `execute` tool call, the parent specifies:
   - `objective`: what to accomplish
   - `acceptance_criteria`: explicit success conditions (e.g., "Output file exists and contains at least 3 entity matches with confidence > 0.7")

2. The sub-agent runs its full solve loop (potentially recursively), producing a result.

3. After completion, if acceptance criteria were provided, a **judge model** evaluates:
   - The judge is always the LOWEST TIER model (cheapest/fastest — e.g., Haiku-class)
   - The judge receives: the original objective, the acceptance criteria, and the sub-agent's final output
   - The judge returns: PASS or FAIL with specific reasoning

4. On FAIL:
   - The failure feedback is returned to the parent agent
   - The parent can decide to retry, modify the approach, or accept partial results
   - OpenPlanter does NOT auto-retry — it lets the parent decide
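
The four steps above can be sketched roughly as follows. This is a minimal illustration, not OpenPlanter's actual code: `run_subtask`, `JudgedResult`, and the `solve` and `judge` callables are hypothetical names standing in for the sub-agent loop and the lowest-tier model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class JudgedResult:
    output: str
    verdict: Optional[str] = None  # "PASS", "FAIL", or None when no criteria were given
    feedback: str = ""


def run_subtask(
    objective: str,
    acceptance_criteria: Optional[str],
    solve: Callable[[str], str],                         # hypothetical: sub-agent solve loop
    judge: Callable[[str, str, str], Tuple[str, str]],   # hypothetical: lowest-tier model call
) -> JudgedResult:
    output = solve(objective)
    if not acceptance_criteria:
        # No criteria: nothing to verify, return the raw output.
        return JudgedResult(output=output)
    verdict, feedback = judge(objective, acceptance_criteria, output)
    # No auto-retry here: the verdict goes back to the parent, which decides what to do.
    return JudgedResult(output=output, verdict=verdict, feedback=feedback)


# Stubs standing in for real model calls, for illustration only.
def fake_solve(objective: str) -> str:
    return "vendor_name,sdn_match\nAcme,False"


def fake_judge(objective: str, criteria: str, output: str) -> Tuple[str, str]:
    ok = "match_score" in output
    return ("PASS", "all columns present") if ok else ("FAIL", "missing column: match_score")


result = run_subtask("Screen vendors", "CSV has a match_score column", fake_solve, fake_judge)
unchecked = run_subtask("Screen vendors", None, fake_solve, fake_judge)
```

Here `result.verdict` is `"FAIL"` with feedback naming the missing column, while `unchecked.verdict` stays `None` because no criteria were supplied.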

**Tiered model routing reinforces this:**
- Sub-agents can only delegate to models of EQUAL OR LOWER capability (never escalate)
- Opus (tier 1) → Sonnet (tier 2) → Haiku (tier 3), where tier 1 is the most capable and tier 3 the cheapest
- The judge always uses tier 3 (cheapest), ensuring verification cost is negligible
- This prevents a cheap sub-agent from spawning an expensive verifier
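
The routing rule can be expressed as a small check. The sketch below assumes the three tiers listed above; the `TIERS` mapping and the function names `can_delegate` and `judge_model` are hypothetical and not taken from OpenPlanter.

```python
# Tier numbers as described above: 1 = most capable, 3 = cheapest.
TIERS = {"opus": 1, "sonnet": 2, "haiku": 3}


def can_delegate(parent_model: str, child_model: str) -> bool:
    """A child may run on the parent's tier or a cheaper one, never a more capable one."""
    return TIERS[child_model] >= TIERS[parent_model]


def judge_model() -> str:
    """The judge always runs on the cheapest tier (highest tier number)."""
    return max(TIERS, key=TIERS.get)
```

With this rule, a Sonnet-tier agent may delegate to Sonnet or Haiku but not to Opus, and the judge resolves to Haiku regardless of which tier did the work.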

**Key code patterns from `engine.py`:**
```python
# Acceptance criteria in subtask definition
subtask_tool = {
    "objective": "Cross-reference vendor names against OFAC SDN list",
    "acceptance_criteria": "Output CSV contains columns: vendor_name, sdn_match, match_score, match_type. At least one row per input vendor.",
    "model": "haiku",  # optional: request specific model tier
    "reasoning_effort": "low",  # optional: control reasoning depth
}

# Values the judge prompt is built from
objective = subtask_tool["objective"]
acceptance_criteria = subtask_tool["acceptance_criteria"]
sub_agent_result = "<final output returned by the sub-agent>"  # placeholder

# After sub-agent completes, judge evaluates
judge_prompt = f"""Evaluate whether this output meets the acceptance criteria.
Objective: {objective}
Acceptance Criteria: {acceptance_criteria}
Output: {sub_agent_result}
Respond with PASS or FAIL and explain why."""
```
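
Because the judge answers in free text, the caller needs to turn the response into a verdict. The helper below is one possible sketch; `parse_verdict` is a hypothetical name, and treating an unparseable response as FAIL is an assumption made here for safety, not documented OpenPlanter behavior.

```python
import re
from typing import Tuple


def parse_verdict(response: str) -> Tuple[str, str]:
    """Split a judge response into (verdict, reasoning).

    Expects the response to start with PASS or FAIL, as the prompt requests.
    """
    match = re.match(r"\s*(PASS|FAIL)\b[\s:.,\-]*", response, flags=re.IGNORECASE)
    if match is None:
        # Assumption: an answer we cannot parse is treated as a failure.
        return "FAIL", f"Unparseable judge response: {response.strip()!r}"
    return match.group(1).upper(), response[match.end():].strip()
```

Defaulting to FAIL means a malformed judge reply is surfaced to the parent instead of being silently accepted.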

### The "Think" Tool with Acceptance Criteria

OpenPlanter also has a `think` tool that uses acceptance criteria for self-verification within a single agent:

```json
{
    "name": "think",
    "description": "Think through a problem step by step. Optionally specify acceptance_criteria to self-check your reasoning.",
    "parameters": {
        "thought": "string - your reasoning",
        "acceptance_criteria": "string - optional criteria to verify your reasoning against"
    }
}
```

This is lighter-weight — the agent checks its own reasoning rather than delegating to a judge. Less robust than independent verification, but useful for single-agent reasoning quality.

### Key Design Decisions

1. **Cheap judge model** — Verification should be much cheaper than implementation. Using the lowest-tier model keeps overhead minimal.
2. **No auto-retry** — The parent decides how to handle failures. This prevents infinite retry loops and preserves parent agency.
3. **Independent verification** — The implementing agent doesn't evaluate its own work. This addresses self-evaluation bias (agents tend to rate their own output favorably).
4. **Acceptance criteria are explicit** — Not just "do this well" but "output must contain X, Y, Z with these properties." This forces the parent to think about what success looks like before delegating.

---

## Current State in Hermes Agent

**`delegate_task` (tools/delegate_tool.py):**
- Spawns child AIAgent instances with isolated context
- Two modes: single task (`goal` parameter) or batch (`tasks` array, up to 3 parallel)
- Child returns a summary string to the parent
- No acceptance criteria, no verification, no quality gating
- Parent must manually evaluate output quality

**Context compression (agent/context_compressor.py):**
- Has cheap-model summarization (Gemini Flash) — shows the pattern of using a cheap model for meta-tasks already exists in the codebase

**What's missing:**
- No way to specify "what does success look like" when delegating
- No independent verification of sub-agent output
- No PASS/FAIL signal — parent gets raw output with no quality indicator
- No structured feedback on failures (what specifically was missing/wrong)

---

## Implementation Plan

### Skill vs. Tool Classification

This is a **codebase change** to `tools/delegate_tool.py` (and potentially `agent/ai_agent.py`). Not a skill — it modifies the core delegation mechanism. Not a new tool — it extends an existing one.

### What We'd Need

1. **New `acceptance_criteria` parameter on delegate_task** — Optional string describing success conditions
2. **Judge evaluation function** — Takes criteria + output, returns PASS/FAIL + reasoning
3. **Judge model selection** — Use cheapest available model (similar to how context_compressor uses Gemini Flash)
4. **Result enrichment** — Return both the sub-agent output AND the judge's verdict to the parent

### Phased Rollout

**Phase 1: Basic Acceptance Criteria + Judge**
- Add optional `acceptance_criteria` parameter to `delegate_task` (both single and batch modes)
- After sub-agent completes, if criteria provided, run a cheap-model judge
- Judge returns a verdict (PASS or FAIL) with reasoning
- Parent receives an enriched result that pairs the verdict with the sub-agent output: `{"verdict": "PASS"|"FAIL", "reasoning": "...", "output": "..."}`
- If no criteria provided, behavior is unchanged (backward compatible)
- Use the same cheap-model approach as context_compressor (Gemini Flash or configurable)
- Deliverable: Quality-gated delegation with independent verification
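
A rough sketch of how Phase 1 could treat batch mode, under stated assumptions: each task carries its own optional `acceptance_criteria`, tasks without criteria return the plain summary string unchanged, and judged tasks return the enriched dict. The names `judge_batch` and `stub_judge` are hypothetical, and the mixed return shape is only one answer to the return-format question raised under Open Questions.

```python
from typing import Callable, Dict, List, Tuple, Union

# Hypothetical judge signature: (goal, criteria, summary) -> (verdict, reasoning)
Judge = Callable[[str, str, str], Tuple[str, str]]
Result = Union[str, Dict[str, str]]


def judge_batch(tasks: List[Dict[str, str]], summaries: List[str], judge: Judge) -> List[Result]:
    """Post-process child summaries; only tasks that declare criteria are judged."""
    results: List[Result] = []
    for task, summary in zip(tasks, summaries):
        criteria = task.get("acceptance_criteria")
        if not criteria:
            results.append(summary)  # unchanged behavior: raw summary string
            continue
        verdict, reasoning = judge(task["goal"], criteria, summary)
        results.append({"verdict": verdict, "reasoning": reasoning, "output": summary})
    return results


# Stub judge for illustration; a real one would call a cheap model.
def stub_judge(goal: str, criteria: str, summary: str) -> Tuple[str, str]:
    ok = "report.md" in summary
    return ("PASS", "report file mentioned") if ok else ("FAIL", "report file not mentioned")


tasks = [
    {"goal": "Write the report", "acceptance_criteria": "Summary names report.md"},
    {"goal": "List source files"},  # no criteria: not judged
]
summaries = ["Wrote report.md with 3 sections", "a.py, b.py"]
results = judge_batch(tasks, summaries, stub_judge)
```

Only the first task triggers a judge call, so existing callers that never pass criteria pay no extra cost.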

**Phase 2: Think Tool for Self-Verification**
- Add a lightweight `think` tool (or extend the existing reasoning mechanism)
- Allows the agent to explicitly reason through a problem with optional self-check
- Less overhead than full delegation — useful for complex reasoning steps within a single agent
- Think tool output can optionally be hidden from the final response (scratchpad mode)
- Deliverable: Structured reasoning with optional acceptance criteria

**Phase 3: Integration with Workflow Formulas (#344)**
- Wire acceptance criteria into workflow step definitions from #344
- Each workflow step can specify acceptance criteria that must pass before downstream steps begin
- Failed steps can trigger retry logic (from #344 Phase 2) or workflow abort
- Aggregate workflow-level PASS/FAIL based on all step verdicts
- Deliverable: Quality-gated workflow orchestration
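
The gating described for Phase 3 might look like the sketch below. It assumes a simple linear workflow that aborts on the first FAIL; `run_workflow`, `run_step`, and the step dictionary keys are hypothetical, since #344 defines the real workflow format.

```python
from typing import Callable, Dict, List, Tuple


def run_workflow(
    steps: List[Dict[str, str]],
    run_step: Callable[[str], str],                      # hypothetical: delegates one step
    judge: Callable[[str, str, str], Tuple[str, str]],   # hypothetical: cheap-model judge
) -> Dict[str, object]:
    verdicts = []
    for step in steps:
        output = run_step(step["goal"])
        verdict, reasoning = judge(step["goal"], step["acceptance_criteria"], output)
        verdicts.append({"step": step["name"], "verdict": verdict, "reasoning": reasoning})
        if verdict != "PASS":
            break  # abort: downstream steps never begin
    # Workflow-level PASS requires every step to have run and passed.
    all_passed = len(verdicts) == len(steps) and all(v["verdict"] == "PASS" for v in verdicts)
    return {"verdict": "PASS" if all_passed else "FAIL", "steps": verdicts}


# Stubs for illustration: the "match" step fails verification.
steps = [
    {"name": "fetch", "goal": "fetch vendors", "acceptance_criteria": "file exists"},
    {"name": "match", "goal": "match vendors", "acceptance_criteria": "one row per vendor"},
    {"name": "report", "goal": "write report", "acceptance_criteria": "report exists"},
]
outcome = run_workflow(
    steps,
    run_step=lambda goal: f"done: {goal}",
    judge=lambda goal, criteria, output: ("FAIL", "rows missing") if "match" in goal else ("PASS", "ok"),
)
```

In this run the workflow stops after `match`, so `report` is never executed and the aggregate verdict is FAIL.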

---

## Pros & Cons

### Pros
- **Directly improves sub-agent output quality** — Independent verification catches errors the implementing agent misses
- **Cheap overhead**: the judge uses the lowest-tier model, so verification cost should be a small fraction of implementation cost (estimated at <5%)
- **Reduces parent token consumption** — Parent gets a PASS/FAIL signal instead of having to evaluate raw output
- **Forces clearer delegation** — Writing acceptance criteria makes the parent think about what success looks like
- **Backward compatible** — acceptance_criteria is optional; existing delegate_task calls unchanged
- **Pattern already exists in codebase** — context_compressor already uses cheap models for meta-tasks
- **Proven in production** — OpenPlanter uses this successfully with recursive 4-depth delegation

### Cons / Risks
- **Additional API call per delegation** — One extra LLM call for the judge, even though it's cheap
- **False negatives** — Judge model may incorrectly FAIL valid output, causing unnecessary retries
- **False positives** — Judge model may incorrectly PASS invalid output, giving false confidence
- **Criteria quality matters** — Vague criteria ("do a good job") produce meaningless verdicts. The feature is only as good as the criteria written
- **Judge model availability** — Needs a cheap model configured; may not work if only one expensive model is available
- **Scope creep risk** — Could grow into a complex verification framework; should stay simple

---

## Open Questions

1. **Which model for the judge?** Use context_compressor's Gemini Flash approach, or let the user configure a judge model?
2. **Should the judge have access to the sub-agent's tool calls?** OpenPlanter passes only the final output. Including intermediate steps would improve judgment but increase cost.
3. **Should we auto-retry on FAIL?** OpenPlanter doesn't. But a simple `retry_on_fail=1` parameter could be useful for straightforward tasks.
4. **How to handle batch mode?** Should each task in a batch have its own acceptance criteria, or one set for the whole batch?
5. **Should the verdict be part of the summary string or a structured field?** Structured is cleaner but changes the return format.
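
To make question 3 concrete, a bounded retry could be sketched as below. The `retry_on_fail` parameter comes from the question itself; the function name `delegate_with_retry`, the callables, and the way judge feedback is appended to the goal are all assumptions for illustration.

```python
from typing import Callable, Dict, Tuple


def delegate_with_retry(
    goal: str,
    criteria: str,
    solve: Callable[[str], str],                         # hypothetical: runs the sub-agent
    judge: Callable[[str, str, str], Tuple[str, str]],   # hypothetical: cheap-model judge
    retry_on_fail: int = 1,
) -> Dict[str, object]:
    if retry_on_fail < 0:
        raise ValueError("retry_on_fail must be >= 0")
    attempt_goal = goal
    for attempt in range(retry_on_fail + 1):  # hard cap, so no infinite retry loop
        output = solve(attempt_goal)
        verdict, reasoning = judge(goal, criteria, output)
        if verdict == "PASS":
            break
        # Feed the judge's feedback into the next attempt.
        attempt_goal = f"{goal}\n\nPrevious attempt failed verification: {reasoning}"
    return {"output": output, "verdict": verdict, "reasoning": reasoning, "attempts": attempt + 1}


# Stubs for illustration: the sub-agent only succeeds once it has seen feedback.
def flaky_solve(goal: str) -> str:
    return "fixed" if "Previous attempt" in goal else "draft"


def strict_judge(goal: str, criteria: str, output: str) -> Tuple[str, str]:
    return ("PASS", "ok") if output == "fixed" else ("FAIL", "output is still a draft")


retried = delegate_with_retry("Write summary", "Must be final", flaky_solve, strict_judge)
no_retry = delegate_with_retry("Write summary", "Must be final", flaky_solve, strict_judge, retry_on_fail=0)
```

With the default of one retry the second attempt passes; with `retry_on_fail=0` the FAIL is returned to the parent after a single attempt, matching OpenPlanter's no-auto-retry behavior.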

---

## References

- [OpenPlanter engine.py](https://github.com/ShinMegamiBoson/OpenPlanter/blob/main/agent/engine.py) — Recursive engine with acceptance criteria (~1011 lines)
- [OpenPlanter prompts.py](https://github.com/ShinMegamiBoson/OpenPlanter/blob/main/agent/prompts.py) — IMPLEMENT-THEN-VERIFY pattern description
- [OpenPlanter tool_defs.py](https://github.com/ShinMegamiBoson/OpenPlanter/blob/main/agent/tool_defs.py) — Acceptance criteria in subtask/execute schemas
- Hermes `delegate_tool.py` — Current delegation implementation
- Hermes `context_compressor.py` — Existing cheap-model meta-task pattern
- #344 — Workflow Formulas (complementary; Phase 3 integration)
