Feature: Inception Prompting — Hardened Sub-Agent Prompts Against Delegation Failures (inspired by CAMEL-AI) #375

@teknium1

Overview

CAMEL-AI's original research contribution (NeurIPS 2023, arXiv:2303.17760) includes Inception Prompting — a systematic set of prompt techniques designed to prevent common failure modes when one LLM agent delegates work to another. These failure modes are real problems in Hermes Agent's delegate_task today, and will become even more critical when multi-agent orchestration (#344) is built.

This is partially applicable right now (improving delegate_tool.py's prompt construction), and becomes a prerequisite for reliable multi-agent workflows.

Prerequisite for full value: #344 (Workflow Formulas + Multi-Agent Orchestration)
Related: #356 (Acceptance Criteria), #374 (Local Browser Backend — same Eigent/CAMEL research thread)


Research Findings

The Four Failure Modes CAMEL Identified

CAMEL's research systematically catalogued how LLM agents fail when communicating with each other. These map directly to problems observable in Hermes delegate_task:

1. Role-Flipping
The sub-agent stops doing the assigned work and starts acting like the parent — asking questions, requesting clarification, or delegating back.

In Hermes today: A delegated sub-agent sometimes responds with "I'd recommend you do X" or "What would you like me to focus on?" instead of actually doing the work. The sub-agent has no one to ask — it's isolated — so these responses are pure waste.

2. Instruction Echoing
The sub-agent restates the task in different words without actually performing it. The summary reads as if work was done, but it only paraphrases the goal.

In Hermes today: Sub-agent returns "I analyzed the codebase for security issues and identified several areas of concern" without listing any actual issues or making any tool calls.

3. Flake Replies
The sub-agent gives vague, non-committal responses. "It seems fine," "There might be issues," "Further investigation needed."

In Hermes today: This is the most common delegation failure. The parent asks for a concrete deliverable and gets a wishy-washy summary.
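Echoed and flaky replies can be flagged with a cheap heuristic before the parent consumes them. The sketch below is an assumption, not existing Hermes code: `looks_like_flake_reply`, the phrase list, and the evidence pattern are all hypothetical, and the check is deliberately crude (it only looks for vague phrases, zero tool calls, and the absence of file paths, inline code, or line references).

```python
import re

# Vague phrases taken from the failure-mode description above.
VAGUE_PHRASES = (
    "it seems fine",
    "there might be issues",
    "further investigation needed",
)

# Hypothetical notion of "concrete evidence": a file path with a common
# extension, an inline code span, or a line-number reference.
EVIDENCE_PATTERN = re.compile(
    r"\b[\w./-]+\.(?:py|js|ts|md|json|yaml|yml|toml)\b|`[^`]+`|\bline \d+\b",
    re.IGNORECASE,
)


def looks_like_flake_reply(reply: str, tool_calls_made: int) -> bool:
    """Return True when a reply is vague or tool-free and cites no evidence."""
    has_vague_phrase = any(p in reply.lower() for p in VAGUE_PHRASES)
    has_evidence = bool(EVIDENCE_PATTERN.search(reply))
    return (has_vague_phrase or tool_calls_made == 0) and not has_evidence
```

A check like this would probably serve best as a signal for logging or a single retry rather than a hard gate, since short but legitimate answers can also lack file paths.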

4. Infinite Loops
The sub-agent gets stuck retrying the same broken approach, or cycles between two states without converging.

In Hermes today: A sub-agent hits the same error repeatedly, trying the same fix each time until it runs out of iterations. The max_iterations parameter is the only guard, and it's a blunt instrument.
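A finer-grained guard than max_iterations could count identical failures. The following is a minimal sketch under assumptions: `RepeatedFailureDetector` is a hypothetical helper, and how tool calls and errors are actually represented inside the delegate loop is not shown in this issue, so plain strings are used here.

```python
from collections import Counter


class RepeatedFailureDetector:
    """Flag a sub-agent that keeps making the same failing tool call.

    Hypothetical helper; the threshold of 2 mirrors the "fails twice" rule
    proposed in this issue.
    """

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self._counts = Counter()

    def record(self, tool_name: str, args: str, error: str) -> bool:
        """Record one failed call; return True once it has repeated enough."""
        key = (tool_name, args, error)
        self._counts[key] += 1
        return self._counts[key] >= self.threshold
```

When `record()` returns True, the caller could inject a nudge to change approach instead of terminating the sub-agent, which keeps legitimate retries with different arguments unaffected.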

CAMEL's Inception Prompting Solution

CAMEL prevents these failures by injecting specific prompt components:

  1. Role anchoring — Explicit instructions that the agent MUST stay in its assigned role and never flip to the requester role
  2. Output format enforcement — Requiring concrete, actionable outputs rather than commentary
  3. Completion signaling — Clear protocol for when the agent is done vs needs to continue
  4. Anti-echo directives — "Do not restate the task. Perform it."
  5. Convergence pressure — Instructions that push toward completing the task rather than expanding scope

Current State in Hermes Agent

delegate_tool.py, _build_child_system_prompt():
The current sub-agent system prompt is built from:

  • The base system prompt (same as parent, trimmed)
  • The goal/context provided by the parent
  • Basic instructions about available tools

What's missing:

  • No explicit role-anchoring ("you are a worker, not a coordinator")
  • No anti-echo directives
  • No output format requirements (concrete deliverables, not commentary)
  • No convergence pressure (finish the task, don't expand scope)
  • No stuck-detection prompting (if approach fails twice, try something different)

Implementation Plan

Skill vs. Tool Classification

This is a codebase change to tools/delegate_tool.py, specifically the system prompt construction. Not a skill or new tool.

Phased Rollout

Phase 1: Prompt Hardening for Current delegate_task (No dependencies)

  • Modify _build_child_system_prompt() to include inception prompting techniques
  • Add role-anchoring: "You are executing a delegated task. Do the work directly — do not ask questions, request clarification, or suggest the requester do it instead."
  • Add anti-echo: "Do not restate or paraphrase the task. Perform it using your tools and report concrete results."
  • Add output format guidance: "Your response must include specific findings, file paths, code snippets, or other concrete artifacts. Vague summaries like 'it seems fine' or 'further investigation needed' are not acceptable."
  • Add convergence pressure (stuck detection): "If an approach fails twice, try a fundamentally different approach rather than retrying the same thing."
  • Deliverable: Measurably better sub-agent output quality, zero architectural changes
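The Phase 1 bullets could be sketched roughly as follows. This is an illustration only: the real signature of _build_child_system_prompt() is not shown in this issue, so `harden_child_prompt`, `INCEPTION_DIRECTIVES`, and the `enabled` parameter are hypothetical names, and the directive wording is lightly adapted from the bullets above.

```python
# Directive wording adapted from the Phase 1 bullets; names are hypothetical.
INCEPTION_DIRECTIVES = {
    "role_anchoring": (
        "You are executing a delegated task. Do the work directly; do not ask "
        "questions, request clarification, or suggest the requester do it instead."
    ),
    "anti_echo": (
        "Do not restate or paraphrase the task. Perform it using your tools and "
        "report concrete results."
    ),
    "output_format": (
        "Your response must include specific findings, file paths, code snippets, "
        "or other concrete artifacts. Vague summaries like 'it seems fine' or "
        "'further investigation needed' are not acceptable."
    ),
    "convergence": (
        "If an approach fails twice, try a fundamentally different approach "
        "rather than retrying the same thing."
    ),
}


def harden_child_prompt(base_prompt, goal, enabled=None):
    """Append the selected directives to a child system prompt.

    Not the actual _build_child_system_prompt(); a stand-in for illustration.
    `enabled` is an optional collection of directive names; None means all.
    """
    names = [n for n in INCEPTION_DIRECTIVES if enabled is None or n in enabled]
    parts = [base_prompt.rstrip(), "", "Delegated task:", goal.strip()]
    if names:
        parts += ["", "Execution rules:"]
        parts += [f"- {INCEPTION_DIRECTIVES[n]}" for n in names]
    return "\n".join(parts)
```

Keeping the directives in a named mapping means individual rules can be switched off per call, which is one possible answer to whether hardening should be configurable.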

Phase 2: Enhanced Guard Rails for Multi-Agent Workflows (Depends on #344)

Phase 3: Adaptive Prompting (Optional)

  • Track delegation success/failure rates per prompt configuration
  • A/B test different inception prompting strategies
  • Auto-tune prompt components based on observed failure modes
  • Deliverable: Self-improving delegation quality
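Tracking success and failure rates per prompt configuration might start as a simple in-memory tally. The sketch below assumes each delegation can be labeled as a success or failure by some external judge; `DelegationStats`, `Tally`, and the configuration names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Tally:
    successes: int = 0
    failures: int = 0

    @property
    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0


class DelegationStats:
    """Per-configuration outcome counts (hypothetical, in-memory only)."""

    def __init__(self):
        self._by_config = defaultdict(Tally)

    def record(self, config_name: str, success: bool) -> None:
        tally = self._by_config[config_name]
        if success:
            tally.successes += 1
        else:
            tally.failures += 1

    def success_rate(self, config_name: str) -> float:
        return self._by_config[config_name].success_rate

    def best_config(self):
        """Return the name with the highest success rate, or None if empty."""
        if not self._by_config:
            return None
        return max(self._by_config, key=lambda n: self._by_config[n].success_rate)
```

With small sample sizes the raw rate is noisy, so any auto-tuning built on top would likely need a minimum number of observations per configuration before switching.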

Pros & Cons

Pros

  • Phase 1 is immediately actionable — No dependencies, just prompt changes to delegate_tool.py
  • Addresses real observed failures — Role-flipping and flake replies are common in current delegation
  • Zero cost — No additional API calls, no new dependencies, just better prompts
  • Research-backed — CAMEL's NeurIPS 2023 paper validated these techniques across thousands of multi-agent conversations
  • Foundation for #344 (Multi-Agent Architecture: Orchestration, Cooperation, Specialized Roles & Resilient Workflows): when multi-agent orchestration is built, these patterns are already in place

Cons / Risks

  • Prompt length increase — More system prompt text consumes context window. Should be minimal (~200 tokens).
  • Over-constraining — Too-rigid prompts might prevent creative problem-solving by sub-agents
  • Hard to measure — Delegation quality improvement is subjective and hard to benchmark without a test suite
  • Model-dependent — Different LLMs respond differently to prompt instructions. What works for Claude may not work for GPT.

Open Questions

  1. How to measure improvement? Need a set of delegation test cases with known-good outputs to compare before/after.
  2. Should prompt hardening be configurable? Some users might want looser sub-agents for creative tasks.
  3. How aggressive should anti-loop detection be? Killing a loop too early might abort legitimate retry logic.
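For the measurement question, one possible starting point is a small before/after harness. This sketch makes a strong simplifying assumption: a "known-good output" is approximated by substrings that must appear in the sub-agent's reply. `DelegationCase`, `pass_rate`, and the `run_delegate` callable are hypothetical and stand in for whatever actually invokes delegate_task.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class DelegationCase:
    goal: str
    required_substrings: tuple  # strings a good reply must contain


def pass_rate(
    cases: Sequence[DelegationCase],
    run_delegate: Callable[[str], str],
) -> float:
    """Fraction of cases whose reply contains every required substring."""
    if not cases:
        return 0.0
    passed = 0
    for case in cases:
        reply = run_delegate(case.goal)  # one delegation per case
        if all(s in reply for s in case.required_substrings):
            passed += 1
    return passed / len(cases)
```

Running the same cases once with the current prompt and once with the hardened prompt would give two comparable numbers, though substring matching will miss correct answers phrased differently.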
