Zeroshot is an open-source (MIT) multi-agent CLI orchestrator that runs autonomous coding workflows using a planner → implementer → validator pipeline with "blind validation." Unlike single-agent tools, Zeroshot spawns independent validator agents that verify code changes by examining the codebase directly — without ever seeing the worker's internal reasoning or output history. This produces significantly more reliable code than single-agent workflows.
Hermes already has skills for Claude Code and Codex as individual agent CLIs, but uses them as single-agent tools. A Zeroshot skill would add a multi-agent orchestration layer on top of those same CLIs, giving Hermes users immediate access to battle-tested multi-agent coding workflows without waiting for native multi-agent architecture (#344).
This is a concrete, available-today capability that complements the longer-term native multi-agent work planned in #344, #356, and #406. Use zeroshot now; build native later.
Two-Tier Conductor Classification — A cheap "junior conductor" (Sonnet-class) classifies tasks into a 2D matrix of complexity (TRIVIAL/SIMPLE/STANDARD/CRITICAL) × task type (INQUIRY/TASK/DEBUG). Uncertain cases escalate to a "senior conductor" (Opus-class). This routes to the appropriate template:
| Complexity | Agents | Validators | Use Case |
|---|---|---|---|
| TRIVIAL | 1 | 0 | Fix typo in README |
| SIMPLE | 2 | 1 (generic) | Add dark mode toggle |
| STANDARD | 4 | 2 (reqs + code) | Refactor auth system |
| CRITICAL | 7 | 5 (security, testing, etc.) | Implement payment flow |
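The complexity routing above amounts to a lookup from a complexity label to a cluster size. The sketch below is illustrative only: the `Route` type and `route_for` helper are hypothetical names, not part of Zeroshot, and only the agent and validator counts come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Cluster size for one complexity level (hypothetical type)."""
    agents: int
    validators: int


# Agent and validator counts copied from the complexity table.
ROUTES = {
    "TRIVIAL": Route(agents=1, validators=0),
    "SIMPLE": Route(agents=2, validators=1),
    "STANDARD": Route(agents=4, validators=2),
    "CRITICAL": Route(agents=7, validators=5),
}


def route_for(complexity: str) -> Route:
    """Look up the cluster size for a complexity label (case-insensitive)."""
    try:
        return ROUTES[complexity.upper()]
    except KeyError:
        raise ValueError(f"unknown complexity: {complexity!r}") from None
```

Keeping the mapping as data rather than branching logic mirrors the template-driven approach: adding a level means adding a row, not code.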
SQLite Event Ledger — Every message, agent output, and state transition is persisted in a per-cluster SQLite database (~/.zeroshot/<clusterId>.db). This enables crash recovery (zeroshot resume <id>), post-mortem analysis, and token usage tracking.
Message Bus — EventEmitter-based pub/sub over the ledger. Topics include ISSUE_OPENED, PLAN_READY, IMPLEMENTATION_READY, VALIDATION_RESULT. Agent triggers subscribe to specific topics with optional JS logic conditions.
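To make the ledger-plus-bus idea concrete, here is a minimal Python sketch of pub/sub layered over a SQLite table. It is not Zeroshot's implementation (which is Node.js and EventEmitter-based); the `LedgerBus` class, the table schema, and the `approved` payload field are assumptions made for illustration. Only the topic name comes from the description above.

```python
import json
import sqlite3
from collections import defaultdict


class LedgerBus:
    """Toy pub/sub bus that persists every message before delivering it."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, topic TEXT, payload TEXT)"
        )
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler, condition=None):
        """Register a handler; the optional condition filters payloads."""
        self.subscribers[topic].append((handler, condition))

    def publish(self, topic, payload):
        """Write the message to the ledger, then notify matching subscribers."""
        with self.db:  # commits the insert on success
            self.db.execute(
                "INSERT INTO messages (topic, payload) VALUES (?, ?)",
                (topic, json.dumps(payload)),
            )
        for handler, condition in self.subscribers[topic]:
            if condition is None or condition(payload):
                handler(payload)

    def replay(self, topic):
        """Return every persisted payload for a topic, oldest first."""
        rows = self.db.execute(
            "SELECT payload FROM messages WHERE topic = ? ORDER BY id", (topic,)
        )
        return [json.loads(payload) for (payload,) in rows]
```

Because every message is written before it is delivered, a restarted process could rebuild state by replaying the table, which is the property that makes a resume command possible.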
The Blind Validation Pattern:
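This is the key innovation. It works through context isolation, not code isolation: each validator's contextStrategy.sources config explicitly excludes worker output topics, so validators judge the codebase without seeing the worker's reasoning or output history.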
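One way to picture context isolation is as a filter over the message ledger. The sketch below assumes messages are dicts with a `topic` key and uses `WORKER_OUTPUT` and `WORKER_REASONING` as hypothetical topic names; Zeroshot's actual topic names and config format are not shown in this issue.

```python
# Hypothetical names for topics that carry the worker's own output.
WORKER_OUTPUT_TOPICS = {"WORKER_OUTPUT", "WORKER_REASONING"}


def build_validator_context(messages, sources):
    """Return only messages from allowed source topics.

    Worker output topics are dropped even if a config lists them by
    mistake, so a validator never sees the worker's reasoning.
    """
    allowed = set(sources) - WORKER_OUTPUT_TOPICS
    return [m for m in messages if m["topic"] in allowed]
```

The validator still has full access to the code on disk; only its conversational context is restricted.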
Safety hooks block user-interactive prompts and dangerous git commands.
Isolation Modes:
None (default) — Direct filesystem, manual review
Git Worktree (--worktree) — Separate branch/directory, clean PR workflow
Docker (--docker) — Full container isolation for risky or parallel tasks
Full Automation: zeroshot run 123 --ship = worktree isolation → implement → validate → create PR → auto-merge
Key Design Decisions
CLI shelling over API calls — Zeroshot delegates to existing provider CLIs rather than making API calls directly. This means auth, model selection, and tool execution are handled by each provider's own CLI. Trade-off: simpler integration but heavier process overhead.
Template-driven over code-driven — Workflow topologies are JSON configs, not code. This makes it easy to create custom workflows but limits programmatic flexibility.
Two-tier conductor — Avoids paying for expensive models on trivial tasks. The cheap classifier handles 80%+ of routing decisions.
Per-cluster SQLite — Each cluster gets its own database file, avoiding cross-contamination and enabling clean resume/cleanup.
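The two-tier conductor decision can be sketched as a cheap classifier with a fallback. This is a simplified illustration, not Zeroshot's code: it assumes the junior classifier signals uncertainty by returning `None`, and both classifier callables are stand-ins for model calls.

```python
from typing import Callable, Optional, Tuple

# (complexity, task_type), e.g. ("SIMPLE", "TASK")
Label = Tuple[str, str]


def classify(
    task: str,
    junior: Callable[[str], Optional[Label]],
    senior: Callable[[str], Label],
) -> Label:
    """Ask the cheap classifier first; escalate only when it is unsure."""
    label = junior(task)
    if label is None:  # assumed convention for an uncertain result
        label = senior(task)
    return label
```

The expensive model is only invoked on the uncertain remainder, which is where the cost saving comes from.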
Current State in Hermes Agent
Existing capabilities:
claude-code skill — single-agent delegation to Claude Code CLI
codex skill — single-agent delegation to Codex CLI
Relationship to existing issues: This skill is complementary, not duplicative. Existing issues propose building these patterns natively into Hermes. The Zeroshot skill provides the same capabilities TODAY by wrapping an external tool, serving as a bridge until native multi-agent support lands.
Implementation Plan
Skill vs. Tool Classification
This should be a skill (not a tool) because:
Zeroshot is an external CLI invoked via terminal — no custom Python integration needed
No API key management by Hermes — zeroshot and its providers handle their own auth
All interaction is through shell commands and text output
Fits the same pattern as the existing claude-code and codex skills
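Because the skill only shells out, its core is a small command builder. The sketch below uses only the subcommand and flags named in this issue (`zeroshot run`, `--worktree`, `--docker`, `--ship`); the function name and its parameters are hypothetical, and the flag ordering simply follows the `zeroshot run 123 --ship` example.

```python
from typing import List


def build_zeroshot_command(
    target: str,
    worktree: bool = False,
    docker: bool = False,
    ship: bool = False,
) -> List[str]:
    """Build an argv list for `zeroshot run`.

    `target` is an issue number, free text, or a file path. Returning a
    list (not a shell string) avoids quoting problems with free text.
    """
    argv = ["zeroshot", "run", target]
    if worktree:
        argv.append("--worktree")
    if docker:
        argv.append("--docker")
    if ship:
        argv.append("--ship")
    return argv
```

A caller would pass the result to `subprocess.run(argv)` and read the text output, which is all the integration the skill needs.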
Placement: Skills Hub (not bundled). Requires Node 18+, npm, and at least one provider CLI — too specialized for the default install.
Pros & Cons
Pros
Battle-tested blind validation — 375 commits, v5.4.0, addressing the "context degradation" problem that single-agent tools suffer from. Independent validators produce more reliable results.
Complexity routing saves costs — Trivial tasks get 1 agent, critical tasks get 7. No wasted compute on simple fixes.
Full automation pipeline — zeroshot run 123 --ship takes an issue from description to merged PR with verification gates. Pairs naturally with Hermes' schedule_cronjob for batch processing.
MIT license — No licensing concerns for any integration approach.
Complements existing skills — Uses the same provider CLIs (Claude Code, Codex) that Hermes already has skills for, but adds orchestration on top.
Cons / Risks
Heavy dependency chain — 297 npm packages, including native modules (better-sqlite3, node-pty). Installation can be fragile on some systems.
Provider CLI requirement — User must have at least one provider CLI installed AND authenticated separately. Not a "just works" experience.
Node.js dependency — Hermes is Python-based. Adding a Node.js tool to the stack increases system requirements. (Node 18+ is needed.)
4 moderate npm vulnerabilities — Flagged during install. Not critical but worth monitoring.
Open Questions
Should we also create a custom zeroshot cluster template optimized for Hermes-style workflows (e.g., skill-aware validators that check Hermes conventions)?
Should the skill include guidance for using zeroshot with Hermes' schedule_cronjob to auto-process batches of GitHub issues?
Research Findings
How Zeroshot Works
Architecture (63K LOC, Node.js, v5.4.0, 1,274 stars):
Zeroshot is a message-driven coordination layer built on four primitives:
Template-Driven Agent Topologies: JSON workflow definitions specify agent roles, triggers, context strategies, and hooks. Built-in templates: single-worker, worker-validator, full-workflow, heavy-validation. Users can create custom cluster templates.
Two-Tier Conductor Classification: A cheap "junior conductor" (Sonnet-class) classifies tasks into a 2D matrix of complexity (TRIVIAL/SIMPLE/STANDARD/CRITICAL) × task type (INQUIRY/TASK/DEBUG). Uncertain cases escalate to a "senior conductor" (Opus-class). This routes to the appropriate template.
SQLite Event Ledger: Every message, agent output, and state transition is persisted in a per-cluster SQLite database (~/.zeroshot/<clusterId>.db). This enables crash recovery (zeroshot resume <id>), post-mortem analysis, and token usage tracking.
Message Bus: EventEmitter-based pub/sub over the ledger. Topics include ISSUE_OPENED, PLAN_READY, IMPLEMENTATION_READY, VALIDATION_RESULT. Agent triggers subscribe to specific topics with optional JS logic conditions.
The Blind Validation Pattern:
This is the key innovation. It works through context isolation, not code isolation:
contextStrategy.sources config explicitly EXCLUDES worker output topics
Provider Abstraction:
Zeroshot shells out to existing CLI tools. Each provider implements isAvailable(), buildCommand(), and parseEvent():
claude --print --output-format json --dangerously-skip-permissions
codex exec --json --dangerously-bypass-approvals-and-sandbox
gemini -p <context> --output-format stream-json --yolo
Safety hooks block user-interactive prompts and dangerous git commands.
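The three-method provider contract can be sketched in Python for readers on the Hermes side. This mirrors the method names above in snake_case and reuses the `gemini` flags exactly as listed; the class itself and the assumption that the CLI emits one JSON object per line are illustrative, not taken from Zeroshot's source.

```python
import json
import shutil
from typing import List, Optional


class GeminiProvider:
    """Illustrative provider wrapper; Zeroshot's real providers are Node.js."""

    binary = "gemini"

    def is_available(self) -> bool:
        """True if the provider CLI is on PATH."""
        return shutil.which(self.binary) is not None

    def build_command(self, context: str) -> List[str]:
        """Build the argv using the flags listed above."""
        return [self.binary, "-p", context, "--output-format", "stream-json", "--yolo"]

    def parse_event(self, line: str) -> Optional[dict]:
        """Parse one output line, assuming one JSON object per line."""
        line = line.strip()
        if not line:
            return None
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            return None  # ignore non-JSON noise such as progress text
        return event if isinstance(event, dict) else None
```

Keeping auth and model selection inside the provider CLI means a wrapper like this never touches API keys, which is the trade-off noted under the design decisions.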
Isolation Modes:
None (default): Direct filesystem, manual review
Git Worktree (--worktree): Separate branch/directory, clean PR workflow
Docker (--docker): Full container isolation for risky or parallel tasks
Full Automation: zeroshot run 123 --ship = worktree isolation → implement → validate → create PR → auto-merge
Key Design Decisions
CLI shelling over API calls — Zeroshot delegates to existing provider CLIs rather than making API calls directly. This means auth, model selection, and tool execution are handled by each provider's own CLI. Trade-off: simpler integration but heavier process overhead.
Template-driven over code-driven — Workflow topologies are JSON configs, not code. This makes it easy to create custom workflows but limits programmatic flexibility.
Two-tier conductor — Avoids paying for expensive models on trivial tasks. The cheap classifier handles 80%+ of routing decisions.
Per-cluster SQLite — Each cluster gets its own database file, avoiding cross-contamination and enabling clean resume/cleanup.
Current State in Hermes Agent
Existing capabilities:
claude-code skill: single-agent delegation to Claude Code CLI
codex skill: single-agent delegation to Codex CLI
hermes-agent-spawning skill: spawn Hermes sub-processes
delegate_task tool: spawn sub-agents within Hermes
What's missing (the gap):
Related open issues:
Relationship to existing issues: This skill is complementary, not duplicative. Existing issues propose building these patterns natively into Hermes. The Zeroshot skill provides the same capabilities TODAY by wrapping an external tool, serving as a bridge until native multi-agent support lands.
Implementation Plan
Skill vs. Tool Classification
This should be a skill (not a tool) because:
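Zeroshot is an external CLI invoked via terminal: no custom Python integration needed
No API key management by Hermes: zeroshot and its providers handle their own auth
All interaction is through shell commands and text output
Fits the same pattern as the existing claude-code and codex skills
Placement: Skills Hub (not bundled). Requires Node 18+, npm, and at least one provider CLI, which is too specialized for the default install.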
Category: autonomous-ai-agents (alongside claude-code, codex, hermes-agent-spawning)
What We'd Need
Phased Rollout
Phase 1: Basic Skill
zeroshot run <issue>, zeroshot run "text", zeroshot run file.md
zeroshot status, zeroshot logs, zeroshot list
zeroshot resume, zeroshot stop, zeroshot kill
Phase 2: Advanced Workflows
--pr and --ship workflows
schedule_cronjob for batch issue processing
Phase 3: Native Pattern Adoption
delegate_task (Feature: Acceptance Criteria & Independent Judge for Sub-agent Delegation (inspired by OpenPlanter) #356)
Pros & Cons
Pros
zeroshot resume after failures. Long-running workflows survive interruptions.
Full automation pipeline: zeroshot run 123 --ship takes an issue from description to merged PR with verification gates. Pairs naturally with Hermes' schedule_cronjob for batch processing.
Cons / Risks
Heavy dependency chain: 297 npm packages, including native modules (better-sqlite3, node-pty). Installation can be fragile on some systems.
orchestrator.js), 5,322-line CLI file. Code quality may limit community contributions.
Open Questions
Should the skill include guidance for using zeroshot with Hermes' schedule_cronjob to auto-process batches of GitHub issues?
References
claude-code, codex, hermes-agent-spawning