Feature: Symphony-Style Autonomous Issue Resolution — Poll-Dispatch-Resolve-Land Workflow (inspired by OpenAI Symphony) #404

@teknium1

Overview

OpenAI Symphony (Apache-2.0, released March 4, 2026) is a daemon service that turns project management into autonomous coding runs. It polls an issue tracker (Linear) for work items, spawns an isolated coding agent per issue, manages the full lifecycle (Todo → In Progress → Human Review → Merging → Done), and provides "proof of work" before landing code. Teams manage work rather than supervising agents.

This issue proposes adapting Symphony's core orchestration pattern into a Hermes Agent skill: autonomous issue resolution driven by GitHub Issues (with optional Linear support). Hermes would poll a repo's issue tracker on a schedule, claim eligible issues, create isolated workspaces, work through each issue autonomously using a per-repo WORKFLOW.md configuration, and hand off with comprehensive proof of work (CI green, PR reviewed, workpad updated).

This proposal is valuable independently of #344 (Multi-Agent Architecture). Symphony is single-agent-per-issue: each issue gets one agent working in isolation. The orchestration layer (polling, dispatching, lifecycle management) is orthogonal to multi-agent patterns. However, Phase 3 and later could leverage #344's workflow DAG engine for complex issues that need decomposition.


Research Findings

How Symphony Works

Symphony uses a six-layer architecture designed for portability:

  1. Policy Layer — A repo-owned WORKFLOW.md file (YAML frontmatter for config, Markdown body for the agent prompt template). This is the primary team interface: version-controlled, dynamically reloaded, defines everything from polling cadence to agent behavior.

  2. Configuration Layer — Typed getters for tracker (kind, project_slug, API key), polling (interval_ms), workspace (root path), hooks (shell scripts for lifecycle events), agent (max_concurrent, max_turns, retry backoff), and codex (command, sandbox, approval policies).

  3. Coordination Layer (Orchestrator) — A GenServer (Elixir/OTP) that owns the poll loop and all scheduling state. Each issue transitions through: Unclaimed → Claimed → Running → RetryQueued → Released. The orchestrator handles:

    • Priority-based dispatch (priority ascending, oldest first, with per-state concurrency limits)
    • Blocker detection (Todo issues with non-terminal blockers are held)
    • Stall detection (kills agents inactive for >5min)
    • Active run reconciliation (terminates agents whose issues moved to terminal states)
    • Exponential backoff retries: delay = min(10000 * 2^(attempt-1), max_backoff)
    • Continuation retries: 1s delay after clean exit to re-check if issue is still active
  4. Execution Layer — Per-issue workspace directories with lifecycle hooks (after_create, before_run, after_run, before_remove). Hooks are shell scripts run in the workspace context with configurable timeout. Safety invariants enforce CWD validation and root containment.

  5. Integration Layer — Issue tracker adapters. Currently Linear only, but the spec is designed for pluggable tracker kinds. Issues are normalized into a stable model (id, identifier, title, description, priority, state, branch_name, labels, blocked_by).

  6. Observability Layer — Structured logs + optional HTTP dashboard (/api/v1/state, /api/v1/refresh) for monitoring concurrent agent runs, token usage, and rate limits.
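The orchestrator rules listed under the Coordination Layer can be sketched in a few pure functions. This is a minimal illustration, not Symphony's Elixir implementation: the `Candidate` type, the function names, and the single global concurrency limit (instead of per-state limits) are assumptions made for brevity. The backoff base of 10000 ms and the 5-minute stall threshold come from the description above.

```python
from dataclasses import dataclass

BASE_DELAY_MS = 10_000  # base term of the backoff formula quoted above


@dataclass(frozen=True)
class Candidate:
    """Hypothetical minimal view of an issue that is eligible for dispatch."""
    identifier: str
    priority: int      # lower value is dispatched first (assumed convention)
    created_at: float  # Unix timestamp; older issues win ties


def retry_delay_ms(attempt: int, max_backoff_ms: int) -> int:
    """delay = min(10000 * 2^(attempt-1), max_backoff), attempts counted from 1."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(BASE_DELAY_MS * 2 ** (attempt - 1), max_backoff_ms)


def dispatch_order(candidates, running_count: int, max_concurrent: int):
    """Pick the issues to start now: priority ascending, then oldest first."""
    free_slots = max(0, max_concurrent - running_count)
    ordered = sorted(candidates, key=lambda c: (c.priority, c.created_at))
    return ordered[:free_slots]


def is_stalled(last_activity_ts: float, now_ts: float,
               stall_timeout_s: float = 300.0) -> bool:
    """True when an agent has been inactive for longer than the timeout."""
    return now_ts - last_activity_ts > stall_timeout_s
```

With a `max_backoff` of 300000 ms, attempts 1, 2, and 3 wait 10 s, 20 s, and 40 s, and the delay is capped from attempt 6 onward.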

The WORKFLOW.md Contract

This is the most transferable pattern: a single file that defines both the configuration and the agent prompt:

---
tracker:
  kind: linear
  project_slug: "my-project"
  active_states: [Todo, In Progress, Merging, Rework]
  terminal_states: [Closed, Cancelled, Done]
polling:
  interval_ms: 30000
workspace:
  root: ~/code/workspaces
hooks:
  after_create: |
    git clone --depth 1 git@github.com:org/repo.git .
agent:
  max_concurrent_agents: 10
  max_turns: 20
---

You are working on ticket {{ issue.identifier }}
Title: {{ issue.title }}
Description: {{ issue.description }}

The Markdown body is a Liquid-compatible template with issue.* and attempt variables. Unknown variables/filters fail rendering (strict mode). Empty prompt body falls back to a minimal default. The file is hot-reloaded on change.
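To make the contract concrete, the sketch below splits a WORKFLOW.md file into its front matter and prompt body, then renders the body in a strict mode that rejects unknown variables. It is a deliberately small stand-in, not a Liquid implementation: it only supports plain dotted variables such as `{{ issue.title }}`, treats any filter or other expression as an error, and leaves the front matter as raw text (a real loader would hand it to a YAML parser). The function names are hypothetical.

```python
import re

_TAG = re.compile(r"\{\{(.*?)\}\}")
_NAME = re.compile(r"[A-Za-z_]\w*(\.[A-Za-z_]\w*)*")


def split_workflow(text: str):
    """Split WORKFLOW.md text into (front_matter_text, prompt_body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text.strip()  # no front matter: the whole file is the prompt
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:]).strip()
    raise ValueError("front matter is not closed by a second '---' line")


def render_strict(template: str, context: dict) -> str:
    """Substitute {{ a.b }} variables; fail on anything unknown or unsupported."""
    def lookup(match):
        name = match.group(1).strip()
        if not _NAME.fullmatch(name):
            # Filters and other Liquid syntax are outside this sketch.
            raise ValueError(f"unsupported template expression: {name!r}")
        value = context
        for part in name.split("."):
            if not isinstance(value, dict) or part not in value:
                raise KeyError(name)
            value = value[part]
        return str(value)
    return _TAG.sub(lookup, template)
```

Failing loudly on an unknown variable mirrors the strict-mode behavior described above: a typo in the template surfaces at render time instead of silently producing an incomplete prompt.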

The Workpad Pattern

Symphony's WORKFLOW.md defines a structured "Codex Workpad" — a single persistent comment on the issue tracker that serves as the agent's scratchpad:

## Codex Workpad

```text
devbox-01:/home/dev-user/code/workspaces/MT-32@7bdde33bc
```

### Plan
- [ ] 1. Parent task
  - [ ] 1.1 Child task

### Acceptance Criteria
- [ ] Criterion 1

### Validation
- [ ] targeted tests: `make test`

### Notes
- short progress note with timestamp

### Confusions
- unclear aspects during execution

This comment is the single source of truth for progress. All updates go here — no separate completion comments. The agent updates it after every meaningful milestone.
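A Hermes skill would need to render this comment and locate it again on later runs so that it is updated in place rather than duplicated. The sketch below shows one way to do both. The function names, the `(text, done)` tuple shape, and the idea of identifying the workpad by its heading are assumptions for illustration; the comment dictionaries are assumed to carry a `body` field.

```python
WORKPAD_HEADING = "## Codex Workpad"
FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block


def _checklist(items):
    """Render (text, done) pairs as Markdown task-list lines."""
    return [f"- [{'x' if done else ' '}] {text}" for text, done in items]


def render_workpad(stamp, plan, acceptance, validation, notes=(), confusions=()):
    """Build the workpad comment body in the section order shown above."""
    lines = [WORKPAD_HEADING, "", FENCE + "text", stamp, FENCE, ""]
    lines += ["### Plan", *_checklist(plan), ""]
    lines += ["### Acceptance Criteria", *_checklist(acceptance), ""]
    lines += ["### Validation", *_checklist(validation), ""]
    lines += ["### Notes", *[f"- {note}" for note in notes], ""]
    lines += ["### Confusions", *[f"- {item}" for item in confusions]]
    return "\n".join(lines).rstrip() + "\n"


def find_workpad(comments):
    """Return the first comment whose body starts with the workpad heading."""
    for comment in comments:
        if comment.get("body", "").lstrip().startswith(WORKPAD_HEADING):
            return comment
    return None
```

If `find_workpad` returns a comment, the agent edits that comment; otherwise it creates one. Either way the issue ends up with exactly one workpad.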

Skill Composition Pattern

Symphony's .codex/skills/ directory implements a composable workflow:

  • commit: Conventional commits with Co-authored-by trailers, session-context-driven messages
  • push: Push + PR creation, delegates to "pull" on non-fast-forward
  • pull: Merge origin/main with zdiff3 conflict resolution, git rerere for conflict memory
  • land: Shepherd PR through review → CI → squash-merge, with a 621-line async Python watcher (land_watch.py) that monitors reviews, CI checks, and head SHA changes in parallel
  • linear: GraphQL operations reference guide
  • debug: Log correlation and failure triage

The land skill is particularly sophisticated — it implements a full review feedback protocol with per-comment handling (accept, clarify, or pushback with [codex] prefix), async parallel monitoring of reviews + CI + head changes, and exponential backoff with jitter for rate limiting.

Key Design Decisions

  1. Scheduler, not executor — Symphony reads trackers and runs agents but doesn't write to tickets directly. The agent does ticket writes via tools. Clean separation of concerns.
  2. In-repo policy — WORKFLOW.md is version-controlled, so workflow changes are PRs, not config changes.
  3. Workspace persistence — Workspaces survive across runs for the same issue, enabling continuation.
  4. Status-driven routing — Agent behavior is determined by issue state (FSM), not arbitrary branching.
  5. Fail immediately on user input — If an agent requests human input, the run fails. Symphony is for full automation.
  6. Continuation turns — After a successful turn, the agent checks if the issue is still active and continues on the same thread (up to max_turns), avoiding cold-start overhead.
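Decisions 4 and 6 combine into a simple loop: before each turn, re-read the issue state and continue only while it is still active, up to `max_turns`. The sketch below assumes the active and terminal state sets from the WORKFLOW.md example earlier; `fetch_state` and `run_turn` are hypothetical callables standing in for the tracker read and the agent turn.

```python
ACTIVE_STATES = {"Todo", "In Progress", "Merging", "Rework"}
TERMINAL_STATES = {"Closed", "Cancelled", "Done"}


def run_continuation_turns(fetch_state, run_turn, max_turns: int = 20) -> int:
    """Run agent turns while the issue stays in an active state.

    fetch_state() returns the current issue state as a string.
    run_turn(state, turn_number) performs one agent turn on the same thread.
    Returns the number of turns that were executed.
    """
    turns = 0
    while turns < max_turns:
        state = fetch_state()
        if state not in ACTIVE_STATES:
            break  # terminal or handed off (for example, Human Review)
        run_turn(state, turns + 1)
        turns += 1
    return turns
```

Because Human Review is not in the active set, moving an issue there stops the loop without any extra signalling. The issue state itself is the control channel.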

Current State in Hermes Agent

What we already have that's relevant:

  • Cronjob system (cron/scheduler.py, schedule_cronjob tool) — Can poll on a schedule. Already supports recurring intervals, cron expressions, and delivery targets. This IS our polling mechanism.
  • Sub-agent delegation (tools/delegate_tool.py) — Can spawn isolated child agents with restricted toolsets and separate terminal sessions. Batch mode supports up to 3 parallel tasks.
  • Hermes agent spawning (skill: hermes-agent-spawning) — Can spawn full Hermes instances as subprocesses for independent long-running tasks.
  • GitHub workflow skills (github-issues, github-pr-workflow, github-code-review) — Already know how to create/manage issues, PRs, and code reviews via gh CLI.
  • Skills system — Our existing skills are already structured similarly to Symphony's .codex/skills/.
  • Memory/session system — Hermes has persistent memory across sessions, useful for tracking orchestration state.

What's missing (the gap):

  1. No automated issue claiming/dispatch — Hermes doesn't monitor issue trackers and auto-assign itself work. Everything is user-initiated.
  2. No WORKFLOW.md convention — No per-repo configuration that tells Hermes how to behave on that repo's issues.
  3. No workspace lifecycle hooks — No structured before_run/after_run/after_create hook system.
  4. No workpad pattern — No convention for maintaining a persistent progress comment on issues.
  5. No status-driven routing — No FSM for routing behavior based on issue state.
  6. No proof-of-work protocol — No structured verification before handing off (CI check, review sweep, acceptance criteria validation).

Existing issues with partial overlap:


Implementation Plan

Skill vs. Tool Classification

This should be a skill because:

  • The capability can be expressed entirely as instructions + shell commands + existing tools (gh CLI for GitHub, git for workspaces, Hermes cronjobs for polling)
  • It wraps external CLIs (gh, git) that the agent calls via terminal
  • No custom Python integration or API key management is needed beyond what gh auth already provides
  • The orchestration logic is agent behavior (what to do when), not deterministic processing

Bundled vs. Skills Hub: This is borderline. Autonomous issue resolution is a power-user workflow, not something every user needs on day one. Even so, the recommendation is to bundle it: it is broadly applicable (anyone with a GitHub repo and issues can use it), and it represents a major capability evolution (reactive → proactive agent).

What We'd Need

  1. Symphony orchestration skill — The main skill file teaching Hermes the poll-dispatch-resolve-land workflow
  2. WORKFLOW.md template — A reference template users can copy into their repos
  3. Helper scripts (optional) — Workspace lifecycle scripts (clone, cleanup, etc.)
  4. Workpad template — Markdown template for the persistent progress comment

Phased Rollout

Phase 1: Single-Repo Issue Resolution (MVP)

  • Skill that teaches Hermes to work on a single GitHub Issue end-to-end:
    • Read issue context (title, description, labels, comments)
    • Create isolated workspace (git worktree or fresh clone)
    • Execute the WORKFLOW.md instructions
    • Maintain a workpad comment on the issue for progress tracking
    • Run the git workflow (branch, commit, push, PR)
    • Verify CI + address review feedback before moving to Human Review
    • Status-driven routing: Todo → In Progress → Human Review → Done
  • This phase is fully manual trigger (user tells Hermes "work on issue #N")
  • Deliverables: symphony skill + WORKFLOW.md template
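The "verify before moving to Human Review" step in Phase 1 could be expressed as an explicit gate. The sketch below is one possible shape, not an existing Hermes or Symphony API: the `ProofOfWork` fields are hypothetical and would be filled in from CI results, the PR review state, and the workpad's acceptance criteria.

```python
from dataclasses import dataclass, field


@dataclass
class ProofOfWork:
    """Hypothetical summary of the evidence gathered before hand-off."""
    ci_passed: bool
    unresolved_review_comments: int
    acceptance_criteria: dict = field(default_factory=dict)  # criterion -> met?

    def blockers(self):
        """List every reason the issue cannot be handed off yet."""
        found = []
        if not self.ci_passed:
            found.append("CI is not green")
        if self.unresolved_review_comments > 0:
            found.append(
                f"{self.unresolved_review_comments} review comment(s) unresolved"
            )
        found += [
            f"criterion not met: {name}"
            for name, met in self.acceptance_criteria.items()
            if not met
        ]
        return found


def next_state(proof: ProofOfWork) -> str:
    """Move to Human Review only when nothing blocks the hand-off."""
    return "Human Review" if not proof.blockers() else "In Progress"
```

Returning the list of blockers, rather than a bare boolean, gives the agent something concrete to write into the workpad's Notes section.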

Phase 2: Automated Polling and Dispatch

  • Add cronjob-based polling: Hermes periodically checks for new issues matching configurable criteria (labels, assignee, state)
  • Claim mechanism: Hermes adds a label/comment to claim an issue before working on it
  • Concurrency control: respect max_concurrent_agents from WORKFLOW.md
  • Retry logic: exponential backoff on failures, continuation on clean exits
  • Stall detection: monitor workspace activity, kill stalled runs
  • Deliverables: Polling cronjob setup script + claim/release protocol
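A label-based claim protocol for Phase 2 might look like the following. The label names reuse the `symphony:todo` and `symphony:in-progress` examples from the Open Questions below, and using the in-progress label as the claim marker is an assumption. The functions only build `gh` argument lists and parse `gh issue list --json` output, so the caller decides how to run them (for example with `subprocess.run`).

```python
import json

TODO_LABEL = "symphony:todo"          # example label name from this issue
CLAIM_LABEL = "symphony:in-progress"  # assumed to double as the claim marker


def list_cmd(repo: str, limit: int = 50):
    """Arguments for listing open issues that are waiting for an agent."""
    return ["gh", "issue", "list", "--repo", repo, "--state", "open",
            "--label", TODO_LABEL, "--limit", str(limit),
            "--json", "number,title,labels,assignees,createdAt"]


def claim_cmd(repo: str, number: int):
    """Arguments for claiming an issue: swap labels and self-assign."""
    return ["gh", "issue", "edit", str(number), "--repo", repo,
            "--add-label", CLAIM_LABEL, "--remove-label", TODO_LABEL,
            "--add-assignee", "@me"]


def select_unclaimed(list_output: str, running: int, max_concurrent: int):
    """Pick issue numbers to claim, oldest first, within the concurrency limit."""
    issues = json.loads(list_output)
    free_slots = max(0, max_concurrent - running)
    eligible = [
        issue for issue in issues
        if CLAIM_LABEL not in {label["name"] for label in issue.get("labels", [])}
        and not issue.get("assignees")
    ]
    eligible.sort(key=lambda issue: issue["createdAt"])  # ISO 8601 sorts as text
    return [issue["number"] for issue in eligible[:free_slots]]
```

Adding a label is not an atomic test-and-set, so two pollers could still claim the same issue. A single polling instance per repo, or a re-check of the assignee after claiming, would be needed to close that gap.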

Phase 3: Full Lifecycle Management

  • Workspace lifecycle hooks (after_create, before_run, after_run, before_remove)
  • Multi-issue concurrent processing via hermes-agent-spawning or delegate_task
  • Land skill: sophisticated PR merge workflow with review monitoring
  • WORKFLOW.md hot-reload: detect changes and apply to future runs
  • GitHub Issue state reconciliation (terminate runs for closed/done issues)
  • Observability: structured logging, optional status dashboard
  • Linear adapter (optional, for teams using Linear)
  • Leverages the workflow DAG engine from #344 (Multi-Agent Architecture), when available, for complex multi-step issues
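The lifecycle hooks in Phase 3 need the same safety properties Symphony describes: hooks run inside the issue's workspace, under a timeout, and the workspace must stay inside the configured root. A minimal sketch follows; the function names and the 60-second default are assumptions, and hooks are run through `sh -c`, so a POSIX shell is assumed.

```python
import subprocess
from pathlib import Path


def workspace_for(root: Path, identifier: str) -> Path:
    """Resolve the per-issue workspace and enforce root containment."""
    root = root.resolve()
    candidate = (root / identifier).resolve()
    if candidate == root or root not in candidate.parents:
        raise ValueError(f"workspace escapes root: {identifier!r}")
    return candidate


def run_hook(script: str, workspace: Path, timeout_s: float = 60.0) -> int:
    """Run a hook script with the workspace as CWD and return its exit code.

    subprocess.TimeoutExpired is raised if the hook exceeds timeout_s.
    """
    workspace.mkdir(parents=True, exist_ok=True)
    completed = subprocess.run(
        ["sh", "-c", script],
        cwd=workspace,
        timeout=timeout_s,
        capture_output=True,
        text=True,
    )
    return completed.returncode
```

Resolving the path before comparing it to the root means an identifier such as `../other` is rejected instead of silently creating a directory outside the workspace root.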

Pros & Cons

Pros

  • Paradigm shift — Moves Hermes from reactive (user asks) to proactive (Hermes finds and does work). This is the natural evolution for agent systems.
  • Achievable incrementally — Phase 1 alone (single-issue resolution skill) delivers immediate value with existing infrastructure. Each phase is independently useful.
  • Uses existing infrastructure — Cronjobs for polling, delegate_task for isolation, gh CLI for GitHub, git for workspaces. No new tools needed.
  • Apache-2.0 source — Symphony is fully open, spec is language-agnostic. We can adapt patterns freely.
  • Strong patterns to borrow — WORKFLOW.md convention, workpad pattern, status-driven FSM, proof-of-work protocol, and skill composition are all well-designed and battle-tested at OpenAI.
  • Harness engineering alignment — The broader "harness engineering" philosophy (making repos agent-ready with AGENTS.md, structured docs, linting for agents) is the future of development.
  • Differentiator — Few agent systems offer structured autonomous issue resolution. This would be a major capability differentiator.

Cons / Risks

  • Trust boundary — Autonomous code generation + PR creation + merge is high-stakes. Misconfigured workflows could land broken code. The "Human Review" gate is critical and must be non-bypassable by default.
  • Cost — Each issue resolution could burn significant tokens (20+ turns per issue, multiple retries). At scale (10 concurrent agents), costs multiply fast.
  • Scope creep risk — Symphony's WORKFLOW.md is 327 lines of detailed instructions. The skill could become unwieldy. Need to keep Phase 1 focused.
  • GitHub API rate limits — Polling + issue reads + PR creation + CI checks + review monitoring = lots of API calls. Need to respect rate limits.
  • Linear dependency — Symphony is designed around Linear. Adapting for GitHub Issues requires mapping concepts (Linear states → GitHub labels/milestones, Linear project → GitHub repo).
  • Workspace disk usage — Per-issue workspaces accumulate. Need cleanup logic.
  • Overlap with existing skills — The symphony skill would need to coordinate with github-issues, github-pr-workflow, and github-code-review skills rather than duplicate them.

Open Questions

  1. GitHub Issues mapping — How to represent Symphony's issue states (Todo, In Progress, Human Review, Merging, Done) in GitHub? Labels? Milestones? Project board columns? Labels are simplest (symphony:todo, symphony:in-progress, etc.) but project boards are more natural.
  2. Claim mechanism — How does Hermes "claim" an issue to prevent duplicate dispatch? Assignee field? A claimed-by-hermes label? A comment?
  3. WORKFLOW.md location — Should it be in the repo root (like Symphony) or in .hermes/WORKFLOW.md? Root is more visible; .hermes/ is cleaner.
  4. Scope of Phase 1 — Should Phase 1 support both new feature issues and bug fix issues, or focus on one type first?
  5. Safety defaults — Should the skill default to Human Review gate (never auto-merge) or allow configurable auto-merge for trivial issues?
  6. Integration with existing skills — Should the symphony skill import/reference existing Hermes skills (github-pr-workflow, etc.) or be self-contained?
  7. Multi-repo support — Should one Hermes instance be able to orchestrate across multiple repos (each with its own WORKFLOW.md)?
