A workshop for forging coding-agent work into shipped code.
Smithy is an opinionated agent-native workflow harness. It polls Linear for issues, hands them to coding agents (Claude Code, Codex, or both), runs an adversarial cross-model review pass before any PR reaches a human, and writes back to Linear with its own service-account identity.
The harness itself is dumb. The intelligence lives in the agents that work on each ticket. Smithy is the workshop. The agents are the smiths. The tickets are the work.
| version | shape | state |
|---|---|---|
| v1 | Single Rust binary, launchd daemon, polls Linear, dispatches Claude or Codex per issue. v1 source remains in the repo root for reference. | Daemon disabled as of 2026-05-07 in preparation for the v2 cut. |
| v2 | Fork of OpenAI Symphony (Elixir, Apache-2.0). Inherits Symphony's runtime polish (Phoenix LiveView dashboard, supervised polling, Codex app-server integration). Layers on dual-runtime dispatch, cross-model adversarial review, Linear OAuth identity, label-gated autonomous merge, model-summarized run logs, cost rollups, max-retry circuit breaker. | In progress. Full architecture in v2/SPEC.md. |
Smithy v2 takes a Linear ticket from Ready for Dev and walks it through a state machine until it's either merged or back in the operator's lap with a clear reason.
```
Ready for Dev
│
▼
spec quality gate ──FLAG──▶ Backlog (needs-spec)
│ PROCEED
▼
runtime selection by label (codex / claude-code / default)
│
▼
In Progress ──▶ worker forges code, opens PR
│
▼
Adversarial Review ──▶ cross-model reviewer reads the diff
│
├─ FAIL ──▶ back to Todo with findings
├─ BLOCKED ──▶ harness-blocked label, into In Review
└─ PASS ──▶ In Review (or auto-merge if label set) ──▶ Done
```
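For readers who think in code, the flow above can be read as a small transition table. The sketch below is illustrative Rust, not Smithy's source (v2 itself is Elixir). The `State` variants mirror the Linear states in the diagram; the `Event` names and the `next_state` function are hypothetical, and treating an auto-merge PASS as a direct move to Done is an assumption about how the label behaves.

```rust
/// Linear states named in the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    ReadyForDev,
    Backlog,
    InProgress,
    AdversarialReview,
    Todo,
    InReview,
    Done,
}

/// Things that can happen to a ticket. Hypothetical names for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    SpecFlagged,
    SpecProceed,
    PrOpened,
    ReviewFail,
    ReviewBlocked,
    ReviewPass { auto_merge: bool },
    Merged,
}

/// Returns the next state, or `None` when the diagram has no such arrow.
fn next_state(state: State, event: Event) -> Option<State> {
    use Event::*;
    use State::*;
    match (state, event) {
        (ReadyForDev, SpecFlagged) => Some(Backlog),
        (ReadyForDev, SpecProceed) => Some(InProgress),
        (InProgress, PrOpened) => Some(AdversarialReview),
        (AdversarialReview, ReviewFail) => Some(Todo),
        (AdversarialReview, ReviewBlocked) => Some(InReview),
        // Assumption: the auto-merge label skips the human gate entirely.
        (AdversarialReview, ReviewPass { auto_merge: true }) => Some(Done),
        (AdversarialReview, ReviewPass { auto_merge: false }) => Some(InReview),
        (InReview, Merged) => Some(Done),
        _ => None,
    }
}
```

Returning `Option` rather than panicking keeps unknown transitions visible to the caller, which fits a harness that is meant to hand a ticket back to the operator with a clear reason.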
Smithy v2's value is the layer it adds on top of Symphony's bones:
| Capability | What it does |
|---|---|
| Dual runtimes | Codex AND Claude Code, selected per ticket via label. |
| Cross-model review | If Claude built it, Codex reviews. If Codex built it, Claude reviews. The reviewer runs in a fresh session with no visibility into the build's reasoning. |
| Spec quality gate | Front-of-queue triage drops underspec'd tickets out of the queue with a structured comment listing what's missing. Trivial tickets bypass via title-as-spec proportionality. |
| Adversarial review pass | New Adversarial Review Linear state between In Progress and In Review. Reviewer reads the diff against a structured rubric. PASS / FAIL / BLOCKED. |
| Linear OAuth identity | Smithy's commits, comments, and state moves audit-trail to a service account, not the operator's personal API key. Revocable independently. |
| Label-gated autonomous merge | Tickets carrying auto-merge skip the human gate and merge after review passes. Opt-in per ticket. |
| SQLite history dashboard | Per-run model-summarized logs. Cost rollups (per-day, per-tenant, per-runtime). Deep-link to the raw stream-json. |
| Max-retry circuit breaker | After N attempts, applies harness-blocked, transitions to In Review, posts a multi-attempt failure summary. Closes the unbounded-retry footgun. |
| Bootstrap PR pattern | smithy bootstrap <repo> clones a target repo, generates AGENTS.md and .codex/skills/ via a fresh agent session, opens a PR. After merge the repo is in rotation. |
| Workpad reuse | Symphony's native pattern of one comment thread per ticket with progress notes accumulating. Reads like a story, not a stream of notifications. |
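Three of the rows above (dual runtimes, cross-model review, and the max-retry circuit breaker) reduce to small decision rules. The following is a minimal Rust sketch of those rules, not Smithy's implementation. The label strings `codex` and `claude-code` come from the state diagram; `select_runtime`, `reviewer`, `after_failure`, the first-match-wins label rule, and the `>=` comparison against the attempt limit are all assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Runtime {
    Codex,
    ClaudeCode,
}

impl Runtime {
    /// Cross-model rule: whichever model built the change, the other reviews it.
    fn reviewer(self) -> Runtime {
        match self {
            Runtime::Codex => Runtime::ClaudeCode,
            Runtime::ClaudeCode => Runtime::Codex,
        }
    }
}

/// Picks the build runtime from ticket labels, falling back to `default`.
/// Assumption: the first recognized label wins.
fn select_runtime(labels: &[&str], default: Runtime) -> Runtime {
    labels
        .iter()
        .find_map(|label| match *label {
            "codex" => Some(Runtime::Codex),
            "claude-code" => Some(Runtime::ClaudeCode),
            _ => None,
        })
        .unwrap_or(default)
}

/// What the harness does after a failed attempt.
#[derive(Debug, PartialEq, Eq)]
enum AfterFailure {
    /// Dispatch the ticket again.
    Retry,
    /// Apply harness-blocked, move to In Review, post the failure summary.
    Block,
}

/// Circuit breaker: stop retrying once the attempt count reaches the limit.
fn after_failure(attempts_so_far: u32, max_attempts: u32) -> AfterFailure {
    if attempts_so_far >= max_attempts {
        AfterFailure::Block
    } else {
        AfterFailure::Retry
    }
}
```

Under these assumptions, a ticket labeled `claude-code` is built by Claude Code and reviewed by Codex, and with a limit of three attempts the third failure trips the breaker instead of scheduling a fourth run.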
OpenAI's Symphony is genuinely good. It already provides a Phoenix LiveView dashboard, OTP supervision, per-issue workspaces, structured logging, and the Codex app-server runtime. Reproducing that surface in another language would burn weeks before a single opinion gets layered on top. The fork inherits the runtime; Smithy's value is the opinions.
Apache-2.0 permits the fork to redistribute Symphony's code as long as upstream is credited. Smithy tracks Symphony's main branch so that upstream improvements pull through; Smithy's opinions live in additive files and clearly marked patches.
Smithy belongs to a small toolset:
- Anvil is the adversarial-review state and reviewer agent, packaged as a standalone Rust daemon for vanilla Symphony users who don't want the rest of Smithy's opinion layer. github.com/shawnpetros/anvil
- Whetstone is a Rust executor for wave-protocol agent runs. Different shape, same family. Tickets aren't its native unit; "waves" are.
- Salazar is an autonomous code-from-spec orchestrator on the Claude Agent SDK. Planner / generator / evaluator loop with hard validator gates. github.com/shawnpetros/salazar
Forge metaphor for free: Smithy is the workshop, Anvil is the tool, Whetstone sharpens, Elixir is the language and the alchemical brew. The branding writes itself.
The v2 install is in development. v1 install instructions live in docs/DEPLOY.md for reference, but the v1 daemon is disabled. The v2 release cuts at v2.0.0 once Phases 0-10 (see v2/SPEC.md) ship.
If you want to follow along, star the repo and watch for v2.0.0-alpha-* tags.
Built on OpenAI Symphony. Credit where it's due: the runtime polish, the LiveView dashboard, the Codex app-server integration, all upstream. Smithy adds the opinion layer.
Apache-2.0 (matching Symphony upstream). The NOTICE file in the v2 fork commit will carry the upstream attribution.
