shawnpetros/smithy-v1

Smithy v2 - agent-native workflow harness

Smithy

A workshop for forging coding-agent work into shipped code.

Smithy is an opinionated agent-native workflow harness. It polls Linear for issues, hands them to coding agents (Claude Code, Codex, or both), runs an adversarial cross-model review pass before any PR reaches a human, and writes back to Linear with its own service-account identity.

The harness itself is dumb. The intelligence lives in the agents that work on each ticket. Smithy is the workshop. The agents are the smiths. The tickets are the work.

Status

| Version | Shape | State |
|---|---|---|
| v1 | Single Rust binary, launchd daemon, polls Linear, dispatches Claude or Codex per issue. | v1 source remains in the repo root for reference. Daemon disabled as of 2026-05-07 in preparation for the v2 cut. |
| v2 | Fork of OpenAI Symphony (Elixir, Apache-2.0). Inherits Symphony's runtime polish (Phoenix LiveView dashboard, supervised polling, Codex app-server integration). Layers on dual-runtime dispatch, cross-model adversarial review, Linear OAuth identity, label-gated autonomous merge, model-summarized run logs, cost rollups, max-retry circuit breaker. | In progress. Full architecture in v2/SPEC.md. |

What v2 does

Smithy v2 takes a Linear ticket from Ready for Dev and walks it through a state machine until it's either merged or back in the operator's lap with a clear reason.

Ready for Dev
   │
   ▼
spec quality gate ──FLAG──▶ Backlog (needs-spec)
   │ PROCEED
   ▼
runtime selection by label  (codex / claude-code / default)
   │
   ▼
In Progress  ──▶  worker forges code, opens PR
   │
   ▼
Adversarial Review  ──▶  cross-model reviewer reads the diff
   │
   ├─ FAIL ──▶ back to Todo with findings
   ├─ BLOCKED ──▶ harness-blocked label, into In Review
   └─ PASS ──▶  In Review (or auto-merge if label set) ──▶ Done
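
The branches in the diagram above can be sketched as a few small pure functions. This is an illustrative Python sketch, not Smithy's Elixir implementation. The function names, the `default` fallback runtime, and the rule that `codex` wins when a ticket carries both runtime labels are assumptions. The state names and labels come from the diagram and the text.

```python
from enum import Enum


class Verdict(Enum):
    """The three outcomes of the adversarial review pass."""
    PASS = "PASS"
    FAIL = "FAIL"
    BLOCKED = "BLOCKED"


def after_spec_gate(proceed, labels):
    """Spec quality gate: PROCEED moves on, FLAG returns the ticket to Backlog."""
    labels = set(labels)
    if proceed:
        return "In Progress", labels
    return "Backlog", labels | {"needs-spec"}


def select_runtime(labels, default="codex"):
    """Pick a runtime from ticket labels.

    Assumption: `default` and the codex-first tie-break are hypothetical;
    the README only says selection is by label.
    """
    for runtime in ("codex", "claude-code"):
        if runtime in labels:
            return runtime
    return default


def after_review(verdict, labels):
    """Return (next_state, labels) for the three review branches."""
    labels = set(labels)
    if verdict is Verdict.FAIL:
        return "Todo", labels
    if verdict is Verdict.BLOCKED:
        return "In Review", labels | {"harness-blocked"}
    # PASS: tickets carrying auto-merge skip the human gate.
    if "auto-merge" in labels:
        return "Done", labels
    return "In Review", labels
```

Keeping the transitions as pure functions of the verdict and the label set makes each branch of the diagram easy to test without touching Linear.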

The opinion layer

Smithy v2's value is the layer it adds on top of Symphony's bones:

| Capability | What it does |
|---|---|
| Dual runtimes | Codex AND Claude Code, selected per ticket via label. |
| Cross-model review | If Claude built it, Codex reviews. If Codex built it, Claude reviews. The reviewer runs in a fresh session with no visibility into the build's reasoning. |
| Spec quality gate | Front-of-queue triage drops underspec'd tickets out of the queue with a structured comment listing what's missing. Trivial tickets bypass via title-as-spec proportionality. |
| Adversarial review pass | New Adversarial Review Linear state between In Progress and In Review. Reviewer reads the diff against a structured rubric. PASS / FAIL / BLOCKED. |
| Linear OAuth identity | Smithy's commits, comments, and state moves audit-trail to a service account, not the operator's personal API key. Revocable independently. |
| Label-gated autonomous merge | Tickets carrying auto-merge skip the human gate and merge after review passes. Opt-in per ticket. |
| SQLite history dashboard | Per-run model-summarized logs. Cost rollups (per-day, per-tenant, per-runtime). Deep-link to the raw stream-json. |
| Max-retry circuit breaker | After N attempts, applies harness-blocked, transitions to In Review, posts a multi-attempt failure summary. Closes the unbounded-retry footgun. |
| Bootstrap PR pattern | smithy bootstrap <repo> clones a target repo, generates AGENTS.md and .codex/skills/ via a fresh agent session, opens a PR. After merge the repo is in rotation. |
| Workpad reuse | Symphony's native pattern of one comment thread per ticket with progress notes accumulating. Reads like a story, not a stream of notifications. |
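
Two of the capabilities above, cross-model review and the max-retry circuit breaker, reduce to very little logic. The following is a minimal Python sketch under stated assumptions, not Smithy's actual Elixir code: `pick_reviewer`, `RetryBreaker`, and the summary format are hypothetical names and shapes chosen for illustration. Only the runtime labels and the "after N attempts" behavior come from the table.

```python
from dataclasses import dataclass, field

# Cross-model review: whichever runtime built the change, the other reviews it.
REVIEWER_FOR = {"claude-code": "codex", "codex": "claude-code"}


def pick_reviewer(builder):
    """Return the runtime that should review work produced by `builder`."""
    return REVIEWER_FOR[builder]


@dataclass
class RetryBreaker:
    """Counts failed attempts on one ticket and trips after `max_attempts`."""
    max_attempts: int
    failures: list = field(default_factory=list)

    def record_failure(self, reason):
        """Record one failed attempt; return True once the breaker has tripped."""
        self.failures.append(reason)
        return len(self.failures) >= self.max_attempts

    def summary(self):
        """Build a multi-attempt failure summary (format is an assumption)."""
        return "\n".join(
            f"attempt {i}: {reason}" for i, reason in enumerate(self.failures, 1)
        )


breaker = RetryBreaker(max_attempts=2)
breaker.record_failure("tests failed")            # not tripped yet
tripped = breaker.record_failure("lint failed")   # second failure trips it
```

When `record_failure` returns `True`, a harness following the table would apply the harness-blocked label, move the ticket to In Review, and post `summary()` as a comment.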

Why fork

OpenAI's Symphony is genuinely good. Phoenix LiveView dashboard, OTP supervision, per-issue workspaces, structured logging, the Codex app-server runtime. Reproducing that surface in another language would burn weeks before a single opinion gets layered on top. The fork inherits the runtime; Smithy's value is the opinions.

Apache-2.0 lets the fork redistribute and credit upstream. Smithy tracks Symphony main with discipline. Upstream improvements pull through; Smithy's opinions live in additive files and clearly-marked patches.

The naming family

Smithy belongs to a small toolset:

  • Anvil is the adversarial-review state and reviewer agent, packaged as a standalone Rust daemon for vanilla Symphony users who don't want the rest of Smithy's opinion layer. github.com/shawnpetros/anvil
  • Whetstone is a Rust executor for wave-protocol agent runs. Different shape, same family. Tickets aren't its native unit; "waves" are.
  • Salazar is an autonomous code-from-spec orchestrator on the Claude Agent SDK. Planner / generator / evaluator loop with hard validator gates. github.com/shawnpetros/salazar

Forge metaphor for free: Smithy is the workshop, Anvil is the tool, Whetstone sharpens, Elixir is the language and the alchemical brew. The branding writes itself.

Install

v2 install is in development. v1 install instructions live in docs/DEPLOY.md for reference, but the v1 daemon is disabled. The v2 release cuts at v2.0.0 once Phases 0-10 (see v2/SPEC.md) ship.

If you want to follow along, star the repo and watch for v2.0.0-alpha-* tags.

Credits

Built on OpenAI Symphony. Credit where it's due: the runtime polish, the LiveView dashboard, the Codex app-server integration, all upstream. Smithy adds the opinion layer.

License

Apache-2.0 (matching Symphony upstream). The NOTICE file in the v2 fork commit will carry the upstream attribution.
