Salazar

The tool that builds itself.

An autonomous coding orchestrator that builds software end-to-end from a markdown spec, with no human-written code required. It runs a planner/generator/evaluator agent loop on Claude via @anthropic-ai/claude-agent-sdk, with a terminal UI and contract-gated agent handoffs.

Named after the serpent — an ouroboros that eats its own tail. We pointed it at a spec for its own CLI and it built a 1,141-test terminal app in 4 hours. Then we pointed it at its own codebase in brownfield mode to add features to itself.

Proven output: mini-jwt — 38/38 features, 76 tests, 96% coverage, built in 70 minutes for $9.27.

The meta part: The CLI itself was built by Salazar. We wrote a spec for an Ink TUI, pointed Salazar at it, and walked away. 4 hours later: 63/63 features, 1,141 tests, fully functional CLI. The tool built its own interface.

Demo

TUI launcher flow

Architecture

┌─────────────────────────────────────────────────────────┐
│                    CLI / TUI (Ink)                      │
│  Onboarding, live progress, session history             │
│  Direct engine integration — no subprocess IPC          │
└──────────────────────┬──────────────────────────────────┘
                       │ imports
┌──────────────────────▼──────────────────────────────────┐
│                   ENGINE (TypeScript)                   │
│                                                         │
│  ┌──────────┐    ┌───────────┐    ┌───────────────────┐ │
│  │ Planner  │───▶│ Generator │───▶│ Hard Validators   │ │
│  │          │    │           │    │ (tsc, eslint,     │ │
│  │ Reads    │    │ Builds 1  │    │  build, test)     │ │
│  │ spec,    │    │ feature   │    │                   │ │
│  │ creates  │    │ per       │    │ Must all pass     │ │
│  │ feature  │    │ session   │    │ before proceeding │ │
│  │ list     │    │           │    └─────────┬─────────┘ │
│  └──┬───────┘    └─────▲─────┘              │           │
│     │                  │                    ▼           │
│     │ Zod              │           ┌───────────────┐    │
│     │ contract         │           │  Evaluator    │    │
│     │ gate             └───────────│  (adversarial │    │
│     │                  feedback    │   reviewer)   │    │
│     ▼                  loop        │               │    │
│  ┌──────────┐          if < 7.0    │  Zod contract │    │
│  │ Schema   │                      │  gate on      │    │
│  │ validate │                      │  output       │    │
│  │ + retry  │                      └───────────────┘    │
│  └──────────┘                                           │
│                                                         │
│  EventEmitter ──▶ TUI subscribes directly               │
│  SQLite ──▶ session history persisted locally           │
└─────────────────────────────────────────────────────────┘
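The "EventEmitter ──▶ TUI subscribes directly" line in the diagram can be illustrated with a small typed event bus. This is a dependency-free sketch, not the code in src/lib/events.ts: the event names, payload shapes, and the `TypedEmitter` class are all assumptions made for illustration.

```typescript
// Hypothetical event names and payloads, for illustration only.
type EngineEvents = {
  "feature:start": { id: string };
  "feature:done": { id: string; score: number | null };
};

type Listener<T> = (payload: T) => void;

// Minimal typed emitter: event names and payloads are checked at compile time.
class TypedEmitter<Events extends Record<string, unknown>> {
  private listeners = new Map<keyof Events, Set<Listener<any>>>();

  // Register a listener and return a function that unsubscribes it.
  on<K extends keyof Events>(event: K, listener: Listener<Events[K]>): () => void {
    const set = this.listeners.get(event) ?? new Set<Listener<any>>();
    set.add(listener);
    this.listeners.set(event, set);
    return () => {
      set.delete(listener);
    };
  }

  // Deliver a payload to every listener currently registered for the event.
  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    this.listeners.get(event)?.forEach((listener) => listener(payload));
  }
}

// The engine emits; a UI layer subscribes in-process, with no subprocess IPC.
const bus = new TypedEmitter<EngineEvents>();
const seen: string[] = [];
const unsubscribe = bus.on("feature:done", ({ id, score }) => {
  seen.push(`${id}:${score}`);
});
bus.emit("feature:done", { id: "F001", score: 8.2 });
unsubscribe();
bus.emit("feature:done", { id: "F002", score: null }); // no longer delivered
```

Because the engine and the TUI share one process, a subscription like this is all the wiring a UI hook would need under these assumptions.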

Quick Start

# Install
npm i -g salazar-cli

# Or run directly
npx salazar-cli

# Build from a spec
salazar run my-app-spec.md

# With model overrides
salazar run spec.md --model claude-sonnet-4-6 --model-evaluator claude-opus-4-6

# With custom output directory
salazar run spec.md --output-dir ./my-project

How It Works

The Loop

1. Planner reads a product spec and decomposes it into features with BDD scenarios. Its output is validated against a Zod schema, and the planner is retried if the feature list doesn't match the contract.
2. Generator picks the next incomplete feature, gets a fresh Claude Code session, implements it with TDD, writes tests, and updates the feature list. One feature per session means a clean context every time.
3. Hard Validators run automatically: TypeScript type checking, ESLint, build, and the test suite. The generator cannot skip these. If any fail, it gets the error output and retries (max 3 attempts).
4. Evaluator (moderate/complex features only) is a separate Claude Code session with an adversarial system prompt. It scores the feature on spec compliance (35%), code quality (25%), security (25%), and usability (15%), with a minimum of 7.0/10 to pass. Its output is validated against a Zod schema, and the evaluator retries internally up to 3 times if the evaluation can't be parsed.

Setup and simple features skip the evaluator because the validators are sufficient. This cuts ~50% of total runtime.
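The validator retry and the evaluator's weighted score can be sketched as follows. The weights, the 7.0 pass mark, and the 3-attempt limit come from the description above. The function names, the `Rubric` field names, and the `generate`/`validate` callbacks are hypothetical stand-ins for the real agent sessions, so treat this as an illustration rather than the orchestrator's actual code.

```typescript
// Rubric weights, pass mark, and retry limit as stated in the text above.
const WEIGHTS = { specCompliance: 0.35, codeQuality: 0.25, security: 0.25, usability: 0.15 } as const;
const PASS_MARK = 7.0;
const MAX_ATTEMPTS = 3;

// Each dimension is assumed to be scored on a 0-10 scale.
type Rubric = Record<keyof typeof WEIGHTS, number>;

function weightedScore(rubric: Rubric): number {
  return (Object.keys(WEIGHTS) as (keyof typeof WEIGHTS)[]).reduce(
    (sum, key) => sum + WEIGHTS[key] * rubric[key],
    0,
  );
}

interface ValidationResult {
  ok: boolean;
  output: string; // combined tsc / eslint / build / test output
}

// Hypothetical callbacks standing in for a generator session and the hard validators.
type Generate = (feedback: string | null) => Promise<void>;
type Validate = () => Promise<ValidationResult>;

// Run the generator, then the validators; feed failures back, up to MAX_ATTEMPTS.
async function buildFeature(generate: Generate, validate: Validate): Promise<boolean> {
  let feedback: string | null = null;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    await generate(feedback);
    const result = await validate();
    if (result.ok) return true;
    feedback = result.output; // the next attempt sees the failure output
  }
  return false;
}

// Example: 0.35*8 + 0.25*7 + 0.25*6 + 0.15*9 is about 7.4, which clears the pass mark.
const exampleScore = weightedScore({ specCompliance: 8, codeQuality: 7, security: 6, usability: 9 });
const examplePasses = exampleScore >= PASS_MARK;
```

Note that a weak security score can be offset by the other dimensions under a pure weighted sum. Whether the real evaluator also enforces per-dimension minimums is not stated in this README.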

Contract-Gated Handoffs

Every agent-to-agent transition is gated by a mechanical contract, either a Zod schema or the validators' exit codes:

| Handoff | Contract | On Failure |
| --- | --- | --- |
| Planner → Orchestrator | `FeatureListSchema` | Retry planner with schema error |
| Generator → Validators | Exit code + test output | Retry generator with failure output |
| Evaluator → Orchestrator | `EvalOutputSchema` | Retry evaluator session (up to 3x) |

Agents write what they want. Contracts enforce what we need. No prescriptive prompts — mechanical validation gates.
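A contract gate with retry can be sketched without any dependencies. The project itself uses Zod schemas (`FeatureListSchema`, `EvalOutputSchema`); this example swaps in a hand-written parser so it runs standalone. The `Feature` shape, `parseFeatureList`, and `gate` are hypothetical names, not the contents of src/engine/contracts.ts.

```typescript
// Hypothetical feature shape; the real contract is a Zod schema.
interface Feature {
  id: string;
  description: string;
  complexity: string;
}

type ParseResult<T> = { value: T } | { error: string };

const COMPLEXITIES = ["setup", "simple", "moderate", "complex"];

// Hand-written stand-in for schema validation of the planner's output.
function parseFeatureList(raw: string): ParseResult<Feature[]> {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { error: "output is not valid JSON" };
  }
  if (!Array.isArray(data)) return { error: "expected an array of features" };
  for (let i = 0; i < data.length; i++) {
    const item = data[i] as Partial<Feature> | null;
    if (typeof item?.id !== "string") return { error: `features[${i}].id must be a string` };
    if (typeof item.description !== "string") {
      return { error: `features[${i}].description must be a string` };
    }
    if (COMPLEXITIES.indexOf(String(item.complexity)) === -1) {
      return { error: `features[${i}].complexity must be one of ${COMPLEXITIES.join(", ")}` };
    }
  }
  return { value: data as Feature[] };
}

// Run an agent, validate its output, and retry with the contract error as feedback.
function gate<T>(
  runAgent: (contractError: string | null) => string,
  parse: (raw: string) => ParseResult<T>,
  maxAttempts = 3,
): T {
  let error: string | null = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = parse(runAgent(error));
    if ("value" in result) return result.value;
    error = result.error;
  }
  throw new Error(`contract not satisfied after ${maxAttempts} attempts: ${error}`);
}

// Fake planner: fails the contract once, then returns a valid feature list.
let plannerCalls = 0;
const features = gate((contractError) => {
  plannerCalls++;
  return contractError === null
    ? "Here is my plan!"
    : '[{"id":"F001","description":"init repo","complexity":"setup"}]';
}, parseFeatureList);
```

The agent is free to produce whatever it likes. Only output that parses against the contract moves on to the next stage, and the error message is the sole feedback it receives.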

Complexity Routing

| Complexity | Validators | Evaluator | Typical Time |
| --- | --- | --- | --- |
| `setup` | All gates | Skipped | ~2-3 min |
| `simple` | All gates | Skipped | ~3-4 min |
| `moderate` | All gates | Full review | ~6-8 min |
| `complex` | All gates | Full review | ~8-12 min |
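The routing table can be expressed as a lookup. The tier names, evaluator decisions, and typical time ranges are taken from the table above. The `ROUTES` constant and `estimateMinutes` helper are hypothetical, and the resulting estimate is only as good as those typical ranges.

```typescript
type Complexity = "setup" | "simple" | "moderate" | "complex";

interface Route {
  evaluator: boolean; // hard validators always run, so only the evaluator varies
  typicalMinutes: [number, number]; // [low, high] from the table above
}

const ROUTES: Record<Complexity, Route> = {
  setup: { evaluator: false, typicalMinutes: [2, 3] },
  simple: { evaluator: false, typicalMinutes: [3, 4] },
  moderate: { evaluator: true, typicalMinutes: [6, 8] },
  complex: { evaluator: true, typicalMinutes: [8, 12] },
};

// Sum the typical ranges to get a rough runtime estimate for a planned feature list.
function estimateMinutes(plan: Complexity[]): [number, number] {
  let low = 0;
  let high = 0;
  for (const complexity of plan) {
    const [min, max] = ROUTES[complexity].typicalMinutes;
    low += min;
    high += max;
  }
  return [low, high];
}

// One feature of each tier: 2+3+6+8 = 19 minutes low, 3+4+8+12 = 27 minutes high.
const estimate = estimateMinutes(["setup", "simple", "moderate", "complex"]);
```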

Model Tiers

# --model: fast, good at coding
# --model-evaluator: deep, good at critique
salazar run spec.md \
  --model claude-sonnet-4-6 \
  --model-evaluator claude-opus-4-6

Project Structure

salazar/
├── src/
│   ├── index.ts              # CLI entry point (meow)
│   ├── engine/
│   │   ├── orchestrator.ts   # Core loop: planner → generator → evaluator
│   │   ├── contracts.ts      # Zod schemas for agent handoff validation
│   │   ├── agents/
│   │   │   ├── planner.ts    # Spec → feature_list.json
│   │   │   ├── generator.ts  # TDD feature implementation
│   │   │   └── evaluator.ts  # Adversarial scoring rubric
│   │   ├── client.ts         # Agent SDK options factory
│   │   ├── validators.ts     # Hard gates: tsc, eslint, build, test
│   │   ├── progress.ts       # feature_list.json tracking
│   │   ├── storage.ts        # SQLite via better-sqlite3
│   │   └── security.ts       # Bash command allowlist
│   ├── tui/
│   │   ├── app.tsx           # Ink TUI
│   │   └── hooks/
│   │       └── use-engine.ts # Direct engine event subscription
│   └── lib/
│       ├── types.ts          # All shared interfaces
│       ├── events.ts         # Typed EventEmitter
│       ├── config.ts         # ~/.salazar/config.json
│       └── paths.ts          # Runtime directories
├── prompts/                  # Agent system prompts
│   ├── planner.md
│   ├── generator.md
│   └── evaluator.md
├── package.json
└── tsconfig.json

The Meta Story

The harness built its own CLI. Here's what happened:

1. We wrote a spec for an Ink terminal UI
2. Pointed Salazar at it: `salazar run tui_spec.md`
3. Walked away
4. 4 hours later: 63/63 features, 1,141 tests, fully functional CLI

Build Stats

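| Metric | mini-jwt (proof) | CLI (meta) | Counter (smoke test) |
| --- | --- | --- | --- |
| Features | 38/38 | 63/63 | 15/15 |
| Tests | 76 | 1,141 | 66 |
| Coverage | 96% | — | — |
| Time | 70 min | ~4 hours | 33 min |
| Cost | $9.27 | ~$30 | $9.79 |
| Human code | 0 lines | 0 lines | 0 lines |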

CLI Commands

salazar                           # Launch TUI
salazar run <spec.md>             # Build from spec (headless)
salazar run <spec.md> --output-dir ./out  # Custom output directory
salazar config                    # Configure models
salazar --help                    # Full help text

How It's Built

Salazar is a single TypeScript npm package. The engine spawns Claude Code sessions programmatically via @anthropic-ai/claude-agent-sdk. Each agent (planner, generator, evaluator) runs in its own Claude Code session with a focused system prompt, sandboxed tools, and a cost cap.

No raw API calls and no API keys needed: Salazar uses your existing Claude Code authentication.

References

Effective Harnesses for Long-Running Agents — Anthropic Engineering
@anthropic-ai/claude-agent-sdk — TypeScript SDK for programmatic Claude Code sessions
Ink — React for CLIs
mini-jwt — First proof-of-concept output

