idea-factory turns Claude Code into a virtual startup.
You're the CEO — describe what you want in one line, and a team of AI agents builds it.
v7.1 (2026-04-11) is a hotfix over v7 that removes blocking PreToolUse hooks which were stalling downstream autonomous workflows (see #2). templates/settings.json now ships with defaultMode: "bypassPermissions" and an empty PreToolUse — the deny list still blocks destructive ops (rm -rf /, sudo *, .env*, credentials, secrets).
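The resulting file roughly follows Claude Code's settings schema; a minimal sketch of the shape described above (the exact deny patterns in templates/settings.json may differ):

```json
{
  "permissions": {
    "defaultMode": "bypassPermissions",
    "deny": [
      "Bash(rm -rf /*)",
      "Bash(sudo:*)",
      "Read(.env*)",
      "Read(**/credentials*)",
      "Read(**/secrets*)"
    ]
  },
  "hooks": {
    "PreToolUse": []
  }
}
```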
v7 (2026-04-04) was the harness engineering overhaul: 11 battle-tested patterns from market-dashboard-v5 — Quality Ratchet, Protected Files, 5-Stage PR Pipeline, 6-Gate Deploy Consensus, CONTRACT FAQ, agent depth guidance, and more. See HARNESS-GUIDE.md for the full changelog.
v6.1 laid the foundation: 4-reviewer gate, isolated worktrees, two-pass evaluation, Playwright MCP, phase handoffs, Codex Gate. Informed by Anthropic's harness engineering research (March 2026).
| Feature | What it does |
|---|---|
| 4-Reviewer Gate | architect + critic + code-reviewer + qa-tester review in parallel (was 3) |
| Fresh Context Isolation | Every reviewer runs in an isolated worktree — no self-praise bias |
| Two-Pass Evaluation | Pass 1: adversarial defect hunt (mandatory). Pass 2: structured scoring (optional, fresh context). Eliminates "Evaluator Leniency" |
| Playwright MCP | qa-tester opens the live app in a real browser, clicks buttons, fills forms — not just code review |
| Phase Handoff Documents | MVP, Harden, Ship phases produce handoff docs preserving context across transitions |
| Codex Gate | Optional cross-model review via OpenAI Codex CLI for a second opinion on diffs |
| Safety via deny-list | permissions.deny blocks destructive ops (rm -rf /, sudo *) and sensitive reads (.env*, credentials, secrets). No blocking PreToolUse hooks — autonomous workflows stay zero-friction. |
| CLAUDE.md 80-Line Limit | Generated project CLAUDE.md stays under 80 lines (HumanLayer research: compliance drops beyond ~150 instructions) |
| Fix-Loop Circuit Breaker | Same failure 3 times = stop and escalate to CEO (no more infinite token-burning loops) |
| HARNESS-GUIDE.md | New design document explaining every architectural decision with evidence |
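The fix-loop circuit breaker from the table can be sketched in a few lines of Python. This is illustrative only — the class and method names are hypothetical; the real logic lives in the generated hooks:

```python
from collections import Counter

class FixLoopBreaker:
    """Stop retrying once the same failure signature appears `limit` times."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.failures = Counter()

    def record(self, signature: str) -> bool:
        """Record one failure; return True when it's time to escalate to the CEO."""
        self.failures[signature] += 1
        return self.failures[signature] >= self.limit
```

Keying on a failure *signature* (e.g. the failing test name plus error code) rather than a raw retry count means three different errors keep the loop running, while the same error three times stops it.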
One command. A complete MVP in under an hour.
$ claude
> /start-company freelancer income/expense auto-manager app
[ANALYZE] analyst + architect analyzing in parallel...
→ Service: CashFreel (캐시프릴)
→ Type: SaaS — Freelancer tax prediction
→ Team: PM + Developer + Designer
[SCAFFOLD] Creating project from templates...
→ CLAUDE.md, agents, hooks, settings ✓
→ git init ✓
[KICKOFF] CEO, 4 quick questions:
1. Design feel? → Toss style (minimal, big numbers)
2. MVP scope? → Tax prediction + income/expense tracking
3. Revenue? → Free first, decide later
4. Income scope? → Domestic + international
[BUILD] ralph loop running MVP stories...
✅ MVP-001: Next.js + Tailwind + shadcn/ui
✅ MVP-002: Income registration (KRW + USD + EUR)
✅ MVP-003: Expense tracking + auto-categorization
✅ MVP-004: Real-time tax dashboard + charts
✅ MVP-005: Cash flow report + CSV export
[VALIDATE] 4 independent reviewers (isolated worktrees):
✅ architect (opus): no structural blockers for Phase 2
✅ critic (opus): no essence drift detected
✅ code-reviewer (opus): 0 critical, 2 medium (non-blocking)
✅ qa-tester (playwright): all 7 flows pass in real browser
→ MVP complete. Phase 2 ready when you are.
Result: CashFreel now has a working prototype. Next phase: connect real tax APIs, add authentication, harden security. The CEO didn't write a single line of code.
Vibe coding is fast, but chaotic. You get code — not a product.
Real startups don't just have developers. They have process: a PM who says "no", a designer who researches before drawing, a QA who breaks things on purpose, and a critic who asks "but why?"
idea-factory gives you both: the speed of AI + the discipline of a real team.
| Tool | Approach | You need to be |
|---|---|---|
| Vibe coding | "Just build it" | A developer |
| gstack | Engineering team | A developer |
| idea-factory | Full startup team | Just the CEO |
You: /start-company a portfolio tracker for busy investors
ANALYZE ──────── Two agents dissect your idea in parallel
│ (market fit, tech stack, team composition)
▼
SCAFFOLD ─────── Project created from templates, not from scratch
│
▼
KICKOFF ──────── 3-5 plain-language questions — no jargon, just choices
│
▼
BUILD MVP ────── Mock data first. Core flow only.
│ Every feature checked: "Does this serve the Why?"
▼
VALIDATE ─────── 4 reviewers in isolated worktrees (defects first):
│ Architect + Critic + Code-Reviewer + QA (Playwright)
│ ↳ critical defect? fix and retry. all clear? CEO confirms.
▼
HARDEN ──────── Real APIs, tests, security — only after MVP is validated
│
▼
SHIP ────────── Deploy + retrospective
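The VALIDATE step's retry logic can be sketched as follows — the function names here are hypothetical, and the real orchestration is the ralph loop:

```python
def validate(run_reviewers, fix, max_rounds=3):
    """Run the reviewer gate; fix critical defects and retry, escalate past max_rounds."""
    for _ in range(max_rounds):
        defects = run_reviewers()  # architect + critic + code-reviewer + qa-tester, in parallel
        critical = [d for d in defects if d["severity"] == "critical"]
        if not critical:
            return "ceo-confirm"   # all clear -> CEO confirms the phase
        for d in critical:
            fix(d)                 # address each critical defect, then re-run the gate
    return "escalate"              # same failures keep recurring -> stop and ask the CEO
```

Note the cap on rounds: the gate is a loop, not a promise, so it needs the same circuit-breaker discipline as everything else.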
Most AI tools rush to connect APIs and deploy. We do the opposite.
| Phase | What happens | Real APIs? | Deploy? |
|---|---|---|---|
| 1 — Prototype | Mock data, core flow, validate the "wow" | No | No |
| 2 — Harden | Real APIs, error handling, tests, security | Yes | No |
| 3 — Ship | Deploy after security audit passes | Yes | Yes |
Why? Because connecting a payment API before knowing if anyone wants your product is a waste of everyone's time.
Every feature is checked against your service's "Why":
essence.md
├── One-Line Definition: what this is
├── Why This Exists: the problem it solves
├── Wow Factor: what makes users go "wow"
├── Differentiator: what competitors don't do
└── Key Metric: the one number that matters
- After every story: does this serve the Why?
- At every gate: is the codebase drifting from the vision?
- Drift too far → the system flags it and suggests a pivot
# One-liner
curl -fsSL https://raw.githubusercontent.com/gguloadoong/idea-factory/main/install.sh | bash
# Or clone locally
git clone https://github.com/gguloadoong/idea-factory.git
cd idea-factory && bash install.sh

| Required | Optional |
|---|---|
| Claude Code CLI | oh-my-claudecode (for ralph autonomous loop) |
| Node.js 18+ | Gemini CLI (external perspective) |
| Git | |
/start-company a portfolio tracking app for busy investors
That's it. The system will:
- Analyze your idea and form a minimum team
- Set up the project with proper structure
- Ask you 3-5 simple questions
- Start building autonomously
| It asks | It doesn't ask |
|---|---|
| Design feel (A/B/C choices) | Tech stack decisions |
| MVP scope | Architecture choices |
| Revenue model | Code review results |
| "Is this the right direction?" | Bug fixes |
| API keys when actually needed | Anything it can decide |
idea-factory/
├── skills/start-company/ # The trigger (/start-company)
│ ├── SKILL.md # execution flow (current: v7.1)
│ └── HARNESS-GUIDE.md # design decisions + evidence (22 KB)
│
├── templates/ # Scaffold copied into each new project
│ │ # ── Core (every install gets these) ──
│ ├── CLAUDE.md.tmpl # project constitution (80-line limit)
│ ├── settings.json # permissions + deny-list (safety baseline)
│ ├── agents/ # 7 roles: pm · developer · designer · architect · critic · code-reviewer · qa-tester
│ ├── hooks/ # 18+ hooks: safety / quality / governance / loop-breaker
│ ├── documents/ # PRD · essence · CONTRACT · handoff · quality-baseline
│ ├── scripts/ # codex-review-gate · copy-drift · contract · temporal-lint
│ ├── .github/workflows/ # CI + PR labeling
│ │ # ── Advanced (opt-in patterns) ──
│ ├── contract-rules/ # CONTRACT FAQ rules (drift guardrails)
│ ├── gate-presets/ # 6-Gate Deploy Consensus configs
│ ├── gate-rules.yml # per-stage gate rules
│ ├── ratchet.yml.tmpl # Quality Ratchet (regression floor)
│ ├── protected-files.yml # protected-files hook allow-list
│ ├── .protected-files.tmpl # (template of above for downstreams)
│ ├── handoff-checklist.md.tmpl # phase handoff checklist
│ ├── research-report.md.tmpl # researcher agent output template
│ ├── experiments/ # numerical tuning harness (v8)
│ ├── cron-bot/ # scheduled bot scaffold (v8)
│ ├── settings-extensions/ # per-project-type settings overlays
│ ├── lints/temporal-leakage/ # date/time hardcoding lint
│ ├── workflows/ # opinionated ralph/ulw workflows
│ ├── .githooks/ # pre-commit hooks for downstreams
│ ├── .coderabbit.yaml # CodeRabbit review config
│ ├── COPIED-FROM.md.tmpl # template provenance stamp
│ └── vercel.json # Vercel preview-deployment safety defaults
│
├── scripts/ # Meta-utilities (NOT copied to downstreams)
│ ├── sync-downstream.sh # push template updates to N downstream repos
│ ├── sync-lib.py # sync library
│ ├── audit-backlog.py # v8 backlog tracking
│ ├── check-contract.sh # CONTRACT FAQ drift check
│ ├── check-copy-drift.sh # template-vs-copy drift check
│ ├── lint-temporal-leakage.sh # date lint
│ ├── merge-settings.sh # settings.json merge helper
│ ├── record-failure.sh # failure capture for learning layer
│ ├── run-gate.sh # gate orchestrator
│ ├── tuning-gate.sh # numerical tuning gate
│ └── validate-handoff.sh # handoff validator
│
├── sync-manifest.json # managed / computed / customized classification
├── downstream-registry.json # downstream repos this factory tracks
├── install.sh # one-command installer
├── tests/ # harness invariant tests
│
├── docs/ # project documentation
│   ├── ko/ # Korean-language guides
│ └── research/ # internal research memos
│
├── CLAUDE.md # working-in-this-repo rules (Claude Code)
├── AGENTS.md # OMC entry point
├── ARCHITECTURE.md # system architecture overview
├── CHANGELOG.md # version history
├── CONTRIBUTING.md # contribution guide
├── SECURITY.md # security policy
├── CODE_OF_CONDUCT.md # community standards
└── LICENSE # MIT
For the rationale behind every design decision, see HARNESS-GUIDE.md.
| Decision | Why |
|---|---|
| Templates, not generation | Creating 30 files from scratch wastes the context window on boilerplate |
| ralph as backbone | Post-condition chaining between skills is unreliable; a state-machine loop isn't |
| 4 isolated reviewers | Same-session role-play isn't real analysis; worktree-isolated agents with fresh context are |
| Defects first, scores second | Scoring alone triggers "Evaluator Leniency" (AI gives 9/10). Defect hunt first forces honesty |
| Playwright MCP for QA | Code review alone misses UI bugs; real browser interaction catches what humans catch |
| essence.md as North Star | Without it, features drift from the original vision within 2 sprints |
| CLAUDE.md under 80 lines | AI compliance drops beyond ~150 instructions; system prompt uses ~50, leaving ~100 for the project |
| Fix-loop circuit breaker | Without a cap, agents burn tokens in infinite retry loops on the same error |
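The "defects first, scores second" row can be made concrete. A sketch with hypothetical callables (SKILL.md defines the actual reviewer prompts):

```python
def two_pass_review(artifact, hunt_defects, score):
    """Pass 1: mandatory adversarial defect hunt. Pass 2: optional scoring, fresh context."""
    defects = hunt_defects(artifact)   # pass 1: actively look for reasons to reject
    if any(d["severity"] == "critical" for d in defects):
        return {"verdict": "block", "defects": defects}
    # Scoring only happens after the hunt comes up clean, so a friendly
    # 9/10 can never paper over an unexamined critical defect.
    return {"verdict": "pass", "defects": defects, "score": score(artifact)}
```

Ordering is the whole trick: asking "what's wrong?" before "how good is it?" keeps the scoring pass from anchoring the review.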
/start-company a pet health management app
/start-company subscription meal delivery for seniors
/start-company hospital booking and report automation tool
/start-company freelancer income/expense auto-tracker
/start-company AI-powered study planner for college students

- gstack — Sprint pipeline, meta-skills
- Citadel — Single entry point routing
- oh-my-claudecode — Agent orchestration
- everything-claude-code — Cross-platform skills
PRs welcome! Whether it's new agent templates, better hooks, or translations.
MIT