v2.5.7 · open source · MIT · works in Claude Code · Cursor · Codex · Aider · Continue

Stop being the only person
who can ship.

You're the CTO. You're also the bottleneck. GreatCTO is 34 specialist agents that handle architecture, review, QA, security, and deploy — while you make two decisions per feature. Same plugin works in Claude Code, Cursor, OpenAI Codex CLI, Aider, and Continue via AGENTS.md + MCP.

no signup · runs locally · pay your own API · 5 platforms
~ /your-saas — Claude Code · Cursor · Codex · Aider
$ /start "add Stripe subscriptions"
  archetype: commerce (Stripe dep → PCI-DSS triggered)
  scale: standard (5 agents · LLM agent: ~45min · human team: 2–3 days)
  ARCH-stripe-subscriptions.md ready
>> DECISION 1: approve architecture? approved
  senior-dev TDD: 14 tests, 287 LOC
  /review ×12 — P0: 0 · P1: 1 · P2: 3
  qa-engineer coverage 91% · PASS
  security PCI-DSS SAQ-A · PASS
  devops canary 5% → 20% → 100%
>> DECISION 2: ship? ship it
done. RELEASE-2026-05-01.md
Cross-platform · v2.4+

One config. Every AI-coding tool.

Run npx great-cto adapt --platform all once. The same archetype + compliance gates flow into every assistant your team uses — no re-onboarding, no duplicated rules.

Claude Code
CLAUDE.md + plugin (34 agents)
The full pipeline. SDLC orchestration, gates, memory.
Cursor
.cursorrules + .cursor/rules/*.mdc + MCP
Native VS Code / Cursor extension in packages/cursor-ext/.
OpenAI Codex CLI
AGENTS.md (verbatim)
De-facto cross-platform spec — no glue needed.
Aider
.aider.conf.yml + CONVENTIONS.md
Auto-test command points at great-cto ci.
Continue
.continue/rules.md + MCP
5 MCP tools: scan / list_rules / detect_archetype / estimate_cost / query_decisions.
Any MCP host
great-cto mcp --stdio or --sse
Stdio for Claude Desktop / Cursor; SSE for multi-client.
Single source of truth
Edit .great_cto/PROJECT.md once. Re-run adapt. Every consumer updates.
CI/CD ready
npx great-cto ci ./ drops into GitHub Actions with auto-detected ::error:: annotations + SARIF.
No vendor lock
Switch tools next quarter. Configs regenerate. Memory + ADR log persist.
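For a stdio host such as Claude Desktop, registering the server usually amounts to one config entry along these lines (the server name `great-cto` is illustrative; exact key names depend on your host's config format):

```json
{
  "mcpServers": {
    "great-cto": {
      "command": "npx",
      "args": ["great-cto", "mcp", "--stdio"]
    }
  }
}
```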
The trade-off you shouldn't have to make

Move fast. Or move safely. Pick one.

— until now.

Before

  • You write the feature.
  • You spot the SQL injection at 11pm.
  • You catch the missing webhook signature in code review.
  • You triage the production alert at 2am.
  • You write the postmortem.
  • You forget the lesson 3 months later.
  • Human team ships in 2–3 days · burnout in 18 months.

After

  • You describe the feature in one sentence.
  • Architect drafts ARCH-doc — you approve in 30s.
  • 12-angle review catches the SQL injection (P0, blocked).
  • Security agent files CVE on stripe-js — fix proposed.
  • Pattern from last incident surfaces in agent's Step 0.
  • Cross-project memory makes Tuesday's fix Wednesday's prevention.
  • LLM agents ship in ~45min · sleep tonight.
That's not aspirational. That's last Tuesday on a real repo. See the actual log ↗
How it works

Three commands. The system does the rest.

No prompt-engineering. No agent-orchestration tutorial. No YAML.

01 · INSTALL

Drop into any repo

Detects archetype from manifests (15 signals → 25 archetypes), wires the gates, loads the right agents.

$ npx great-cto init
  archetype: web-service (95% confidence)
  security tier: standard
  5 agents auto-loaded
  ready in 12 seconds
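Under the hood this is the kind of signal-scoring pass you could write yourself. A minimal sketch, assuming invented signal names, weights, and a naive confidence formula (the real detector's 15 signals are not shown here):

```python
# Illustrative archetype detection by manifest signals.
# Signal names, weights, and the confidence formula are invented
# for this example; they are not GreatCTO's actual heuristics.
SIGNALS = {
    "stripe":    ("commerce", 3),
    "express":   ("web-service", 2),
    "fastapi":   ("web-service", 2),
    "langchain": ("ai-system", 3),
}

def detect_archetype(dependencies):
    scores = {}
    for dep in dependencies:
        if dep in SIGNALS:
            archetype, weight = SIGNALS[dep]
            scores[archetype] = scores.get(archetype, 0) + weight
    if not scores:
        return ("library", 0.0)   # fallback when nothing matches
    best = max(scores, key=scores.get)
    confidence = scores[best] / sum(scores.values())
    return (best, confidence)
```

With `["express", "stripe"]`, the Stripe signal outweighs the web-service one, so the sketch picks `commerce` — the same shape of decision the transcript above shows.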
02 · START

Describe the feature in one sentence

Architect drafts the architecture doc. You approve, refine, or reject. That's decision one.

$ /start "add 2FA via TOTP"
  → ARCH-2fa.md ready
  → pipeline: standard | LLM agent: ~45min · human team: ~2 days
>> DECISION 1: approve? _
03 · REVIEW · QA · SECURITY

12 angles fire in parallel

Performance, security, SQL safety, concurrency, privacy, API contracts. Every finding rated. P0 blocks the gate.

Performance · Security · SQL safety
Concurrency · Privacy · API contracts
Coverage 91% · 0 highs · CSO PASS
>> DECISION 2: ship? _
04 · SHIP

Canary or done

5% → 20% → 100%. RELEASE doc auto-written. On-call notified. Memory updated for next time.

$ ship
  → canary 5% (3min) → 20% (5min) → 100%
  → RELEASE-2026-05-01.md
  → done.
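The staged rollout above can be sketched as a loop over (traffic %, soak time) pairs. Everything here is illustrative: `set_traffic_percent`, `error_rate`, and the 1% threshold are hypothetical hooks, not GreatCTO's API:

```python
import time

# Sketch of a 5% → 20% → 100% canary with rollback on elevated errors.
# Hooks and threshold are hypothetical; stages mirror the transcript above.
STAGES = [(5, 180), (20, 300), (100, 0)]  # (traffic percent, soak seconds)

def canary_rollout(set_traffic_percent, error_rate, threshold=0.01, sleep=time.sleep):
    for percent, soak in STAGES:
        set_traffic_percent(percent)
        sleep(soak)                        # let the stage soak
        if error_rate() > threshold:
            set_traffic_percent(0)         # roll back immediately
            return "rolled-back"
    return "shipped"
```

In tests you can inject a no-op `sleep` and fake `error_rate` to exercise both the happy path and the rollback branch.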
The board you'll actually check

Six views. Real screenshots.
Live updates from your repo.

great-cto board at localhost:3141. Inbox · Kanban · Metrics · Agents · Memory · Public report. Vanilla HTML, zero deps — no Electron, no Tauri, no SaaS.

01 · KANBAN

Five columns. Uniform cards. Inline gate approval.

Gates · Backlog · In Progress · Done · Blocked. Cards are priority-coded, agent-tagged, with inline status / priority edit. ⌘K search across title / id / agent / labels. Live SSE — bd-CLI changes appear in the UI in <1s.

  • Filter bar: chip-toggle by agent, priority, label.
  • j / k navigation, Enter to open, ? for cheatsheet.
  • Click pipeline stage → drill-down to filtered kanban.
great_cto kanban board with 5 status columns
02 · METRICS

The numbers your CTO update needs.

Tasks shipped · LLM spend · cost-savings vs FTE · cycle time · QA pass rate · security blocks. 30-day daily-burn chart with budget alerts. A separate Agents tab shows per-agent time, LLM cost, and human-equivalent dollars at $150/hr.

  • Hero cards refresh every gate / verdict via SSE.
  • Cost panel pulls real plan data from docs/plans/.
  • Activity feed shows last 20 verdicts, cost-tagged.
great_cto metrics page with hero cards and cost chart
02b · AGENTS

Per-agent cost, time, and human equivalent.

34 specialist agents, each with its own time budget, LLM cost, and tasks-done counter. Compare to a human team at $150/hour and see the multiplier. Activity feed surfaces APPROVED / BLOCKED / FAIL verdicts with the agent that issued them.

agent cost breakdown with utilization bars and activity feed
03 · INBOX

Pick up where you left off.

Open a session — three columns greet you: In progress (your WIP tasks), Recent verdicts (what your agents finished while you slept), Decisions (every gate approval is logged with reasoning). Stop re-explaining your project to Claude.

  • Pending decisions — one-click approve / reject inline.
  • P0, blocked, stale (in-progress > 48h) auto-surfaced.
  • Append-only ~/.great_cto/decisions.md — query across all your projects.
great_cto inbox with Resume card and pending gates
05 · MEMORY

Four layers. Every Friday smarter.

PROJECT.md (archetype, goals) · lessons.md (per-project retros) · decisions.md (every gate approval with rationale) · verdicts/ (every agent verdict logged). Agents query memory before reading source files — solved problems stay solved.

memory browser showing 4 layers
Why you ship without waking up at 2am

Twelve independent reviews.
Each finds what the other eleven miss.

Cursor and Copilot run one review pass. We run twelve. Every finding rated P0 / P1 / P2. P0 blocks the gate. You can't accidentally ship a SQL injection.

01 · PERFORMANCE
N+1, hot loops
"GET /orders runs 47 queries for 1 page — N+1 in OrderService.list()"
P1
02 · SECURITY
Injection, IDOR, JWT
"JWT verified but iss/aud claims not checked — token from any tenant valid"
P0
03 · READABILITY
Naming, complexity
"32-line nested ternary in checkout.ts:284 — extract to function"
P2
04 · SQL SAFETY
Raw interpolation
"Unparameterized ORDER BY in customers.ts:91 — injection vector"
P0
05 · LLM TRUST
Prompt injection
"User input flows into system prompt without sanitization (RAG step 4)"
P0
06 · SIDE EFFECTS
Mutation in conditions
"Mutation inside if() shadows logging — duplicate webhook events"
P1
07 · DATA PRIVACY
PII, GDPR/HIPAA
"Email logged in request middleware:54 — GDPR Article 5(1)(c)"
P1
08 · ERROR HANDLING
Swallowed exceptions
"try/catch around 3 unrelated failure modes — debugging blackhole"
P1
09 · CONCURRENCY
Races, deadlocks
"Cache stampede on /pricing — 1.4s tail at p99 under load"
P1
10 · DEPS
CVEs, abandonment
"lodash 4.17.21 has CVE-2026-1234 — bump to 4.17.22"
P1
11 · API CONTRACTS
Breaking changes
"PATCH /users now requires 'role' field — breaks v1 clients"
P0
12 · DESIGN SYSTEM
Tokens, a11y
"23 hex codes hardcoded — design tokens from src/theme.ts not used"
P2
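The SQL-safety angle above flags an unparameterized ORDER BY because sort columns cannot be bound as query parameters; the standard fix is an allowlist. A minimal sketch with invented table and column names:

```python
# ORDER BY targets can't be parameterized like values can,
# so validate the column against an explicit allowlist.
# Table and column names here are invented for illustration.
ALLOWED_SORT_COLUMNS = {"name", "created_at", "balance"}

def customers_query(sort_column: str) -> str:
    # BAD: f"SELECT * FROM customers ORDER BY {sort_column}"  -- injection vector
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_column}")
    return f"SELECT * FROM customers ORDER BY {sort_column}"
```

Anything outside the allowlist — including a payload like `1; DROP TABLE customers` — is rejected before it ever reaches the database.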
Auto-detected in 2 seconds

25 archetypes. Each with its own gates.

We scan your package.json, pyproject.toml, Cargo.toml, README, and code structure. Then we pick the right agent set, security tier, and compliance checklists.

🌐
web-service
gdpr
Learn more →
🤖
agent-product
eu-ai-act · owasp-llm
Learn more →
🧠
ai-system
eu-ai-act
Learn more →
💳
commerce
pci-dss · gdpr
Learn more →
🏦
fintech
pci · sox · kyc-aml
Learn more →
🩺
healthcare
hipaa · hitech
Learn more →
📱
mobile-app
store-policy
Learn more →
🔧
cli-tool
Learn more →
📦
library
Learn more →
🌍
browser-extension
csp · mv3
Learn more →
🎮
game
coppa · age-rating
Learn more →
⛓️
web3
soc2
Learn more →
📊
data-platform
gdpr
Learn more →
🛠️
devtools
openssf · soc2
Learn more →
📡
iot-embedded
iso27001 · etsi
Learn more →
☁️
infra
soc2
Learn more →
🛡️
regulated
custom
Learn more →
🏢
enterprise-saas
soc2 · sso · multi-tenant
Learn more →
🧪
mlops
eu-ai-act · drift
Learn more →
streaming
kafka · exactly-once
Learn more →
🏪
marketplace
kyc · 1099-k · dsa
Learn more →
📰
cms
dmca · wcag · seo
Learn more →
🎓
edtech
coppa · ferpa · wcag-aa
Learn more →
🏛️
gov-public
fedramp · nist-800-53 · 508
Learn more →
🛡️
insurance
naic · solvency · actuarial
Learn more →

Detection uses heuristics, plus an Anthropic Haiku second-opinion call (~$0.001) when confidence is low. You can override with --archetype NAME.

vs the obvious alternatives

"Why not just use Cursor?"

Cursor and Copilot are great editors. They are not SDLC pipelines. Here's what each does — honestly.

                                  great_cto (our pick)       Cursor        Copilot Workspace  Claude Projects
SDLC orchestration
Multi-agent SDLC pipeline         ✓ 34 specialists           —             —                  —
Auto archetype detection          ✓ 25 types                 —             —                  —
12-angle code review              ✓                          single-pass   single-pass        —
Compliance gates (PCI / HIPAA / SOX / EU AI Act)   ✓        —             —                  —
Memory & visibility
Persistent memory                 ✓ decisions.md + verdicts  chat-only     —                  chat scope
Multi-project view                ✓                          —             —                  —
Public sharable reports           ✓                          —             —                  —
Ownership & cost
Open source                       ✓ MIT                      —             —                  —
Runs locally                      ✓                          partial       —                  —
Pay your own API                  ✓                          —             —                  —
Pricing                           $0 + your API              $20/mo        $39/mo             $20/mo

We're not an editor — we orchestrate the process around your editor. Use Cursor inside the loop if you want.

The part no other tool has

Your agents get smarter every Friday.

Cursor forgets your project the moment you close the tab. GreatCTO synthesizes what it learns into a 10–50 KB local memory that travels across sessions, machines, and projects.

L1 · PROJECT

What this project IS

Archetype, size, compliance frameworks, owners, team patterns. Set on first /start.

.great_cto/PROJECT.md
L2 · CODEBASE

Where the seams are

God-nodes, entry points, public API surface, routing. Built in 30s by zero-dep bash — no LLM cost.

.great_cto/CODEBASE.md
L3 · BRAIN

What you've learned

Patterns in use, what failed, decisions made. Synthesized weekly + after every postmortem.

.great_cto/brain.md
L4 · CROSS-PROJECT

Patterns that beat 4-hour debugs

Promoted via /crystallize after a P0. Surfaces in every agent's Step 0 — across every project, forever.

~/.great_cto/global-patterns/
94%
MTTR reduction · second occurrence
After every P0 incident, agents extract the detection order that worked. Next time the same shape of problem hits — any project, any week, any engineer — the pattern surfaces in the agent's Step 0. A connection-pool exhaustion that cost you 4 hours in Q1? 30 seconds in Q3.
Pricing

$34/month.
That's the whole bill.

GreatCTO is open source (MIT). You pay your own Anthropic API tokens. We don't see them. We don't bill you. Nothing to subscribe to.

Typical product team · 20 pipeline runs/month

quick · config / typo     $0.10   × 10    $1
quick · new endpoint      $1.00   × 6     $6
standard · feature        $5.00   × 3     $15
deep · cross-cutting      $12.00  × 1     $12
Total                                     ~$34/mo
+ Routine triage auto-routed to Kimi K2 → 60–80% cost cut on log clustering
+ No per-seat. No SaaS. No data leaves your laptop.
Cursor Business
$40/seat/mo · ~$400/mo for a 10-eng team
Devin
$500/mo flat
GreatCTO
~$34/mo total · pay only when you ship
Honesty section

Who is this actually for?

✓ Perfect if you

  • Are a solo founder or technical CTO with 4–25 engineers
  • Use Claude Code, Cursor, OpenAI Codex CLI, Aider, or Continue
  • Ship to production weekly or faster
  • Are tired of being the bottleneck on every architecture call
  • Want to see what your code review actually catches

✗ Not yet if you

  • Have a 50+ engineering team with established RFC + review process
  • Don't use any AI-coding tool yet (start with Cursor or Claude Code first)
  • Run a regulated bank (we're not PCI/HIPAA/SOC2 audited — yet)
  • Need a managed SaaS with 99.99% uptime SLA — this runs on your laptop
  • Don't have any process to replace (start with one human reviewer first)
Quick start

30 seconds. No signup.
No credit card.

01 · init
$ npx great-cto init
  archetype detected
  34 agents loaded
  ready in 12s
02 · pick your tool
$ npx great-cto adapt \
    --platform all
  → AGENTS.md
  → CLAUDE.md
  → .cursorrules + .cursor/rules/
  → .aider.conf.yml
  → .continue/rules.md
03 · start a feature
$ /start "add 2FA"
  → ARCH · review · QA · sec
>> DECISION 1?
$ ship it
  → done.
CI gate
$ npx great-cto ci ./ \
   --sarif results.sarif
Auto-detects $GITHUB_ACTIONS — emits inline error annotations on PR diffs.
MCP server
$ npx great-cto mcp
Add to Claude Desktop / Cursor / Continue config — exposes 5 tools to chat.
Webhook receiver
$ npx great-cto serve
HMAC-verified GitHub / Sentry receivers + Slack/Discord/PagerDuty fan-out.
FAQ

The questions everyone asks before installing.

Will it commit to my repo without me knowing?
No. Every commit goes through your local git. Two human gates per feature. You can audit every diff before approving.
What if it makes a mistake on the architecture?
You reject the gate. Architect re-drafts with your reasoning. The conversation is in your terminal — same as a normal Claude Code, Cursor, or Codex session.
How is this different from Cursor / Aider?
Cursor writes code in the IDE. Aider edits files from CLI. GreatCTO sits one level above — it decides which agents to run, which gates to enforce, which compliance checks to load. v2.4+ runs inside all of them: npx great-cto adapt --platform cursor generates .cursorrules, --platform aider generates .aider.conf.yml, etc. — same archetype, same gates, every tool.
I use OpenAI Codex CLI / Continue / something else. Does it work?
Yes. Codex CLI reads AGENTS.md verbatim; we generate it. Continue reads .continue/rules.md; we generate that too. Plus the great-cto mcp server exposes 5 tools (scan, list_rules, detect_archetype, estimate_cost, query_decisions) over stdio + SSE — any MCP-aware host can call them. Setup snippets in the README →
What about CI? Do I need a chat to use this?
No. npx great-cto ci ./ drops into GitHub Actions / GitLab / any CI as a single step. Auto-detects $GITHUB_ACTIONS and emits inline ::error file=... annotations on PR diffs. Outputs SARIF for the GitHub Security tab + JUnit XML for test reporters. Exit 0 clean, 1 findings, 2 setup error.
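A minimal GitHub Actions wiring might look like this; the workflow name, Node version, and SARIF upload step are illustrative — only the `great-cto ci` invocation comes from above:

```yaml
# Illustrative workflow; names and versions are placeholders.
name: great-cto-gate
on: [pull_request]
jobs:
  gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Exit codes: 0 clean, 1 findings, 2 setup error.
      - run: npx great-cto ci ./ --sarif results.sarif
      # Surface findings in the GitHub Security tab even when the gate fails.
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: results.sarif
```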
Does it work on existing codebases?
Yes. /audit reads the repo, builds CODEBASE.md, generates a backlog of gaps. Tested on JS/TS, Python, Rust, Go. ~2 minutes for 100k LOC.
What about my secrets?
Nothing leaves your machine except Claude API calls (your tokens, your Anthropic account). The board, the memory, the patterns — all local files in .great_cto/. You can .gitignore them or commit them — your call.
Can I disable an agent I don't need?
Edit agents: [...] in .great_cto/PROJECT.md. Or override at runtime: /start "feature" --agents=architect,senior-dev,qa.
Will my engineers hate it?
It runs in whatever AI assistant they already use — Claude Code, Cursor, Codex CLI, Aider, Continue. Same UX as their normal session, same keyboard shortcuts. They'll notice slower first runs (2 min for the audit) and dramatically faster ship times (LLM agent ~45min vs human team 2–3 days). The gate prompts ask them, not their manager. Less ceremony, not more.
What happens if Anthropic changes pricing?
Routine triage already auto-routes to Kimi K2 (cheaper Sonnet-equivalent). You can pin a specific model in PROJECT.md. Worst case: it costs 2× for 6 weeks until we add the next provider. The plugin is MIT — you can fork.
Enough reading.

The bottleneck is you.
Stop being it.

$ npx great-cto init
60 seconds to install · 12 minutes to your first /start
Open source · MIT · made by an engineer who got tired of his own loops