Overmind: the ONLY multi-player, multi-agent coding agent

Inspiration

AI coding assistants are powerful when used alone. But the moment a team tries to use them together on the same codebase, everything breaks.

We've all experienced it: merge conflicts, race conditions, and simultaneous edits that step on each other until the work spirals into chaos. Traditional AI tools assume a single user. What if we built an AI system designed specifically for multiplayer collaboration?

That's Overmind. We asked: What if teams could submit prompts in real-time, with the AI understanding not just individual requests, but the team's emerging narrative about what they're building? What if the system automatically clustered related work into features without manual documentation? What if every team member knew exactly what was being built and why?


What It Does

Overmind is a multiplayer terminal coding REPL where teams can collaborate on the same codebase in real-time, with AI-powered prompt evaluation and automatic code execution.

Core Features:

  1. WebSocket-Based Party System

    • Host creates a "party" with a 4-letter code
    • Teammates join from anywhere (local Wi-Fi or globally via ngrok)
    • Real-time sync of all team members and their activity
  2. Story-Based Prompts with Intelligent Routing

    • Plain English Queries: Team members type natural language prompts
    • Story Agent Evaluation (Gemini): Automatically classifies each prompt into one of three categories:
      • reject — Off-topic, not relevant to project
      • create_new — New, distinct feature direction
      • assign_existing — Continuation of existing feature
    • Auto-Execution: Non-rejected prompts execute immediately (no manual approval needed)
  3. Story Agent (Feature Clustering)

    • Continuous semantic clustering via Gemini
    • Learns what the team is building as they work
    • Maintains living project narrative in story.md
    • Automatically rejects off-topic queries
  4. Execution Pipeline

    • File locking prevents race conditions when multiple team members execute simultaneously
    • Modal orchestrator sandboxes code execution safely
    • Stages: Acquire locks → Sync workspace → Spawn agent → Agent executes → Extract & apply diffs
    • Real-time progress updates with execution stage visibility
  5. Beautiful Terminal UI (Ink + React)

    • Activity feed showing team actions in real-time
    • Member status tracking (idle/typing/executing)
    • Execution progress with stage indicators and spinner
    • Party panel with member list
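
The five pipeline stages and the stage indicator shown in the UI could be modeled roughly as follows. This is a minimal sketch, not the project's actual code: the identifiers (STAGES, nextStage, describeStage, formatProgress) and the "[2/5]" display format are assumptions made for illustration.

```typescript
// Stage order follows the pipeline described above; the names are illustrative.
const STAGES = [
  "acquire_locks",
  "sync_workspace",
  "spawn_agent",
  "agent_executes",
  "apply_diffs",
] as const;

type Stage = (typeof STAGES)[number];

interface StageProgress {
  stage: Stage;
  index: number; // 1-based position in the pipeline
  total: number;
}

// Returns the stage that follows `current`, or null once the pipeline is done.
function nextStage(current: Stage): Stage | null {
  const next = STAGES[STAGES.indexOf(current) + 1];
  return next === undefined ? null : next;
}

// Builds the payload a progress broadcast could carry.
function describeStage(stage: Stage): StageProgress {
  return { stage, index: STAGES.indexOf(stage) + 1, total: STAGES.length };
}

// Renders a compact indicator such as "[3/5] spawn_agent".
function formatProgress(p: StageProgress): string {
  return `[${p.index}/${p.total}] ${p.stage}`;
}
```

Keeping the stage list in one constant means the server and the terminal UI can share a single source of truth for ordering and totals.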

How We Built It

Architecture Overview

Tech Stack

Frontend:

  • Ink + React for terminal UI
  • WebSocket client with auto-reconnect
  • Type-safe Zod validation

Backend:

  • Node.js with ESM
  • WebSocket server (ws library)
  • Commander for CLI

AI/ML Integration:

  • Gemini 3.0 Flash (via @google/genai) — Story Agent feature clustering
  • Tool-calling loops for iterative code generation (up to 10 rounds)

Infrastructure:

  • Supabase (PostgreSQL) — Features and queries persistence
  • Modal — Remote code execution sandboxes
  • GitHub API — Integration points

DevOps:

  • TypeScript strict mode
  • Zod runtime validation
  • File locking for race condition prevention
  • Git integration via simple-git

Module Structure

  1. Party System (src/server/party.ts)

    • Member tracking and lifecycle
    • Prompt queuing and sequencing
    • WebSocket broadcast orchestration
  2. Story Agent (src/server/story/agent.ts)

    • Feature clustering via Gemini with structured JSON
    • Three-way classification (reject/create_new/assign_existing)
    • Continuous database polling for unclustered queries
    • story.md regeneration after each evaluation
  3. Execution Pipeline (src/server/orchestrator/)

    • File lock management with retry logic
    • Workspace sync to Modal sandbox
    • Run polling and status tracking
    • Diff extraction and application to local codebase
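
"File lock management with retry logic" can be sketched as a capped exponential backoff around a non-blocking acquire attempt. The option names, the backoff schedule, and the tryAcquire callback below are assumptions for illustration; the source does not say which retry policy Overmind uses.

```typescript
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound for any single delay
}

// Delays before each retry: base, 2*base, 4*base, ... capped at maxDelayMs.
function backoffDelays(opts: RetryOptions): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < opts.maxAttempts - 1; retry++) {
    delays.push(Math.min(opts.baseDelayMs * 2 ** retry, opts.maxDelayMs));
  }
  return delays;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls tryAcquire until it succeeds or the attempts run out.
// Resolves to true if the lock was obtained, false otherwise.
async function acquireWithRetry(
  tryAcquire: () => boolean,
  opts: RetryOptions,
): Promise<boolean> {
  if (tryAcquire()) return true;
  for (const delay of backoffDelays(opts)) {
    await sleep(delay);
    if (tryAcquire()) return true;
  }
  return false;
}
```

Returning false instead of throwing leaves the caller free to decide whether a contended file should queue the prompt or report a failure to the submitter.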

Key Design Decisions

  1. Single Prompt Type: Story prompts for natural expression (no special syntax required)
  2. Stateful Story Agent: Team narrative emerges naturally without manual documentation
  3. Automatic Routing: Gemini decides reject vs. create vs. assign—no human gatekeeping
  4. Auto-Execution on Accept: Non-rejected prompts execute immediately (reduces friction)
  5. Pessimistic Locking: File locks prevent race conditions during concurrent execution
  6. Privacy-Preserving: Prompt content is visible only to the submitter and the host; activity events are broadcast to all members
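
The privacy rule in decision 6 amounts to a per-recipient view of each prompt event. The sketch below assumes hypothetical type and field names (PromptEvent, Recipient, viewFor, and the "prompt_full"/"prompt_activity" kinds); it only illustrates the rule, not Overmind's real wire protocol.

```typescript
interface PromptEvent {
  promptId: string;
  submitterId: string;
  content: string;
}

interface Recipient {
  memberId: string;
  isHost: boolean;
}

// Authorized members get the prompt text; everyone else gets activity only.
type OutboundEvent =
  | { kind: "prompt_full"; promptId: string; submitterId: string; content: string }
  | { kind: "prompt_activity"; promptId: string; submitterId: string };

function viewFor(event: PromptEvent, recipient: Recipient): OutboundEvent {
  const authorized =
    recipient.isHost || recipient.memberId === event.submitterId;
  if (authorized) {
    return {
      kind: "prompt_full",
      promptId: event.promptId,
      submitterId: event.submitterId,
      content: event.content,
    };
  }
  // The content field is omitted entirely rather than blanked out.
  return {
    kind: "prompt_activity",
    promptId: event.promptId,
    submitterId: event.submitterId,
  };
}
```

A broadcast would then call viewFor once per connected member, so the prompt text never leaves the server for unauthorized recipients.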

Challenges We Ran Into

1. Modal API Async Integration

Challenge: Modal's blocking operations prevented truly concurrent execution of multiple prompts.

Solution: Leveraged Modal's async/await patterns:

  • Non-blocking workspace uploads
  • Proper concurrency control in polling loops
  • Reduced latency per operation

2. LLM Reliability (Tool-Calling Loops)

Challenge: Single-shot LLM calls often produced incomplete code or invalid syntax.

Solution:

  • Iterative tool-calling loop (up to 10 rounds)
  • Agent reads files → analyzes → writes changes → reports
  • Gemini refines implementation across multiple turns
  • Enforces maxRounds limit to prevent infinite loops
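
A bounded tool-calling loop of this shape might look like the sketch below. To stay self-contained it makes several assumptions: the model is a synchronous stub passed in as a function (a real Gemini call would be asynchronous), the only tools are a hypothetical read_file and write_file, and files live in an in-memory Map. Only the maxRounds cap of 10 comes from the description above.

```typescript
// One model turn: either request a tool call or declare the task finished.
type ModelTurn =
  | { type: "tool_call"; tool: "read_file" | "write_file"; path: string; content?: string }
  | { type: "done"; summary: string };

interface LoopResult {
  rounds: number;
  finished: boolean; // false when the round limit was reached first
  summary: string | null;
}

function runToolLoop(
  model: (transcript: string[]) => ModelTurn,
  files: Map<string, string>,
  maxRounds = 10,
): LoopResult {
  const transcript: string[] = [];
  for (let round = 1; round <= maxRounds; round++) {
    const turn = model(transcript);
    if (turn.type === "done") {
      return { rounds: round, finished: true, summary: turn.summary };
    }
    if (turn.tool === "read_file") {
      // Feed the file contents back so the next turn can use them.
      transcript.push(`read ${turn.path}: ${files.get(turn.path) ?? "<missing>"}`);
    } else {
      files.set(turn.path, turn.content ?? "");
      transcript.push(`wrote ${turn.path}`);
    }
  }
  // The model never reported completion: stop instead of looping forever.
  return { rounds: maxRounds, finished: false, summary: null };
}
```

The explicit finished flag lets the caller distinguish a completed run from one that was cut off by the round limit.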

3. Feature Clustering at Scale

Challenge: As features accumulate, the context sent to the model grows, so the Story Agent needs an efficient way to filter which features it considers.

Solution:

  • Limit active features to most recent N (default 15)
  • Continuous clustering of new queries prevents backlog
  • Database foreign keys maintain consistency
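
Capping the active feature set could be as simple as the following sketch. It assumes "most recent" means most recently updated and uses a hypothetical Feature shape with an updatedAt timestamp; only the default of 15 comes from the text above.

```typescript
interface Feature {
  id: string;
  title: string;
  updatedAt: number; // epoch milliseconds of the latest assigned query (assumed field)
}

// Keeps the `limit` most recently updated features, newest first.
// The input array is copied, so the caller's ordering is left untouched.
function selectActiveFeatures(
  features: readonly Feature[],
  limit = 15,
): Feature[] {
  return [...features]
    .sort((a, b) => b.updatedAt - a.updatedAt)
    .slice(0, Math.max(0, limit));
}
```

Only the selected features would then be included in the classification prompt, which bounds the context size no matter how long the project runs.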

Accomplishments We're Proud Of

Story Agent (Intelligent Feature Clustering)

Why it matters: The system understands what the team is building and maintains a living narrative automatically. No manual documentation needed.

Technical feat:

  • Structured JSON evaluation from Gemini with reasoning
  • Continuous polling and clustering of new prompts
  • Three-way routing (reject/create/assign) with clear decision boundaries
  • story.md regeneration that evolves with the project
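
The structured evaluation can be treated as untrusted JSON and narrowed into a three-way union before anything executes. The sketch below is hand-rolled so it runs without dependencies (the project itself uses Zod for this kind of validation), and the field names reasoning, featureTitle, and featureId are assumptions rather than the actual response schema.

```typescript
type Verdict =
  | { decision: "reject"; reasoning: string }
  | { decision: "create_new"; reasoning: string; featureTitle: string }
  | { decision: "assign_existing"; reasoning: string; featureId: string };

// Parses the model's raw text; returns null for anything malformed.
function parseVerdict(raw: string): Verdict | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { decision, reasoning, featureTitle, featureId } =
    data as Record<string, unknown>;
  if (typeof reasoning !== "string") return null;
  switch (decision) {
    case "reject":
      return { decision, reasoning };
    case "create_new":
      return typeof featureTitle === "string"
        ? { decision, reasoning, featureTitle }
        : null;
    case "assign_existing":
      return typeof featureId === "string"
        ? { decision, reasoning, featureId }
        : null;
    default:
      return null;
  }
}

// Auto-execution rule: every verdict except "reject" proceeds to execution.
function shouldExecute(verdict: Verdict): boolean {
  return verdict.decision !== "reject";
}
```

Treating a malformed response as null rather than as an implicit accept keeps a bad model output from triggering execution.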

Zero Manual Synchronization

Why it matters: Teams don't fight over merge conflicts or manual syncing—WebSocket parties broadcast all changes in real-time. Everyone sees what everyone else is doing.

Technical feat:

  • Zod-validated WebSocket protocol
  • Deterministic reducer for state consistency
  • Privacy controls (prompt content only to authorized users)
  • Real-time activity feed
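
A deterministic reducer for party state might look like this sketch: a pure function from the previous state and one event to the next state, so every client that replays the same events ends up with the same view. The event names and state shape are assumptions for illustration; the member statuses (idle/typing/executing) are the ones listed under Core Features.

```typescript
type MemberStatus = "idle" | "typing" | "executing";

interface PartyState {
  members: Record<string, MemberStatus>;
  feed: string[];
}

type PartyEvent =
  | { type: "member_joined"; memberId: string }
  | { type: "member_left"; memberId: string }
  | { type: "status_changed"; memberId: string; status: MemberStatus };

const initialState: PartyState = { members: {}, feed: [] };

// Pure: never mutates `state`, reads no clocks or randomness.
function reduce(state: PartyState, event: PartyEvent): PartyState {
  switch (event.type) {
    case "member_joined":
      return {
        members: { ...state.members, [event.memberId]: "idle" },
        feed: [...state.feed, `${event.memberId} joined`],
      };
    case "member_left": {
      const members = { ...state.members };
      delete members[event.memberId];
      return { members, feed: [...state.feed, `${event.memberId} left`] };
    }
    case "status_changed":
      // Ignore status updates for members who are not in the party.
      if (!(event.memberId in state.members)) return state;
      return {
        members: { ...state.members, [event.memberId]: event.status },
        feed: [...state.feed, `${event.memberId} is ${event.status}`],
      };
  }
}
```

Because the reducer is pure, a client that reconnects can rebuild its state by replaying the event log from initialState.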

Parallel Execution with File Safety

Why it matters: Multiple team members can submit prompts simultaneously without conflicts or race conditions.

Technical feat:

  • Pessimistic file locking with retry logic
  • Workspace versioning per sandbox
  • Deterministic diff application
  • Atomic lock acquire/release
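
Atomic acquisition can be read as "take every file the prompt needs, or none of them". The in-memory table below is a sketch under that assumption; the class and method names are hypothetical, and a real multi-process setup would need a shared store rather than a Map.

```typescript
class FileLockTable {
  private owners = new Map<string, string>(); // file path -> owner id

  // Takes every requested path or none of them. The check and the writes
  // run synchronously, so no other acquire can interleave between them.
  tryAcquireAll(owner: string, paths: readonly string[]): boolean {
    const blocked = paths.some((path) => {
      const holder = this.owners.get(path);
      return holder !== undefined && holder !== owner;
    });
    if (blocked) return false;
    paths.forEach((path) => this.owners.set(path, owner));
    return true;
  }

  // Releases only the locks held by `owner`.
  releaseAll(owner: string): void {
    this.owners.forEach((holder, path) => {
      if (holder === owner) this.owners.delete(path);
    });
  }

  holderOf(path: string): string | undefined {
    return this.owners.get(path);
  }
}
```

All-or-nothing acquisition avoids the partial-lock state in which two executions each hold one file the other needs.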

Continuous Feature Narrative

Why it matters: As the team works, story.md automatically evolves. No one has to maintain documentation—the AI learns what you're building.

Technical feat:

  • Queries → Database → Story Agent → Features → story.md (continuous loop)
  • Semantic clustering with project context
  • Regeneration after each evaluation
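
The last step of that loop, turning clustered features into story.md, could be a pure rendering function like the sketch below. The document layout and the StoryFeature shape are assumptions; the source does not show what the generated file actually looks like.

```typescript
interface StoryFeature {
  title: string;
  summary: string;
  queries: string[]; // prompts that were clustered into this feature
}

// Renders the whole narrative from scratch on every call, so the file
// always reflects the current feature set instead of accumulating edits.
function renderStory(
  projectName: string,
  features: readonly StoryFeature[],
): string {
  const lines: string[] = [`# ${projectName}: Story`, ""];
  for (const feature of features) {
    lines.push(`## ${feature.title}`, "", feature.summary, "");
    for (const query of feature.queries) {
      lines.push(`- ${query}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}
```

Regenerating the full file from the database after each evaluation keeps story.md consistent with the stored features even if an earlier write was interrupted.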

Production-Quality Code

Why it matters: 40 TypeScript files with zero unused imports, parameters, or variables. Every file has headers explaining purpose/invariants.

Technical feat: TypeScript strict mode, Zod runtime validation, comprehensive error handling, file-lock guarantees.


What We Learned

1. Real-time Multiplayer Systems Require Strong Consistency Guarantees

Building systems where multiple agents modify the same state simultaneously teaches you about:

  • Pessimistic vs. optimistic locking trade-offs
  • File-level locking as a simple, effective solution
  • Ensuring operations complete atomically or not at all

2. AI Classification Works Best With Rich Context

In our experience, Gemini supplied with the project's feature descriptions made better routing decisions than larger models given no context. We learned:

  • Structured JSON output forces clearer reasoning
  • Temperature = 0.0 matters for repeatable clustering (it reduces, but does not fully eliminate, run-to-run variation)
  • Decision heuristics matter more than raw model size

3. Type Safety Prevents Entire Classes of Bugs

Zod runtime validation + TypeScript strict mode caught bugs before they reached users. With 40 files coordinating over WebSockets:

  • Types become guardrails
  • Zod parsing prevents protocol violations
  • Exhaustive case handling catches logic errors
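
Exhaustive case handling in TypeScript usually relies on a discriminated union plus a never check, as in this sketch. The message types shown are hypothetical stand-ins for the real protocol.

```typescript
type ServerMessage =
  | { type: "member_joined"; memberId: string }
  | { type: "prompt_rejected"; promptId: string; reason: string }
  | { type: "execution_stage"; promptId: string; stage: string };

// Reaching this at runtime means a message type was not handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled message: ${JSON.stringify(value)}`);
}

function toFeedLine(msg: ServerMessage): string {
  switch (msg.type) {
    case "member_joined":
      return `${msg.memberId} joined the party`;
    case "prompt_rejected":
      return `prompt ${msg.promptId} rejected: ${msg.reason}`;
    case "execution_stage":
      return `prompt ${msg.promptId} is at stage ${msg.stage}`;
    default:
      // If a new variant is added to ServerMessage and not handled above,
      // `msg` is no longer `never` here and this line fails to compile.
      return assertNever(msg);
  }
}
```

The compiler then flags every switch that forgot the new variant, which is how a protocol change surfaces as a build error instead of a silent runtime bug.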

4. UX Matters in Developer Tools

The difference between "working..." and seeing real-time output (agent reading files, writing code, execution stages) makes the system feel trustworthy and alive.


What's Next for Overmind

Medium Term (Next Quarter)

  • Code Review Mode: Teammates can propose changes without auto-execution
  • Custom Evaluation Rules: Teams can define approval workflows

Long Term (Next Semester+)

  • Natural Language Debugging: "Why did that execution fail?" → Agent explains
  • IDE Plugins: Bring Overmind into VSCode/JetBrains

Vision

Overmind should become the de facto operating system for team coding. Just like Git transformed how teams manage code, we want Overmind to transform how teams write code together with AI.

A world where:

  • Teams collaborate naturally without conflict friction
  • Project narrative emerges from team work, not documentation
  • AI agents are helpful teammates, not gatekeepers
  • Multiplayer coding is the default experience

Try It Out

# Clone & install
git clone git@github.com:atharva789/Overmind.git
cd Overmind
npm ci && npm run build

# Host a session
overmind host

# Join as teammate (different terminal)
overmind join <invite-code>

# Try it:
# Terminal 1 (Host): See team activity, member statuses
# Terminal 2: Type "Add error handling to API calls"
# -- Story Agent evaluates
# -- Execution starts automatically
# -- DiffBlock shows changes
# -- Activity feed updates in real-time

Overmind: Multiplayer AI coding, without the chaos.
