Massively parallel autonomous coding: decompose any project into hundreds of tasks and execute them simultaneously in isolated cloud sandboxes.
Longshot takes a natural-language build request and turns it into working code. A root planner breaks the request into granular tasks, dispatches them to ephemeral Modal sandboxes running in parallel, and merges the results back into your repository. A reconciler monitors build health throughout and automatically spawns fix tasks when something breaks.
The entire system is stateless. Workers are ephemeral, state lives only in Git, and the orchestrator runs locally while all execution happens in the cloud.
```mermaid
flowchart TD
    User(["👤 User Request"])

    subgraph Python["Python Layer"]
        CLI["main.py\nCLI + Rich Dashboard"]
    end

    subgraph Node["Node.js Orchestrator"]
        RP["Root Planner\n(LLM)"]
        SP["Subplanner\n(LLM)"]
        TQ[("Task Queue")]
        MQ["Merge Queue\n(serial: rebase / ff / merge)"]
        RC["Reconciler\n(build/test sweeps)"]
        FN["Finalization"]
    end

    subgraph Cloud["Modal Cloud Sandboxes (parallel)"]
        W1["Worker"]
        W2["Worker"]
        W3["Worker"]
        WN["Worker …"]
    end

    User --> CLI
    CLI -->|"spawns subprocess"| RP
    CLI <-->|"NDJSON stdout (4 Hz)"| Node
    RP -->|"large tasks"| SP
    RP --> TQ
    SP --> TQ
    TQ --> W1 & W2 & W3 & WN
    W1 & W2 & W3 & WN -->|"result.json\n+ git branch push"| MQ
    MQ -->|"conflict → new fix task"| TQ
    MQ --> RC
    RC -->|"fix tasks"| TQ
    RC --> FN
```
Three TypeScript packages (scope `@longshot/*`):

- `packages/core` — shared types, logging, LLM client, git utilities
- `packages/orchestrator` — planner loop, worker pool, merge queue, reconciler
- `packages/sandbox` — Modal sandbox definition and worker harness
Python layer (`main.py`, `dashboard.py`) wraps the Node orchestrator with a human-readable CLI and optional Rich TUI dashboard.
Python is a pure display layer. `main.py` spawns the Node orchestrator and reads its NDJSON stdout to render the dashboard at 4 Hz.
NDJSON is the universal bus. Every status update, task event, and result flows as newline-delimited JSON from the orchestrator to the dashboard.
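This contract keeps the display layer trivial to implement. As a minimal sketch (the event field names here are illustrative, not Longshot's actual schema), a consumer only needs to read the orchestrator's stdout line by line and parse each line as JSON:

```python
import json
import subprocess
from typing import Iterator

def stream_events(cmd: list[str]) -> Iterator[dict]:
    """Yield one parsed event per NDJSON line from a subprocess's stdout.

    Malformed lines are skipped so a garbled line can't crash the display layer.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue
    proc.wait()

# Example with a stand-in command that emits one NDJSON status line:
for event in stream_events(["printf", '{"type":"status","workers":3}\n']):
    print(event["type"], event["workers"])  # prints: status 3
```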
Modal currently provides the cloud sandboxes. Each worker runs in a fully isolated container with its own clone of the target repository.
Longshot already runs parallel planning and execution across isolated workers. The moonshot is to evolve this into a self-driving codebase loop that can handle larger scopes over longer time horizons while keeping human review as the final gate.
- Background agents: long-lived agents that can keep working asynchronously and report back with reviewable artifacts.
- Long-running execution: multi-hour task completion with explicit planning, progress memory, and follow-through.
- Issue-to-PR automation: tighter pipelines from problem intake to draft PRs, with CI + human approval gates.
- Agent reliability: better harness design for context rollover, retries, verification, and anti-fragile recovery.
- Pluggable sandbox providers: keep Modal as the current default, while adding adapter-based support for alternatives like E2B (and other compatible runtimes).
- Pluggable agent harnesses: expand beyond the current Pi harness to support multiple coding agents (for example OpenCode, Claude Code, and other compatible harnesses).
- Configuration-first selection: make sandbox + harness provider choice a deploy-time config concern instead of a code fork, so open-source contributors can swap backends safely.
- Capability-aware routing: normalize provider capabilities (filesystem, shell, network, session longevity, snapshots) and route tasks to the best compatible backend.
- Why We Built Our Background Agent (Ramp)
- Minions: Stripe's one-shot, end-to-end coding agents
- Minions Part 2 (Stripe)
- The Self-Driving Codebase — Background Agents and the Next Era of Enterprise Software Delivery
- The third era of AI software development (Cursor)
- Towards self-driving codebases (Cursor)
- Expanding our long-running agents research preview (Cursor)
- Python 3.12+
- Node.js 22+ (runtime engine for the orchestrator)
```shell
pipx install longshot
longshot --version
```

Or via Homebrew:

```shell
brew tap andrewcai8/longshot
brew install longshot
longshot --version
```

On first run, the CLI downloads a matching runtime bundle and installs Node runtime dependencies into a local cache (`~/.longshot/runtime/<version>` by default).
Optional overrides:
- `LONGSHOT_RUNTIME_URL`: explicit URL for the runtime tarball
- `LONGSHOT_RELEASE_REPO`: GitHub repo slug for release downloads (default: `Blastgits/longshot`)
- `LONGSHOT_CACHE_DIR`: custom cache root for runtime assets
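A hedged sketch of how these overrides could compose (the function name and the release-asset URL pattern are illustrative assumptions, not the CLI's actual logic):

```python
import os
from pathlib import Path

DEFAULT_RELEASE_REPO = "Blastgits/longshot"

def resolve_runtime(version: str) -> tuple[str, Path]:
    """Resolve the runtime tarball URL and local cache directory.

    Honors LONGSHOT_RUNTIME_URL, LONGSHOT_RELEASE_REPO, and
    LONGSHOT_CACHE_DIR, falling back to the documented defaults.
    """
    cache_root = Path(os.environ.get("LONGSHOT_CACHE_DIR", Path.home() / ".longshot"))
    cache_dir = cache_root / "runtime" / version

    url = os.environ.get("LONGSHOT_RUNTIME_URL")
    if not url:
        repo = os.environ.get("LONGSHOT_RELEASE_REPO", DEFAULT_RELEASE_REPO)
        # Illustrative asset path; the real release asset name may differ.
        url = f"https://github.com/{repo}/releases/download/v{version}/runtime.tar.gz"
    return url, cache_dir
```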
For one-time publishing setup (PyPI trusted publishing + Homebrew tap wiring) and first release steps, see:
Publishing is intentionally disabled by default via the repository variable `ENABLE_PUBLIC_RELEASE=false`.
```shell
# Install dependencies
pnpm install
pnpm build

# Install Python dependencies
uv sync

# Optional: install a global CLI command from this repo
uv tool install --from . longshot

# Configure environment
cp .env.example .env
# Edit .env with your LLM credentials and target repo

# Run
uv run longshot "Build a REST API according to SPEC.md"

# With the Rich TUI dashboard
uv run longshot "Build a REST API according to SPEC.md" --dashboard

# Reset target repo to initial commit before running
uv run longshot "Build a REST API according to SPEC.md" --reset

# Debug logging
uv run longshot "Build a REST API according to SPEC.md" --debug

# If installed globally with uv tool, run without `uv run`
longshot "Build a REST API according to SPEC.md"
```

All configuration is via environment variables in `.env`.
| Variable | Description |
|---|---|
| `GIT_REPO_URL` | URL of the target repository workers will clone and commit to |
| `LLM_BASE_URL` | Base URL of your LLM API endpoint (OpenAI-compatible) |
| `LLM_API_KEY` | API key for the LLM endpoint |
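For illustration, a minimal `.env` covering the required variables might look like this (all values are placeholders):

```shell
# Placeholder values; substitute your own repo and credentials
GIT_REPO_URL=https://github.com/your-org/target-repo.git
LLM_BASE_URL=https://api.example.com/v1
LLM_API_KEY=sk-...
```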
For multiple LLM endpoints with load balancing, use `LLM_ENDPOINTS` instead of `LLM_BASE_URL`/`LLM_API_KEY`:

```shell
LLM_ENDPOINTS=[{"name":"primary","endpoint":"https://...","apiKey":"sk-...","weight":100}]
```

Each `weight` must be a positive number. `LLM_MODEL`, `LLM_MAX_TOKENS`, and `LLM_TEMPERATURE` remain global settings shared by all endpoints.
Worker sandboxes honor these weights when choosing which endpoint to call. Planner Pi sessions bind to one selected endpoint per session because the Pi harness can only talk to a single provider at a time.
| Variable | Default | Description |
|---|---|---|
| `MAX_WORKERS` | `50` | Maximum number of parallel workers |
| `WORKER_TIMEOUT` | `1800` | Worker timeout in seconds |
| Variable | Default | Description |
|---|---|---|
| `LLM_MODEL` | `glm-5` | Model name to pass to the API |
| `LLM_MAX_TOKENS` | `65536` | Max tokens per LLM request |
| `LLM_TEMPERATURE` | `0.7` | Sampling temperature |
| Variable | Default | Description |
|---|---|---|
| `SANDBOX_CPU_CORES` | `4` | CPU cores per Modal sandbox |
| `SANDBOX_MEMORY_MB` | `8192` | Memory per Modal sandbox (MB) |
| `SANDBOX_IDLE_TIMEOUT` | `300` | Sandbox idle timeout in seconds |
| Variable | Default | Description |
|---|---|---|
| `GIT_MAIN_BRANCH` | `main` | Main branch name in the target repo |
| `GIT_BRANCH_PREFIX` | `worker/` | Prefix for worker branches |
| Variable | Default | Description |
|---|---|---|
| `HEALTH_CHECK_INTERVAL` | `10` | Reconciler health check interval in seconds |
| `MERGE_STRATEGY` | `rebase` | Merge strategy: `fast-forward`, `rebase`, or `merge-commit` |
| `FINALIZATION_ENABLED` | `true` | Run build/test sweep after all tasks complete |
| `FINALIZATION_MAX_ATTEMPTS` | `3` | Max reconciler fix attempts during finalization |
- Fork the repo
- Create a branch (`git checkout -b feat/your-feature`)
- Commit your changes
- Open a pull request against `main`
See LICENSE.