deep-loop

Two-tier autonomous research: experiments + meta-analysis

Built on Karpathy's autoresearch and ShinMegamiBoson's OpenPlanter. deep-loop combines both into a single system where one agent runs experiments and another periodically steps back to find what's working and generate smarter hypotheses.

What It Does

deep-loop has two modes:

Mode 1: ML Experiments (autoresearch + meta-analysis)

A Claude Code agent modifies train.py, runs 5-minute training jobs on a GPU, logs the results, and iterates. Every 12 experiments (or after a significant improvement), OpenPlanter analyzes all results so far to identify patterns and propose smarter hypotheses.
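The trigger rule described above could be sketched roughly as follows. This is an illustration only, not the code in orchestrate.py: the function name, the `min_improvement` threshold, and the assumption that a lower val_bpb is better are all hypothetical, since the README does not define what counts as a "significant improvement".

```python
from typing import Sequence

META_EVERY_N = 12  # from the README: meta-analysis every 12 experiments


def should_run_meta_analysis(
    val_bpb_history: Sequence[float],
    runs_since_last_meta: int,
    min_improvement: float = 0.01,  # hypothetical threshold for "significant"
) -> bool:
    """Return True when the outer tier should analyze the results so far."""
    # Scheduled trigger: enough experiments have accumulated.
    if runs_since_last_meta >= META_EVERY_N:
        return True
    # An improvement needs at least one earlier run to compare against.
    if len(val_bpb_history) < 2:
        return False
    # Assumes lower val_bpb is better: compare the newest run
    # against the best of all earlier runs.
    previous_best = min(val_bpb_history[:-1])
    return previous_best - val_bpb_history[-1] >= min_improvement
```

With these assumptions, a small gain partway through a batch does not interrupt the inner loop, while a large one brings the meta-analysis forward.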

Mode 2: Domain Research (web search + synthesis)

A Claude Code agent picks questions from program.md, searches the web, synthesizes findings into report.md, and records entries in knowledge_index.tsv. Every 5 entries, the meta-analysis tier evaluates research quality and rewrites the strategy.

Both modes share the same core loop: experiment → record → meta-analyze → adapt → repeat.
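As a rough sketch of that shared loop, the two tiers can be modeled as plain callables. The names `run_loop`, `experiment`, and `meta_analyze` are hypothetical stand-ins for the agent and the meta-analysis step, not functions from this repository.

```python
from typing import Callable, List


def run_loop(
    experiment: Callable[[str], float],
    meta_analyze: Callable[[List[float]], str],
    strategy: str,
    total_steps: int,
    meta_every: int,
) -> List[str]:
    """Run the loop and return every strategy that was in effect, in order."""
    results: List[float] = []
    strategies = [strategy]
    for step in range(1, total_steps + 1):
        # Inner tier: run one experiment under the current strategy and record it.
        results.append(experiment(strategy))
        if step % meta_every == 0:
            # Outer tier: step back, look at all results, and adapt the strategy.
            strategy = meta_analyze(results)
            strategies.append(strategy)
    return strategies
```

Calling `run_loop` with `total_steps=10` and `meta_every=5` would invoke the outer tier twice, once after the fifth result and once after the tenth.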

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         orchestrate.py                          │
│                    (launches and monitors)                      │
└─────────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Claude Code Agent                           │
│                                                                 │
│   reads: program.md (research direction / hypotheses)           │
│   edits: train.py (ML mode) or report.md (research mode)        │
│   runs:  uv run train.py (ML) or web search (research)          │
│   logs:  results.tsv (ML) or knowledge_index.tsv (research)     │
│                                                                 │
│   At regular intervals:                                         │
│   ───────────────────────────────────────────────────────────   │
│                               │                                 │
│                               ▼                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │               meta_analyze.py                           │   │
│   │                                                         │   │
│   │   reads: results + git log + current state              │   │
│   │   calls: OpenPlanter or Anthropic API                   │   │
│   │   writes: next-hypotheses.md / rewrites program.md      │   │
│   └─────────────────────────────────────────────────────────┘   │
│                               │                                 │
│                               ▼                                 │
│   Agent adapts strategy based on meta-analysis output           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Quick Start

ML Mode (GPU required)

# Clone and setup
git clone https://github.com/kilroycreative/deep-loop && cd deep-loop
./setup.sh

# Run in tmux
tmux new-session -s deep-loop
python orchestrate.py --tag exp1

# Monitor from another terminal
python orchestrate.py --status

Domain Research Mode

1. Edit program.md and replace YOUR_TOPIC_HERE with your research topic.
2. Run /loop in Claude Code (reads .claude/commands/loop.md).
3. The agent searches, synthesizes, and builds report.md autonomously.
4. Every 5 entries, meta-analysis evaluates and adapts the strategy.
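The "every 5 entries" check could be approximated by counting rows in knowledge_index.tsv. The sketch below is only an illustration: it assumes the file has a single header row and one entry per line, which the README does not confirm, and the function names are hypothetical.

```python
import csv
import io

RESEARCH_META_EVERY = 5  # from the README: meta-analysis every 5 entries


def count_entries(tsv_text: str) -> int:
    """Count data rows in a TSV, assuming the first row is a header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    rows = [row for row in reader if row]  # skip blank lines
    return max(len(rows) - 1, 0)


def meta_analysis_due(entry_count: int, every: int = RESEARCH_META_EVERY) -> bool:
    """True when the entry count has just reached a multiple of `every`."""
    return entry_count > 0 and entry_count % every == 0
```

In practice the text would come from reading knowledge_index.tsv from disk; it is passed in as a string here so the sketch stays self-contained.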

Files

| File | Role | Modify? |
| --- | --- | --- |
| `program.md` | Research direction + strategy (mutable by meta-agent) | Set topic, then let meta-agent evolve |
| `CLAUDE.md` | Research constitution (invariants) | Rarely |
| `train.py` | Model + training loop (ML mode) | Agent only |
| `prepare.py` | Data pipeline + eval (from autoresearch) | NEVER |
| `orchestrate.py` | Main entry point, launches agent | No |
| `meta_analyze.py` | Meta-analysis integration | No |
| `notify.py` | Sends events on breakthroughs | No |
| `report.md` | Research output (research mode) | Agent only |
| `knowledge_index.tsv` | Research audit trail | Agent only |
| `process_log.md` | Meta-analysis methodology log | meta_analyze.py only |
| `results.tsv` | Experiment log (ML mode) | Agent only |
| `openplanter/` | OpenPlanter agent source | No |

Cost Estimate

For an overnight ML run (~12 hours, 100+ experiments):

| Component | Cost |
| --- | --- |
| H100 compute (~12h @ $2/hr) | ~$24 |
| Claude Opus agent (experiments) | ~$20 |
| Claude Sonnet meta-analysis (~8 runs) | ~$0.50 |
| Total | ~$45 |

Domain research mode costs vary by topic breadth — roughly $5–15 of API usage per 20-entry research session.
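The ML-mode figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this README (12 hours, $2/hr, ~$20 agent cost, ~$0.50 meta-analysis cost, 5-minute jobs, meta-analysis every 12 experiments); the function name and dictionary keys are hypothetical, and actual prices will vary.

```python
def overnight_estimate(
    hours: float = 12.0,
    gpu_rate_usd: float = 2.0,
    agent_cost_usd: float = 20.0,
    meta_cost_usd: float = 0.5,
    experiments: int = 100,
    run_minutes: int = 5,
    meta_every: int = 12,
) -> dict:
    """Recompute the overnight cost table from its quoted inputs."""
    gpu_usd = hours * gpu_rate_usd
    return {
        "gpu_usd": gpu_usd,
        "total_usd": gpu_usd + agent_cost_usd + meta_cost_usd,
        # Upper bound: back-to-back training jobs with no agent overhead.
        "max_experiments": int(hours * 60 // run_minutes),
        # Scheduled meta-analysis runs for the given experiment count.
        "meta_runs": experiments // meta_every,
    }
```

With the defaults this gives $24 of GPU time, a total of $44.50 (rounded to ~$45 in the table), at most 144 five-minute jobs in 12 hours, and 8 scheduled meta-analysis runs for 100 experiments.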

Notifications

When breakthroughs occur, notify.py pings OpenClaw:

# ML mode — val_bpb threshold crossed
python notify.py --event breakthrough --val 0.9891

# Research mode — significant insight
python notify.py --event breakthrough --val "insight: discovered dominant standard"
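If the notification is triggered from Python rather than a shell, one option is to build the argument list explicitly so that values containing spaces need no quoting. The helper below is a hypothetical sketch: it uses only the `--event` and `--val` flags shown in the commands above and does not reflect how orchestrate.py actually calls notify.py.

```python
import sys
from typing import List, Union


def build_notify_command(
    value: Union[float, str],
    event: str = "breakthrough",
) -> List[str]:
    """Build the argv for notify.py using only the flags shown in this README."""
    # Each argument is its own list element, so an insight string with
    # spaces reaches notify.py as a single --val argument.
    return [sys.executable, "notify.py", "--event", event, "--val", str(value)]
```

The resulting list could then be passed to `subprocess.run(cmd, check=True)` from the repository root.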

Commands

# ML: Start experiment loop
python orchestrate.py --tag <name>

# ML: Check progress
python orchestrate.py --status

# ML: Run meta-analysis manually
python orchestrate.py --meta-only

# Research: Start autonomous research loop
# (run /loop in Claude Code)

Attribution

autoresearch by Andrej Karpathy — the autonomous ML experiment loop that deep-loop's inner tier is built on
OpenPlanter by ShinMegamiBoson — the meta-analysis agent that powers deep-loop's outer tier

License

MIT
