Self-evolving multi-agent framework for biomedical research — uniting computational reasoning with physical experimentation through multimodal perception, self-evolving agents, and XR-enabled human-AI collaboration.
Le Cong1,2*, Zaixi Zhang3*, Xiaotong Wang1,2*, Yin Di1,2*, Ruofan Jin3, Michal Gerasimiuk1,2, Yinkai Wang1,2,
Ravi K. Dinesh1,2, David Smerkous4, Alex Smerkous5, Xuekun Wu2,6, Shilong Liu3, Peishan Li1,2,
Yi Zhu1,2, Simran Serrao1,2, Ning Zhao1,2, Imran A. Mohammad2,7,
John B. Sunwoo2,7, Joseph C. Wu2,6, Mengdi Wang3†
1Stanford School of Medicine · 2Stanford University · 3Princeton University · 4Oregon State University · 5University of Washington · 6Stanford Cardiovascular Institute · 7Stanford Cancer Institute
*Equal Contribution †Corresponding Author
- Overview
- Key Results
- System Architecture
- Quick Start
- CLI Reference
- Tool Library (98 Tools)
- Package Structure
- API Keys
- Optional Dependencies
- Citation
- Related Projects
- License
Modern science advances fastest when thought meets action. LabOS is the first AI co-scientist that unites computational reasoning with physical experimentation. It connects multi-model AI agents, smart glasses, and robots so that AI can perceive scientific work, understand context, and assist experiments in real time.
This repository provides the Dry-Lab computational core — a pip-installable Python package with a self-evolving multi-agent system purpose-built for biomedical research. Four specialised agents collaborate through a shared Tool Ocean of 98 biomedical tools, continuously expanding their capabilities at runtime.
| Component | Description |
|---|---|
| Manager Agent | Decomposes scientific objectives, orchestrates sub-agents, plans multi-step workflows |
| Researcher Agent | Executes bioinformatics analyses, runs code, queries databases and literature |
| Critic Agent | Evaluates result quality, identifies gaps, recommends improvements |
| Toolmaker Agent | Autonomously creates new tools when existing ones are insufficient |
Built on the STELLA framework, LabOS uses Gemini 3 (google/gemini-3 via OpenRouter) for all agents with a unified, single-model architecture.
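The division of labour in the table above can be pictured as a plan, execute, review loop. The sketch below is purely illustrative: the function names (`plan`, `research`, `critique`, `run_objective`) and the stubbed logic are assumptions made for this example, not the LabOS API, and no LLM is called.

```python
from dataclasses import dataclass


@dataclass
class Review:
    accepted: bool
    feedback: str


def plan(objective: str) -> list[str]:
    """Manager role (stub): split an objective into sub-tasks on ';'."""
    return [step.strip() for step in objective.split(";") if step.strip()]


def research(step: str, feedback: str = "") -> str:
    """Researcher role (stub): pretend to execute one sub-task."""
    suffix = f" (revised: {feedback})" if feedback else ""
    return f"result for '{step}'{suffix}"


def critique(result: str) -> Review:
    """Critic role (stub): accept only results that have been revised."""
    if "revised" in result:
        return Review(True, "")
    return Review(False, "add supporting evidence")


def run_objective(objective: str, max_rounds: int = 3) -> list[str]:
    """Manager loop: delegate each step and retry until the critic accepts."""
    outputs = []
    for step in plan(objective):
        feedback = ""
        result = ""
        for _ in range(max_rounds):
            result = research(step, feedback)
            review = critique(result)
            if review.accepted:
                break
            feedback = review.feedback
        outputs.append(result)
    return outputs


results = run_objective("query UniProt; summarise literature")
for line in results:
    print(line)
```

In a real system each stub would be backed by a model call and the Toolmaker would be consulted when no existing tool fits a step; the bounded retry (`max_rounds`) is one simple way to keep a critic loop from running indefinitely.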
LabOS reports state-of-the-art results on the following biomedical benchmarks:
| Benchmark | Score | vs. Next Best |
|---|---|---|
| Humanity's Last Exam: Biomedicine | 32% | +8% |
| LAB-Bench: DBQA | 61% | Top |
| LAB-Bench: LitQA | 65% | Top |
| Wet-Lab Error Detection | >90% | vs. ~40% commercial |
Across applications — from cancer immunotherapy target discovery (CEACAM6) to stem cell engineering and cell fusion mechanism investigation (ITSN1) — LabOS demonstrates that AI can move beyond computational design to active participation in the laboratory.
LabOS: Dry-Lab Computational Core
```
╔══════════════════════════════════════════════════════════════════╗
║                          Manager Agent                           ║
║             (orchestration · planning · delegation)              ║
╠════════════════════╦════════════════════╦════════════════════════╣
║     Researcher     ║       Critic       ║       Toolmaker        ║
║   bioinformatics   ║     evaluation     ║     self-evolution     ║
║   code execution   ║   quality check    ║     tool creation      ║
╠════════════════════╩════════════════════╩════════════════════════╣
║                      Tool Ocean (98 tools)                       ║
║   search · database · sequence · screening · web · devenv · …    ║
╠══════════════════════════════════════════════════════════════════╣
║                       3-Tier Memory System                       ║
║   knowledge templates · collaboration workspace · session ctx    ║
╚══════════════════════════════════════════════════════════════════╝
```
Self-Evolution Loop: When existing tools are insufficient, the Toolmaker Agent autonomously identifies resources from the web and literature, generates new Python tools, validates them, and registers them into the shared Tool Ocean — all at runtime.
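A minimal sketch of the generate, validate, register cycle described above. Everything here is hypothetical: `ToolRegistry`, `register_from_source`, and the `gc_content` tool are names invented for illustration, and the "generated" source is a hard-coded string standing in for model output. Real generated code should run in a sandbox rather than through a bare `exec`.

```python
from typing import Callable


class ToolRegistry:
    """Hypothetical in-memory stand-in for the shared Tool Ocean."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable] = {}

    def register_from_source(
        self, name: str, source: str, check: Callable[[Callable], bool]
    ) -> bool:
        """Compile `source`, validate the tool with `check`, register on success."""
        namespace: dict = {}
        try:
            exec(source, namespace)  # unsafe for untrusted code; sketch only
            tool = namespace[name]
            if not check(tool):
                return False
        except Exception:
            # Syntax errors, missing names, or failing checks all reject the tool.
            return False
        self._tools[name] = tool
        return True

    def call(self, name: str, *args, **kwargs):
        return self._tools[name](*args, **kwargs)

    def __contains__(self, name: str) -> bool:
        return name in self._tools


# Stand-in for code a Toolmaker-style agent might produce at runtime.
GENERATED_SOURCE = '''
def gc_content(seq):
    """Fraction of G/C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0
'''

registry = ToolRegistry()
ok = registry.register_from_source(
    "gc_content",
    GENERATED_SOURCE,
    check=lambda tool: tool("GGCC") == 1.0 and tool("ATAT") == 0.0,
)
rejected = registry.register_from_source("broken", "def broken(:", check=lambda t: True)
print(ok, rejected, registry.call("gc_content", "ATGC"))
```

The validation step is what keeps a faulty tool out of the shared registry: a tool that fails to compile or fails its check is never visible to the other agents.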
```bash
pip install labos
```

Or install from source:

```bash
git clone https://github.com/labos-ai/labos.git
cd labos
pip install -e .
```

LabOS requires an OpenRouter API key to power all agents.

```bash
export OPENROUTER_API_KEY=your-key-here
```

Or use a `.env` file:

```bash
cp .env.example .env
# Edit .env and add: OPENROUTER_API_KEY=your-key-here
```

Interactive CLI:

```bash
labos run
```

Web UI (Gradio):

```bash
labos web
```

Programmatic:
```python
import labos
agent = labos.initialize()
result = labos.run_task("Find recent papers on CRISPR-Cas9 off-target effects")
print(result)
```

```bash
labos run                # Interactive chat
labos web                # Launch Gradio web UI
labos web --port 8080    # Custom port
labos web --share        # Public Gradio link
labos --version          # Show version
labos --help             # Show help
```
All Flags
| Flag | Description |
|---|---|
| `--use-template` | Enable knowledge-base templates (default: on) |
| `--no-template` | Disable templates |
| `--use-mem0` | Enable Mem0 enhanced memory |
| `--model MODEL` | OpenRouter model ID (default: google/gemini-3) |
| `--port PORT` | Web UI port (default: 7860) |
| `--share` | Create public Gradio link |
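For readers scripting around the CLI, the flags in the table could be modelled with `argparse` roughly as follows. This is a sketch under stated assumptions, not the contents of `labos/cli.py`: the flat parser layout and the `build_parser` name are hypothetical, and only the flag names and defaults come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="labos")
    parser.add_argument("command", choices=["run", "web"])
    # --use-template / --no-template toggle one boolean that is on by default.
    parser.add_argument("--use-template", dest="use_template",
                        action="store_true", default=True)
    parser.add_argument("--no-template", dest="use_template",
                        action="store_false")
    parser.add_argument("--use-mem0", action="store_true")
    parser.add_argument("--model", default="google/gemini-3")
    parser.add_argument("--port", type=int, default=7860)
    parser.add_argument("--share", action="store_true")
    return parser


args = build_parser().parse_args(["web", "--port", "8080", "--no-template"])
defaults = build_parser().parse_args(["run"])
print(args.command, args.port, args.use_template)
print(defaults.model, defaults.port, defaults.use_template)
```

Sharing one `dest` between the paired flags means downstream code reads a single `use_template` value instead of reconciling two options.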
LabOS ships with 98 ready-to-use biomedical tools across 7 categories:
| Category | Count | Coverage | Examples |
|---|---|---|---|
| Database | 30 | UniProt, KEGG, PDB, AlphaFold, BLAST, Ensembl, gnomAD, ClinVar, GWAS, OpenTargets, STRING, Reactome, and 18 more | query_uniprot, blast_sequence |
| Screening | 24 | Virtual screening, drug-gene networks, pathway search, survival analysis, disease-gene associations, biomarker discovery | kegg_pathway_search, drug_gene_network_search |
| Sequence | 19 | Protein structure prediction (Boltz2), enzyme kinetics (CataPro), mutation scoring (ESM, FoldX, Rosetta), phylogenetics (IQ-TREE), protein redesign (LigandMPNN, Chroma) | run_boltz_protein_structure_prediction |
| Search | 10 | Google, SerpAPI, arXiv, PubMed, Google Scholar, GitHub code & repos | multi_source_search, query_pubmed |
| DevEnv | 10 | Shell commands, conda/pip, GPU status, script creation and execution, training log monitoring | run_shell_command, create_and_run_script |
| Web | 4 | URL content extraction, PDF parsing, DOI supplementary lookup | extract_url_content, extract_pdf_content |
| Biosecurity | 1 | Sensitive data sanitisation with configurable strictness levels | sanitize_bio_dataset |
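As a quick consistency check, the per-category counts in the table add up to the advertised total. The dictionary below is transcribed by hand from the table; nothing is read from the package itself.

```python
# Counts copied from the table above (category -> number of tools).
TOOL_COUNTS = {
    "Database": 30,
    "Screening": 24,
    "Sequence": 19,
    "Search": 10,
    "DevEnv": 10,
    "Web": 4,
    "Biosecurity": 1,
}

total = sum(TOOL_COUNTS.values())
largest = max(TOOL_COUNTS, key=TOOL_COUNTS.get)
print(f"{total} tools across {len(TOOL_COUNTS)} categories; largest: {largest}")
```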
```
labos/
├── __init__.py  # Version, public API
├── core.py  # Multi-agent orchestration engine
├── memory.py  # 3-tier memory system
├── knowledge.py  # TF-IDF + Mem0 knowledge base
├── ui.py  # Gradio web interface
├── cli.py  # CLI entry point
├── prompts/  # Agent prompt templates (YAML)
│   ├── manager.yaml
│   ├── researcher.yaml
│   ├── critic.yaml
│   └── toolmaker.yaml
└── tools/  # Biomedical tool library (98 tools)
    ├── __init__.py  # Tool registry
    ├── llm.py  # LLM helper for tool-internal calls
    ├── search.py  # Web + academic search (10 tools)
    ├── web.py  # URL / PDF content extraction (4 tools)
    ├── database.py  # Biomedical database queries (30 tools)
    ├── sequence.py  # Protein / enzyme analysis (19 tools)
    ├── screening.py  # Virtual screening & discovery (24 tools)
    ├── biosecurity.py  # Biosafety data sanitisation (1 tool)
    └── devenv.py  # Shell, conda, pip, scripts (10 tools)
```
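The tree lists `knowledge.py` as a TF-IDF knowledge base. For readers unfamiliar with the technique, here is a generic, standard-library sketch of TF-IDF retrieval with cosine similarity. It is not taken from LabOS: the function names, the smoothed IDF formula, and the example templates are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def tfidf_vectors(docs: list[str]) -> tuple[list[dict[str, float]], dict[str, float]]:
    """Return one sparse TF-IDF vector per document plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def best_match(query: str, docs: list[str]) -> int:
    """Index of the document most similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(tokenize(query))
    # Terms unseen in the corpus carry no weight.
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    return max(range(len(docs)), key=scores.__getitem__)


# Hypothetical knowledge templates, invented for this example.
templates = [
    "blast a protein sequence against uniprot",
    "search pubmed for recent papers",
    "run survival analysis on patient cohort",
]
index = best_match("find recent pubmed papers", templates)
print(templates[index])
```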
| Key | Required | Purpose |
|---|---|---|
| `OPENROUTER_API_KEY` | Yes | Powers all LLM agents via OpenRouter |
| `SERPAPI_API_KEY` | No | Enhanced web search results |
| `MEM0_API_KEY` | No | Mem0 platform for enhanced memory |
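Before launching a long session it can help to confirm which keys are visible to the process. The sketch below uses only the standard library; the `check_keys` helper is hypothetical and not part of LabOS, and only the variable names and their required or optional status come from the table above.

```python
import os

# Key name -> whether the table marks it as required.
API_KEYS = {
    "OPENROUTER_API_KEY": True,
    "SERPAPI_API_KEY": False,
    "MEM0_API_KEY": False,
}


def check_keys(env=None) -> dict[str, str]:
    """Report each key as 'set', 'missing (required)' or 'missing (optional)'."""
    env = os.environ if env is None else env
    status = {}
    for name, required in API_KEYS.items():
        if env.get(name):
            status[name] = "set"
        elif required:
            status[name] = "missing (required)"
        else:
            status[name] = "missing (optional)"
    return status


# A plain dict stands in for the environment so the example is reproducible.
report = check_keys({"OPENROUTER_API_KEY": "your-key-here"})
for name, state in report.items():
    print(f"{name}: {state}")
```

Calling `check_keys()` with no argument inspects the real environment instead of the example dictionary.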
```bash
# Quotes keep shells such as zsh from treating the brackets as a glob pattern.
pip install "labos[mem0]"        # Mem0 enhanced memory
pip install "labos[screening]"   # RDKit for virtual screening
pip install "labos[all]"         # Everything
```

If you find LabOS useful for your research, please cite:
```bibtex
@article{cong2025labos,
  title = {LabOS: The AI-XR Co-Scientist That Sees and Works With Humans},
  author = {Cong, Le and Zhang, Zaixi and Wang, Xiaotong and Di, Yin and Jin, Ruofan and Gerasimiuk, Michal and Wang, Yinkai and Dinesh, Ravi K. and Smerkous, David and Smerkous, Alex and Wu, Xuekun and Liu, Shilong and Li, Peishan and Zhu, Yi and Serrao, Simran and Zhao, Ning and Mohammad, Imran A. and Sunwoo, John B. and Wu, Joseph C. and Wang, Mengdi},
  journal = {arXiv preprint arXiv:2510.14861},
  year = {2025}
}
```

- STELLA: The foundational self-evolving agent framework that LabOS builds upon
- LabSuperVision: The first VLM benchmark spanning biomedical and materials science laboratories
- ai4labos.com: Official LabOS project website
Apache 2.0 — see LICENSE.