Self-Evolving · Human-on-the-Loop · End-to-End

EvoScientist

v0.0.3 · Apache 2.0 · Python 3.11+
$ uv tool install EvoScientist
Latest

News & Updates

13 Mar 2026 🚀 EvoScientist officially debuts!
11 Mar 2026 Technical Report is live! Check it out 👈
06 Mar 2026 🥇 Ranked #1 on DeepResearch Bench II at submission time! Leaderboard 👈
24 Nov 2025 🏆 6/6 accepted at ICAIS 2025 AI Scientist Track — Best Paper & AI Reviewer's Appraisal Award! Details 👈
Recognition

Awards & Benchmarks

Best Paper Award
ICAIS 2025 · AI Scientist Track
6 out of 6 submitted papers accepted. Best Paper & AI Reviewer's Appraisal Award. Details →
#1 DeepResearch Bench II
AgentResearchLab · Mar 2026
Ranked first on the DeepResearch Bench II leaderboard at submission time. Leaderboard →
AI-Generated Best Paper
Recognized for AI-authored research demonstrating end-to-end scientific quality. Details →
Quick Start

Up and running
in just seconds

One wizard configures everything — LLM provider, API keys, model selection, and workspace mode. Supports OAuth sign-in for Claude Code and Codex CLI users.

  • Install via uv tool install EvoScientist
  • Run EvoSci onboard and follow the prompts
  • Choose your LLM provider and enter your API key
  • Pick a model and workspace mode — done
System Design

Agent pipeline in action

From user input to experimental output — every step orchestrated, every result verified.

User (CLI / API) → Main Agent → planner-agent, research-agent, code-agent, debug-agent, data-analysis-agent, writing-agent → Results
Specialized Agents

Purpose-built for every phase
of the scientific process

From hypothesis to publication — each agent handles a dedicated stage of the experiment workflow.

Multi-Agent Team

6 specialized sub-agents — plan, research, code, debug, analyze, write — working in concert under a shared LangGraph state machine.
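
As a rough illustration of six stages sharing one state object, here is a minimal sketch in plain Python. It does not use LangGraph and is not EvoScientist's actual code: `ResearchState`, `make_stage`, and `run_pipeline` are hypothetical names, and each sub-agent is a placeholder that only records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable

# The six stage names come from the description above; every other
# name in this sketch is hypothetical.
STAGES = ["plan", "research", "code", "debug", "analyze", "write"]


@dataclass
class ResearchState:
    """Shared state handed from one sub-agent to the next."""
    task: str
    notes: dict[str, str] = field(default_factory=dict)
    history: list[str] = field(default_factory=list)


def make_stage(name: str) -> Callable[[ResearchState], ResearchState]:
    """Build a placeholder sub-agent that records its contribution."""
    def stage(state: ResearchState) -> ResearchState:
        state.notes[name] = f"{name} output for: {state.task}"
        state.history.append(name)
        return state
    return stage


def run_pipeline(task: str) -> ResearchState:
    """Run every stage in order over one shared state object."""
    state = ResearchState(task=task)
    for name in STAGES:
        state = make_stage(name)(state)
    return state


final_state = run_pipeline("compare two optimizers on a small benchmark")
print(final_state.history)
```

A real state machine would also allow branching, for example looping from debug back to code; this sketch runs the stages strictly in sequence.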

Persistent Memory

Context, preferences, and experimental findings survive across sessions. The system internalizes scholarly taste and builds on prior work.

Literature Research

Deep web search with structured 7-dimension reflection. Finds papers, methods, and baselines with enforced citation rigor.

Code Generation & Debug

Write, execute, and iteratively debug experiment code in a sandboxed workspace with 300s timeout, output limits, and auto-recovery.
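
The execute-and-retry loop described above can be sketched with the Python standard library. This is an assumption-laden illustration, not the project's implementation: only the 300s timeout comes from the text, while `MAX_OUTPUT_CHARS`, `run_experiment`, and `run_with_recovery` are hypothetical. A temporary directory is also not a real sandbox; it only keeps each run's files separate.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

TIMEOUT_SECONDS = 300      # timeout stated in the text above
MAX_OUTPUT_CHARS = 10_000  # assumed limit; the real value is not given


def run_experiment(code: str, timeout: float = TIMEOUT_SECONDS) -> dict:
    """Run `code` in a fresh temporary directory and report the outcome."""
    with tempfile.TemporaryDirectory() as workspace:
        script = Path(workspace) / "experiment.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workspace,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {
        "ok": proc.returncode == 0,
        # Truncate captured output so a runaway script cannot flood the caller.
        "stdout": proc.stdout[:MAX_OUTPUT_CHARS],
        "stderr": proc.stderr[:MAX_OUTPUT_CHARS],
    }


def run_with_recovery(attempts: list[str]) -> dict:
    """Try successive versions of the code until one exits cleanly."""
    result = {"ok": False, "stdout": "", "stderr": "no attempts"}
    for code in attempts:
        result = run_experiment(code)
        if result["ok"]:
            break
    return result


# The first attempt raises ZeroDivisionError; the second is the "fixed" version.
result = run_with_recovery(["print(1 / 0)", "print(1 / 1)"])
print(result["ok"], result["stdout"].strip())
```

In an agent setting, the list of attempts would be produced one at a time by feeding the previous `stderr` back to the model; a fixed list is used here to keep the sketch self-contained.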

Scientific Workflow

6-phase process: Intake → Plan → Execute → Evaluate → Write → Verify. Baseline-first design with one-variable iteration for scientific rigor.
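
One way to picture "baseline-first with one-variable iteration" is a helper that only proposes configurations differing from the baseline in a single setting. The sketch below is hypothetical: the phase names are taken from the text, but `Phase`, `one_variable_variants`, and the example hyperparameters are illustrative assumptions.

```python
from enum import Enum


class Phase(Enum):
    """The six workflow phases named above, in order."""
    INTAKE = 1
    PLAN = 2
    EXECUTE = 3
    EVALUATE = 4
    WRITE = 5
    VERIFY = 6


def one_variable_variants(baseline: dict, candidates: dict) -> list[dict]:
    """Return configs that each differ from the baseline in exactly one key."""
    variants = []
    for key, values in candidates.items():
        for value in values:
            if value != baseline[key]:
                # Copy the baseline and override a single setting.
                variants.append({**baseline, key: value})
    return variants


# Hypothetical experiment settings, used only for illustration.
baseline = {"lr": 0.01, "batch_size": 32}
candidates = {"lr": [0.01, 0.1], "batch_size": [32, 64]}
variants = one_variable_variants(baseline, candidates)

print([phase.name for phase in Phase])
print(variants)
```

Because every variant changes one setting only, any difference from the baseline result can be attributed to that setting.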

MCP & EvoSkills

Plug in MCP servers or install research-lifecycle skills from GitHub. Compatible with other AI coding agents out of the box.

Multi-Provider

Your models, your choice

9 LLM providers supported. One config to switch. Auto-detect model names or specify full IDs directly.

  • Anthropic: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5
  • OpenAI: gpt-4o, o3-mini, o1
  • Google: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
  • NVIDIA: deepseek-r1, llama-3.3-70b, nemotron-ultra
  • SiliconFlow: deepseek-v3, qwen-plus, glm-4
  • OpenRouter: any model, unified API, auto-routing
  • Volcengine: doubao-pro, doubao-lite
  • DashScope: qwen-max, qwen-turbo
  • Ollama & Custom (local / self-hosted): ollama: prefix, custom base_url
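
To illustrate how auto-detecting a provider from a model name might work, here is a minimal sketch. It is an assumption, not EvoScientist's resolver: `KNOWN_PREFIXES` and `resolve_provider` are hypothetical, and `ollama:llama3` is an illustrative model name. Only name prefixes that appear under a single provider in the list above are mapped, since families such as deepseek and qwen are listed under more than one provider.

```python
# Hypothetical lookup table built from model names that appear
# under exactly one provider in the list above.
KNOWN_PREFIXES = {
    "claude-": "anthropic",
    "gpt-": "openai",
    "gemini-": "google",
    "doubao-": "volcengine",
}


def resolve_provider(model: str) -> tuple[str, str]:
    """Map a model string to a (provider, model_id) pair."""
    # An explicit "ollama:" prefix selects a local model.
    if model.startswith("ollama:"):
        return "ollama", model.removeprefix("ollama:")
    for prefix, provider in KNOWN_PREFIXES.items():
        if model.startswith(prefix):
            return provider, model
    raise ValueError(f"cannot infer a provider for {model!r}; name it explicitly")


print(resolve_provider("claude-sonnet-4-6"))
print(resolve_provider("ollama:llama3"))
```

Ambiguous names raise an error instead of guessing, which is one reason a tool might also accept full IDs.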
Multi-Channel

One agent, every surface

CLI as the hub. 10 messaging integrations — one agent session, any device.

💬 iMessage
✈️ Telegram
🎮 Discord
💼 Slack
💚 WeChat
🔔 DingTalk
🪶 Feishu
📧 Email
🐧 QQ
🔒 Signal
Official Skill Repository

EvoSkills

10 research-lifecycle skills covering the full pipeline from ideation to publication. Install all with a single command. Also compatible with Claude Code, Cursor, and other AI coding agents.

research-ideation idea-tournament experiment-pipeline experiment-craft paper-planning paper-writing paper-review paper-rebuttal academic-slides evo-memory
EvoSkills Research Pipeline Framework

Stay tuned.
Big things are coming.

Benchmarks · Web Interface · More agents · EvoSkills v2