
中文版 (Chinese version)

LoongFlow: A Thinking & Learning Framework for Expert-Grade AI Agents.

Set Creativity Free! LoongFlow turns your expertise into professional AI productivity.

LoongFlow is an open-source expert-grade Agent development framework.

Enable Agents to think and learn through the PES paradigm, and accumulate experience through iteration.


🚀 Quick Start | Examples | General-Evolve | ML-Evolve | Discussions


  • 🚀 General-Evolve (General Evolve Agent): efficient, stable driving of general algorithm design and continuous evolution.
  • 🔥 ML-Evolve (Machine Learning Agent): full-process, autonomous construction and continuous evolutionary breakthroughs.
  • LoongFlow (Universal Agent Framework): a universal Agent framework for expert-grade AI productivity.


LoongFlow is inspired by Wang Yangming's "Enlightenment at Longchang". It is dedicated to breaking the barrier between Knowing and Doing: we enable wisdom to awaken through the unity of knowledge and action, ensuring that every drop of professional expertise is transformed into powerful AI productivity.

✨ Why LoongFlow?


An expert-grade Agent framework that thinks and learns. It empowers Agents to think like scientists, helping developers rapidly transform their professional expertise into expert-level Agents.

[Figure: LoongFlow Framework]

  • Intelligent Thinking: Innovative PES (Planning-Execution-Summary) Paradigm. LoongFlow empowers Agents with structured thinking to tackle long-range complex reasoning challenges. This enables Agents to iterate through high-difficulty tasks with the rigorous mindset of a human scientist.
  • Continuous Learning: Innovative Multi-Structure Fusion Memory. By actively generating model reasoning contexts, LoongFlow allows Agents to continuously synthesize experience during task iterations. This results in a "run-and-improve" mechanism, achieving lightweight learning and evolution without heavy retraining.

We believe that the key to designing an expert-level Agent capable of solving complex problems lies in the Agent’s thinking paradigm. The thinking paradigm determines the complexity of problems an Agent can handle and sets the ceiling for its effectiveness. LoongFlow is built specifically for complex tasks requiring long-range reasoning, helping developers rapidly build Agents with domain-expert performance.

Proven Achievements

| Domain | Achievement | Example |
| --- | --- | --- |
| Mathematical Challenges (Tao’s & AlphaEvolve sets) | Outperformed the best known human results on 11 problems and surpassed AlphaEvolve’s results on 7 problems, achieving the latest SOTA. | Circle Packing |
| MLE-bench (Kaggle Challenges) | Validated across 40 Kaggle competitions, securing 22 Gold Medals. | Stanford-Covid-Vaccine |

LoongFlow vs Traditional Agent Approaches:

| Aspect | Prompt / Tool-Based Agents | OpenEvolve-Style Evolution | LoongFlow |
| --- | --- | --- | --- |
| Core Loop | Generate → Retry | Mutate → Select | Plan → Execute → Summary |
| Reasoning Depth | Shallow | Limited | Long-horizon, structured |
| Learning from Failure | | Partial | ✅ Explicit reflection |
| Experience Reuse | | | ✅ Structured memory |
| Stability | Fragile | Often unstable | Stable convergence |
| Best Use Case | Simple automation | Search-heavy tasks | Expert-level problem solving |

Quick Start


Installation

LoongFlow requires Python 3.12 or higher.

# Install uv or conda, then clone the repository
# uv: https://docs.astral.sh/uv/getting-started/installation/
# Miniforge: https://conda-forge.org/download/
git clone https://github.com/baidu-baige/LoongFlow.git

# Install with uv
cd LoongFlow
uv venv .venv --python 3.12
source .venv/bin/activate
uv pip install -e .

# Install with conda
cd LoongFlow
conda create -n loongflow python=3.12
conda activate loongflow
pip install -e .
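
To confirm the editable install worked, a quick import check can be run from the activated environment; the evolux module paths below are taken from the Advanced Usage examples later in this README, so adjust them if they differ in your version.

# sanity_check.py: confirm the editable install is importable
# (module paths taken from the Advanced Usage examples below)
from evolux.evolve import EvolveAgent
from evolux.react import ReActAgent

print("LoongFlow import OK:", EvolveAgent.__name__, ReActAgent.__name__)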

Run Examples

Run General Evolve Agent

# Configure the LLM: edit task_config.yaml; recommended models are gemini-3-pro-preview or deepseek-r1-250528
# Example: ./agents/general_evolve/examples/packing_circle_in_unit_square/task_config.yaml
# Prefix the model name with a provider as needed; the default provider is openai, e.g. openai/gemini-3-pro-preview
llm_config:
  url: "https://xxxxxx/v1"
  api_key: "******"
  model: "openai/gemini-3-pro-preview"

# Run your first evolve task; the evolution results are written to the ./output directory
uv pip install -r ./agents/general_evolve/examples/packing_circle_in_unit_square/requirements.txt
./run_task.sh packing_circle_in_unit_square --background

# Check task log
tail -f ./agents/general_evolve/examples/packing_circle_in_unit_square/run.log

# Stop task
./run_task.sh stop packing_circle_in_unit_square

Run ML Evolve Agent

# Configure the LLM: edit task_config.yaml; recommended models are gemini-3-pro-preview or deepseek-r1-250528
# Example: ./agents/ml_evolve/examples/ml_example/task_config.yaml
# Prefix the model name with a provider as needed; the default provider is openai, e.g. openai/gemini-3-pro-preview
llm_config:
  url: "https://xxxxxx/v1"
  api_key: "******"
  model: "openai/gemini-3-pro-preview"

# Init ml evolve
./run_ml.sh init

# Run your first evolve task; the evolution results are written to the ./output directory
# ./run_ml.sh run <task_name> [--background] [other Python args]
./run_ml.sh run ml_example --background

# Check task log
tail -f ./agents/ml_evolve/examples/ml_example/agent.log

# Stop task
./run_ml.sh stop ml_example

How LoongFlow Works

LoongFlow is designed around a simple idea:

Expert-level performance emerges not from better mutations, but from better thinking, reflection, and accumulated experience.

To achieve this, LoongFlow organizes agent behavior into a thinking–learning–evolving loop.


From Evolutionary Agents to Thinking Agents


Frameworks such as OpenEvolve and AlphaEvolve demonstrated that agents can improve through iteration, evaluation, and selection.

This marked a clear step beyond static prompting.

However, in real-world expert tasks, purely evolutionary loops often struggle because:

  • Exploration is blind or weakly guided
  • Long-horizon reasoning breaks easily
  • Experience remains task-specific
  • Agents converge prematurely to local optima

The core issue is not evolution itself, but the lack of a structured thinking process.

LoongFlow addresses this by shifting the abstraction:

from evolving outputs to standardizing how agents think, act, and learn.


PES Thinking Paradigm

At the core of LoongFlow is the PES (Plan–Execute–Summary) thinking paradigm, inspired by how human experts conduct research:

Each agent iteration follows the same explicit structure:

Plan
  • Understand the task and constraints
  • Retrieve relevant past experience
  • Design a clear, high-quality execution blueprint

Planning ensures generation is deliberate rather than blind.

Execute
  • Perform structured experimentation
  • Verify intermediate results
  • Avoid low-value or redundant trials

Execution becomes controlled experimentation, not guesswork.

Summary
  • Reflect deeply on successes and failures
  • Extract reusable insights
  • Persist experience into structured memory

Summary prevents agents from repeating the same mistakes.

[Figure: LoongFlow Framework]

PES transforms evolution from a mutation-driven process into a reasoning-guided improvement loop.
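
As a conceptual illustration only, the sketch below shows the shape of one PES iteration; every name in it (Memory, plan, execute, summarize, pes_iteration) is a hypothetical stand-in, not LoongFlow's actual API, which is shown under Advanced Usage.

# Conceptual sketch of one PES iteration. All names here are hypothetical
# stand-ins for illustration, not LoongFlow's actual API.
from dataclasses import dataclass, field

@dataclass
class Memory:
    insights: list = field(default_factory=list)    # reusable lessons from past iterations
    candidates: list = field(default_factory=list)  # evaluated solutions so far

def plan(task, memory):
    """Plan: combine the task with retrieved experience into an execution blueprint."""
    hints = [i for i in memory.insights if i["task"] == task["name"]]
    return {"task": task, "steps": list(task["baseline_steps"]), "hints": hints}

def execute(blueprint, evaluate):
    """Execute: run the blueprint as a controlled experiment and score the outcome."""
    solution = "\n".join(blueprint["steps"])  # stand-in for generated code or a trained model
    return {"solution": solution, "score": evaluate(solution)}

def summarize(blueprint, result, memory):
    """Summary: reflect on the outcome and persist reusable experience."""
    memory.candidates.append(result)
    memory.insights.append({"task": blueprint["task"]["name"],
                            "lesson": f"scored {result['score']:.3f} with {len(blueprint['steps'])} steps"})

def pes_iteration(task, memory, evaluate):
    blueprint = plan(task, memory)         # Plan
    result = execute(blueprint, evaluate)  # Execute
    summarize(blueprint, result, memory)   # Summary
    return result

# Toy usage: three iterations over a dummy task with a dummy scoring function
memory = Memory()
task = {"name": "toy", "baseline_steps": ["step A", "step B"]}
results = [pes_iteration(task, memory, lambda s: len(s) / 100) for _ in range(3)]
print(max(r["score"] for r in results), len(memory.insights))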


Learning & Evolutionary Memory

Thinking alone is not enough. To improve over time, agents must remember, generalize, and escape local optima.

LoongFlow integrates PES with a hybrid evolutionary memory system:

  • Multi-Island + MAP-Elites to preserve diversity
  • Adaptive Boltzmann selection to balance exploration and exploitation
  • Global evolutionary tree memory for long-range experience retrieval

This allows agents to perform jump-style reasoning — leveraging past discoveries to move beyond incremental local search.
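
For the selection step specifically, here is a minimal sketch of temperature-controlled Boltzmann (softmax) sampling over a scored archive; the function name and the fixed temperatures are assumptions for illustration, not LoongFlow's actual implementation.

# Illustrative Boltzmann (softmax) selection over scored candidates; the names
# and the fixed temperatures below are assumptions, not LoongFlow's actual code.
import math
import random

def boltzmann_select(candidates, temperature):
    """Sample one candidate: high temperature explores, low temperature exploits."""
    scores = [c["score"] for c in candidates]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - max_score) / temperature) for s in scores]
    return random.choices(candidates, weights=weights, k=1)[0]

# Toy archive: candidate 1 has the best score
archive = [{"id": i, "score": s} for i, s in enumerate([0.61, 0.73, 0.70, 0.55])]
for temperature in (1.0, 0.1):
    picks = [boltzmann_select(archive, temperature)["id"] for _ in range(1000)]
    print(f"T={temperature}: best candidate chosen {picks.count(1) / 10:.0f}% of the time")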

LoongFlow Examples


Mathematical Challenges (Tao’s & AlphaEvolve sets)

| Problem | Previously best known | AlphaEvolve | LoongFlow Evolve Result | Details |
| --- | --- | --- | --- | --- |
| Circle packing in a square | 2.634 (Higher is Better) | 2.6358627564136983 | 2.6359829624734026 | packing_circle_in_unit_square |
| Circle packing in a rectangle | 2.364 (Higher is Better) | 2.3658321334167627 | 2.365832229500823 | packing_circle_in_rectangle |
| Packing hexagons in hexagons | 3.943 (Lower is Better) | 3.930092 | 3.928906855463712 | packing_hexagons_in_hexagons |
| Max to min ratios | 12.89 (Lower is Better) | 12.88926611203463 | 12.889243547212832 | max_to_min_ratios |
| Minimum Overlap Problem | 0.380927 (Lower is Better) | 0.380924 | 0.3809137564083654 | minimum_overlap_problem |
| An uncertainty inequality | 0.3523 (Lower is Better) | 0.35209910442252773 | 0.352099104421844 | uncertainty_inequality |
| Second autocorrelation inequality | 0.88922 (Higher is Better) | 0.8962799441554083 | 0.9027021077220739 | second_autocorrelation_inequality |
| First autocorrelation inequality | 1.5098 (Lower is Better) | 1.5052939684401607 | 1.509527314861778 | first_autocorrelation_inequality |
| Sums differences problems | 1.059793 (Higher is Better) | 1.1219357374860444 | 1.103534711409646 | sums_and_differences_problems_1 |
| Heilbronn triangles | 0.036 (Higher is Better) | 0.036529889880030156 | 0.0365298898793351 | heilbronn_problem_for_triangles |
| Heilbronn convex regions | 0.0306 (Higher is Better) | 0.030936889034895654 | 0.030900663674639613 | heilbronn_problem_for_convex_regions |

Across 11 challenges in geometry and algebra, LoongFlow outperformed all known best results and surpassed AlphaEvolve on 7 specific problems, achieving the latest SOTA.

MLE-bench (Kaggle Challenges)

| Problem | LoongFlow Evolve Result | Details |
| --- | --- | --- |
| aerial-cactus-identification | 🥇 Gold | aerial-cactus-identification |
| denoising-dirty-documents | 🥇 Gold | denoising-dirty-documents |
| detecting-insults-in-social-commentary | 🥇 Gold | detecting-insults-in-social-commentary |
| dogs-vs-cats-redux-kernels-edition | 🥇 Gold | dogs-vs-cats-redux-kernels-edition |
| histopathologic-cancer-detection | 🥇 Gold | histopathologic-cancer-detection |
| nomad2018-predict-transparent-conductors | 🥇 Gold | nomad2018-predict-transparent-conductors |
| plant-pathology-2020-fgvc7 | 🥇 Gold | plant-pathology-2020-fgvc7 |
| tabular-playground-series-dec-2021 | 🥇 Gold | tabular-playground-series-dec-2021 |
| the-icml-2013-whale-challenge-right-whale-redux | 🥇 Gold | the-icml-2013-whale-challenge-right-whale-redux |
| google-quest-challenge | 🥇 Gold | google-quest-challenge |
| plant-pathology-2021-fgvc8 | 🥇 Gold | plant-pathology-2021-fgvc8 |
| us-patent-phrase-to-phrase-matching | 🥇 Gold | us-patent-phrase-to-phrase-matching |
| predict-volcanic-eruptions-ingv-oe | 🥇 Gold | predict-volcanic-eruptions-ingv-oe |
| stanford-covid-vaccine | 🥇 Gold | stanford-covid-vaccine |

LoongFlow was validated across 40 Kaggle competitions within MLE-bench, securing 22 Gold Medals. The full results will be released once all remaining competitions are complete.

Others

Additionally, validation was conducted on problems such as mathematical puzzles and MoE load-balancing algorithms. Detailed examples can be found in Examples.

🧩 Advanced Usage


EvolveAgent

from evolux.evolve import EvolveAgent

# Config evolve agent
agent = EvolveAgent(
    config=config,
    checkpoint_path=checkpoint_path,
)

# Register workers (implement the Planner, Executor, and Summary interfaces)
agent.register_planner_worker("planner", PlanAgent)
agent.register_executor_worker("executor", ExecuteAgent)
agent.register_summary_worker("summary", SummaryAgent)

# Run agent
result = await agent()

For more details, please refer to EvolveAgent
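
Because the snippet above awaits the agent, it has to run inside an event loop; a minimal entry point, assuming the code is wrapped in an async function named main, might look like this.

# Minimal async entry point for the EvolveAgent snippet above
# (wrapping it in a main() coroutine is an assumption about your project layout)
import asyncio

async def main():
    # ... build the EvolveAgent and register the planner/executor/summary
    # workers as shown above, then:
    # result = await agent()
    pass

if __name__ == "__main__":
    asyncio.run(main())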

ReActAgent

from evolux.react import AgentContext, ReActAgent
from agentsdk.tools import TodoReadTool, TodoWriteTool, Toolkit

# Build agent context
toolkit = Toolkit()
toolkit.register_tool(TodoReadTool())
toolkit.register_tool(TodoWriteTool())

# Build default react agent
agent = ReActAgent.create_default(model=model, sys_prompt=sys_prompt, toolkit=toolkit)

# Run agent
result = await agent(message)

For more details, please refer to ReActAgent

Visualization


Real-time evolution tracking with interactive web interface:

# Launch visualization server
python agents/general_evolve/visualizer/visualizer.py --port 8888 --checkpoint-path output-circle-packing/database/checkpoints

Features:

  • 🌳 Evolution tree with parent-child relationships
  • 📈 Performance tracking across generations
  • 🔍 Code diff viewer showing mutations
  • 📊 Island map for visualizing the distribution of solutions

[Figure: LoongFlow Framework]

FAQ

💰 How much does it cost to run?

For the circle-packing problem, for example, a full run with Gemini 3 Pro costs about $10 in total.

🆚 How is LoongFlow related to OpenEvolve or AlphaEvolve?

OpenEvolve and AlphaEvolve explore evolutionary improvement through mutation and selection. LoongFlow builds on these ideas but introduces a higher-level abstraction:

A structured thinking and learning paradigm inspired by human experts.

Rather than optimizing mutations, LoongFlow focuses on how agents plan, execute, reflect, and accumulate experience across iterations.

🔧 Can I use my own LLM?

Yes! LoongFlow supports any OpenAI-compatible API:

  • Commercial: OpenAI, Google
  • Local: vLLM, sglang

Just set the llm_config in your config to point to your endpoint.
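
For example, a local vLLM or sglang server exposing an OpenAI-compatible endpoint could be configured in the same llm_config format shown in Quick Start; the URL, key, and model name below are placeholders for your own deployment.

# Example llm_config for a local OpenAI-compatible endpoint (all values are placeholders)
llm_config:
  url: "http://localhost:8000/v1"
  api_key: "EMPTY"
  model: "openai/your-local-model"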

🤝 Contribution

We welcome contributions! Here's how to get started:

  1. 🍴 Fork the repository
  2. 🌿 Create your feature branch: git checkout -b feat-amazing-feature
  3. ✨ Add your changes and tests
  4. 📝 Commit with a clear message
  5. 🚀 Push and create a Pull Request

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

💬 Contact

You are welcome to join our community on Discord or WeChat.

📜 License

LoongFlow is licensed under the Apache License 2.0.

📚 Citation

If you find this work useful, please consider citing:

@misc{LoongFlow2025,
      title={LoongFlow: Directed Evolutionary Search via a Cognitive Plan-Execute-Summarize Paradigm}, 
      author={Chunhui Wan and Xunan Dai and Zhuo Wang and Minglei Li and Yanpeng Wang and Yinan Mao and Yu Lan and Zhiwen Xiao},
      year={2025},
      eprint={2512.24077},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2512.24077}, 
}

🚀 Ready to build your expert agent?

Maintained by the LoongFlow community

If LoongFlow helps you, please consider starring this repository.