🤗 HuggingFace | 🤖 ModelScope | 📰 Blog | 📑 Paper
qqr (a.k.a. hilichurl) is a lightweight, non-intrusive extension for slime. It seamlessly integrates the Model Context Protocol (MCP) standard to enable the evolution of open-ended agents via ArenaRL.
- ArenaRL Algorithm: Full implementation of the core algorithms described in the paper. It includes built-in topologies for Anchor-Based, Round-Robin, Swiss-System, Double-Elimination, and Seeded Single-Elimination tournaments.
- Built for Open-Ended Agents: Specifically engineered to tackle discriminative collapse in complex, open-ended tasks, ensuring continuous policy improvement via relative ranking even when reward model scores stagnate.
- MCP Support: Seamless integration with the MCP standard decouples LLM inference from tool environments. Developers can reuse existing MCP Servers as training environments without rewriting interfaces.
- High-Performance Training: Built on top of slime (tested with v0.2.1) to deliver high-throughput, distributed rollout generation and training for large-scale agent evolution.
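The round-robin topology and the relative-ranking idea from the list above can be illustrated with a minimal sketch. This is not qqr's implementation or the paper's exact algorithm: the names `round_robin_wins`, `rank_advantages`, and `toy_judge` are hypothetical, the judge is a stand-in for an LLM-based pairwise comparison, and standardizing win counts within a group is only one assumed way to turn a tournament ranking into a training signal.

```python
from itertools import combinations
from statistics import mean, pstdev
from typing import Callable, Sequence


def round_robin_wins(
    candidates: Sequence[str], judge: Callable[[str, str], int]
) -> list[float]:
    """Play every pair once and count wins.

    judge(a, b) returns 1 if a wins, -1 if b wins, and 0 for a tie.
    A tie gives each side half a win.
    """
    wins = [0.0] * len(candidates)
    for i, j in combinations(range(len(candidates)), 2):
        outcome = judge(candidates[i], candidates[j])
        if outcome > 0:
            wins[i] += 1.0
        elif outcome < 0:
            wins[j] += 1.0
        else:
            wins[i] += 0.5
            wins[j] += 0.5
    return wins


def rank_advantages(wins: Sequence[float]) -> list[float]:
    """Standardize win counts within the group (zero mean, unit variance).

    Assumption: this normalization is illustrative, not taken from the paper.
    """
    mu = mean(wins)
    sigma = pstdev(wins)
    if sigma == 0:
        # Every candidate performed identically, so there is no signal.
        return [0.0] * len(wins)
    return [(w - mu) / sigma for w in wins]


def toy_judge(a: str, b: str) -> int:
    """Hypothetical stand-in for an LLM judge: prefers the longer response."""
    return (len(a) > len(b)) - (len(a) < len(b))


if __name__ == "__main__":
    group = ["plan", "plan with hotels", "plan with hotels and transit", "n/a"]
    wins = round_robin_wins(group, toy_judge)
    print(wins)  # [1.0, 2.0, 3.0, 0.0]
    print(rank_advantages(wins))
```

Because the signal comes from pairwise outcomes inside the group rather than from absolute scores, candidates that a pointwise scorer would rate almost identically can still be separated. A round-robin needs a number of comparisons quadratic in the group size, which is one reason a tournament framework might also offer cheaper topologies such as the elimination-style ones listed above.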
To get started, first ensure slime is installed (refer to Quick Start). Then install qqr from source:
```bash
git clone https://github.com/Alibaba-NLP/qqr.git
cd qqr
pip install -e .
```
Run the travel experiment quickly with the following command:
```bash
bash scripts/travel/run-qwen3-8B.sh
```
You can configure the experiment in qqr/examples/travel/config.py.
qqr builds on the following open-source projects:
- slime: for providing a powerful post-training framework.
- openai-agents-python: for providing excellent MCP interfaces.
If you use qqr or the ArenaRL algorithm in your research, please cite our paper:
@misc{zhang2026arenarlscalingrlopenended,
title={ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking},
author={Qiang Zhang and Boli Chen and Fanrui Zhang and Ruixue Ding and Shihang Wang and Qiuchen Wang and Yinfeng Huang and Haonan Zhang and Rongxiang Zhu and Pengyong Wang and Ailin Ren and Xin Li and Pengjun Xie and Jiawei Liu and Ning Guo and Jingren Zhou and Zheng-Jun Zha},
year={2026},
eprint={2601.06487},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2601.06487},
}