NexRL is a production-ready, distributed LLM post-training framework defined by its ultra-loosely-coupled philosophy. Its service-oriented architecture provides maximum flexibility and extensibility while maintaining clean abstractions and ease of use.
- [2026.02.03] We release support for On-Policy Distillation. Try it with the recipe.
- [2026.01.23] Weaver v1.1.0 is released with full fine-tuning support! Try full fine-tuning with Weaver using NexRL (recipe).
- [2026.01.21] We release a blog about the design and features of NexRL. Check it out!
- [2026.01.15] NexRL v1.0.0 is here! Train NexAU agents with zero code modification—just configs and evaluators. New training-service mode supports Weaver and Tinker APIs for effortless cloud training.
- [2025.11.18] NexRL goes open-source! Pre-release version now available.
- Training-as-a-Service & Rollout-as-a-Service: Unified API architecture that seamlessly supports different training and inference frameworks through service abstraction. Switch between training backends (FSDP, Megatron, etc.) and inference engines (SGLang, vLLM, TGI, etc.) without modifying your code.
- Decoupled Modular Architecture: Clean separation of concerns with well-defined interfaces and extensible components. Each module operates independently, enabling easy customization and maintenance.
- Zero-Code Agent-Training Support: Agents can seamlessly integrate with RL training without any RL-specific code modifications.
- Intelligent Resource Management: Configurable placement and co-location of services for optimal performance in distributed environments.
- Comprehensive Monitoring: Built-in activity tracking and health checking system for production deployments.
- Robust Error Handling: Centralized error reporting and recovery mechanisms for production reliability.
NexRL follows a modular architecture where components communicate through explicit interfaces and APIs:
Core Components:
- DataLoader: Provides training data (supports custom datasets)
- RolloutWorker: Executes environment interactions (your agent goes here!)
- TrajectoryPool: Manages trajectory collection and batching
- Trainer: Applies algorithm logic (e.g., GRPO) and coordinates training through service APIs
- WeightSyncController: Manages model weight synchronization between training and inference
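As an illustrative sketch of how these components compose, the toy loop below wires a data loader, rollout worker, trajectory pool, and trainer together. All class and method names here are hypothetical stand-ins, not NexRL's actual API:

```python
# Illustrative sketch only: the class and method names below are
# hypothetical stand-ins for NexRL's real components.

class DataLoader:
    """Yields prompts from a dataset."""
    def __init__(self, prompts):
        self.prompts = prompts
    def __iter__(self):
        return iter(self.prompts)

class RolloutWorker:
    """Runs the agent/environment interaction and returns one trajectory."""
    def rollout(self, prompt):
        return {"prompt": prompt, "response": f"answer to {prompt}", "reward": 1.0}

class TrajectoryPool:
    """Collects trajectories until a full training batch is available."""
    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []
    def push(self, traj):
        self.buffer.append(traj)
    def pop_batch(self):
        if len(self.buffer) < self.batch_size:
            return None
        batch = self.buffer[:self.batch_size]
        self.buffer = self.buffer[self.batch_size:]
        return batch

class Trainer:
    """Applies the algorithm logic (e.g. GRPO) to each batch."""
    def __init__(self):
        self.steps = 0
    def train_step(self, batch):
        # A real trainer would call the Train Service's forward_backward() here.
        self.steps += 1

loader = DataLoader(["q1", "q2", "q3", "q4"])
worker, pool, trainer = RolloutWorker(), TrajectoryPool(batch_size=2), Trainer()
for prompt in loader:
    pool.push(worker.rollout(prompt))
    batch = pool.pop_batch()
    if batch is not None:
        trainer.train_step(batch)
print(trainer.steps)  # 2 (four prompts form two batches of two)
```

In the real framework these pieces run as separate, decoupled modules; the point of the sketch is only the data flow between them.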
Services:
- Inference Service: Adopts the standard OpenAI API as the unified interaction interface with inference engines. This API-centric design ensures that the upper-layer modules can interact with various inference engines (such as SGLang, vLLM, etc.) in a consistent manner, eliminating the need for code modifications when switching between different inference engines.
- Train Service: Utilizes standardized forward() and forward_backward() APIs to communicate with different training backends (including FSDP, Megatron, etc.). To achieve compatibility with diverse backends, we implement lightweight adapters tailored for each backend. These adapters translate the standardized API calls into backend-specific operations, enabling seamless switching of training backends without altering the core training logic.
- Agent Service: Provides a streamlined integration path for agents to participate in RL training. Agents can directly push generated trajectories into the TrajectoryPool through this service, eliminating the need for developers to rewrite or modify agent code to adapt to RL training requirements.
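The adapter idea behind the Train Service can be sketched as follows. The adapter and backend names below are illustrative inventions, not NexRL's actual classes; the point is that the core training logic only ever sees the standardized forward()/forward_backward() API:

```python
# Illustrative sketch of the Train Service adapter pattern.
# Class names and return values are hypothetical, not NexRL's real API.
from abc import ABC, abstractmethod

class TrainBackendAdapter(ABC):
    """Translates the standardized API calls into backend-specific operations."""
    @abstractmethod
    def forward(self, batch): ...
    @abstractmethod
    def forward_backward(self, batch): ...

class FSDPAdapter(TrainBackendAdapter):
    def forward(self, batch):
        return {"backend": "fsdp", "loss": 0.5}
    def forward_backward(self, batch):
        return {"backend": "fsdp", "loss": 0.5, "grads_applied": True}

class MegatronAdapter(TrainBackendAdapter):
    def forward(self, batch):
        return {"backend": "megatron", "loss": 0.5}
    def forward_backward(self, batch):
        return {"backend": "megatron", "loss": 0.5, "grads_applied": True}

def train_step(adapter: TrainBackendAdapter, batch):
    # Core training logic never branches on the backend type; swapping
    # FSDP for Megatron means swapping the adapter, nothing else.
    return adapter.forward_backward(batch)

print(train_step(FSDPAdapter(), [])["backend"])      # fsdp
print(train_step(MegatronAdapter(), [])["backend"])  # megatron
```

Swapping training backends then reduces to constructing a different adapter, which is exactly the "no changes to core training logic" property described above.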
- Python 3.12+
- CUDA 12.8+ (for GPU support)
- Ray 2.48+ (for distributed mode)
- kubectl installed and configured
- Access to a Kubernetes cluster
- Volcano Scheduler installed in the cluster
- High-performance network file system, e.g., GPFS
Check pyproject.toml for the full dependency list.
Install NexRL:
git clone git@github.com:nex-agi/NexRL.git
cd NexRL
# Full install (training + all core dependencies)
pip install -e ".[core]"
# Or lightweight install (CLI job submission only, no torch/ray/etc.)
pip install -e .
Zero-Setup (Quickest!)
Run immediately with built-in defaults:
nexrl -m self-hosted \
-c recipe/math/self_hosted.yaml \
  --run-nexrl
nexrl -m training-service \
  -c recipe/math/tinker.yaml \
  --run-nexrl
Uses public images (nexagi/nexrl:v1.4.0, lmsysorg/sglang:v0.5.4.post2) and /tmp storage - perfect for testing!
Development Setup
Use environment variables for quick configuration:
# Option 1: Use the provided setup script
source cli/setup_env.sh
# Option 2: Set variables manually
export NEXRL_STORAGE_PATH="/your/persistent/storage"
export NEXRL_WORKER_IMAGE="your-registry/nexrl:tag"
export WANDB_KEY="your-wandb-key"
# Then run
nexrl -m self-hosted -c recipe/your_recipe.yaml --run-nexrl
Production Setup
Configure cluster with custom images and persistent storage:
# Edit and apply ConfigMaps (one-time setup)
kubectl apply -f cli/setup/01-namespace.yaml
kubectl apply -f cli/setup/02-admin-config.yaml # Edit first!
kubectl apply -f cli/setup/03-user-config.yaml # Edit first!
# Run with production config
nexrl -m self-hosted \
-c recipe/single_turn_math_qwen_2a5_7b/single_turn_math_qwen2a5_7b.yaml \
  --run-nexrl --tag prod-v1
CLI Options:
- -m, --mode: self-hosted or training-service (required)
- -c, --train-config: Path to training YAML (required)
- -r, --run-nexrl: Auto-start training
- -t, --tag: Custom job tag
- --serving-only: [self-hosted] Only launch inference
- --no-serving: [self-hosted] Skip inference
Configuration Priority:
- Kubernetes ConfigMaps (production) → kubectl apply -f cli/setup/
- Environment Variables (development) → source cli/setup_env.sh or export NEXRL_*
- Built-in Defaults (testing) → public images, /tmp storage
Key Variables:
- NEXRL_STORAGE_PATH: Storage path (default: /tmp/nexrl)
- NEXRL_WORKER_IMAGE: Worker image (default: nexagi/nexrl:v1.4.0)
- NEXRL_CONTROLLER_IMAGE: Controller image (default: nexagi/nexrl:v1.4.0)
- NEXRL_INFERENCE_IMAGE: Inference image (default: lmsysorg/sglang:v0.5.4.post2)
- WANDB_KEY: WandB API key (optional)
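The precedence described above (environment variables override built-in defaults) can be sketched as a simple fall-through lookup. The defaults below mirror the table; the helper function itself is illustrative, not part of NexRL's CLI:

```python
# Illustrative sketch of default resolution; resolve() is not a real
# NexRL helper. ConfigMaps would take precedence in a cluster (omitted here).
import os

DEFAULTS = {
    "NEXRL_STORAGE_PATH": "/tmp/nexrl",
    "NEXRL_WORKER_IMAGE": "nexagi/nexrl:v1.4.0",
    "NEXRL_CONTROLLER_IMAGE": "nexagi/nexrl:v1.4.0",
    "NEXRL_INFERENCE_IMAGE": "lmsysorg/sglang:v0.5.4.post2",
}

def resolve(name: str) -> str:
    """Return the environment variable if set, otherwise the built-in default."""
    return os.environ.get(name, DEFAULTS[name])

os.environ["NEXRL_STORAGE_PATH"] = "/data/nexrl"  # a development override
print(resolve("NEXRL_STORAGE_PATH"))   # /data/nexrl
print(resolve("NEXRL_WORKER_IMAGE"))   # nexagi/nexrl:v1.4.0
```

This is why the zero-setup mode works out of the box: with nothing configured, every lookup falls through to a public image or /tmp storage.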
See also: cli/README.md for comprehensive documentation.
- User Guide: Complete guide for developing and integrating RL algorithms. Train NexAU agents with zero code modification—just provide configuration files and task-specific evaluators.
- Developer Guide: Comprehensive documentation on architecture, APIs, and advanced usage
- Configuration Examples: Ready-to-use training recipes for various models and tasks
- Test Suite: Testing guide and examples
This release represents a foundational version of NexRL, designed to demonstrate our loosely-coupled and service-oriented architecture. We are actively working on preparing the code for open source and will release more of our work soon, including:
- More model & agent support
- Additional training and inference backend integrations
- High-performance weight synchronization
- Post-training algorithm exploration
- More usability tools
- ...
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
NexRL aims for ultimate scalability and usability, fully embracing the open-source ecosystem to minimize code adaptation costs and improve experimental efficiency. NexRL is built upon several excellent open-source frameworks, including vLLM, SGLang, FSDP, Megatron, and VeRL (the adapter for the FSDP backend adopts the implementation from VeRL). Additionally, the zero-code agent development design of the Agent Service is inspired by Agent Lightning.
