🪽 AgentFly

Training scalable LLM agents with RL (multi-turn, asynchronous tools/rewards, multimodal)

Overall Structure

Resources

AgentFly Paper 📜 | GitHub Repo 💻 | Weights & Biases 📈 | Models 🤗 | Tutorials 📚

AgentFly: Extensible and Scalable Reinforcement Learning for LM Agents

Language model (LM) agents have gained significant attention for their ability to autonomously complete tasks through interactions with environments, tools, and APIs. LM agents are primarily built with prompt engineering or supervised finetuning. At the same time, reinforcement learning (RL) has been explored to enhance LM capabilities, such as reasoning and factuality. However, the combination of LM agents and reinforcement learning (Agent-RL) remains underexplored and lacks systematic study. To this end, we built AgentFly, a scalable and extensible Agent-RL framework designed to empower LM agents with a variety of RL algorithms...

Read Paper

WandB

View the training curves, parameters, rewards, and trajectories on Weights & Biases.

Training

HuggingFace

Check out the models on Hugging Face, including agents for code interpreter, retrieval, ScienceWorld, WebShop, and other tasks.

Explore Model

Tutorials

Check out the tutorials on how to build agents, tools, rewards, and start training.

Read More