
[Tracker][AsyncRL] Support asynchronous RL such as in-flight weight update #536

@CharlieFRuan

Design doc (details on how to configure fully async RL in SkyRL, and the implementations): https://skyrl.readthedocs.io/en/latest/tutorials/fully_async.html

This issue tracks support for fully async RL (synonymous with in-flight weight update and multi-turn partial rollout).

Recent literature and findings in the RL community (AReal, PipelineRL, ScaleRL, etc.) demonstrate the importance of asynchronous RL for agentic training.

SkyRL-train currently supports one-step-off-policy training, which lets training and rollout run concurrently (AReal paper, figure 1, right): https://github.com/NovaSky-AI/SkyRL/tree/4d5ec4d13777ea1e3a36784201987ac77f6c6fb4/skyrl-train/examples/async
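To make the one-step-off-policy idea concrete, here is a minimal sketch of the schedule under one common interpretation: the batch for step t+1 is generated while step t trains, so it sees the weights from before step t's update. The function name and the tuple layout are hypothetical and purely illustrative; this is not the SkyRL-train implementation.

```python
def one_step_off_policy_schedule(num_steps):
    """Toy schedule: which weight version generated the batch used at each training step.

    Assumption (illustrative only): rollout for the next batch overlaps with the
    current training step, so the next batch uses the pre-update weights.
    Returns a list of (step, trainer_version, batch_version, staleness).
    """
    schedule = []
    policy_version = 0
    # Batch 0 is generated with the initial weights before any training happens.
    pending_batch_version = policy_version
    for step in range(num_steps):
        batch_version = pending_batch_version
        # The next batch is generated concurrently with this training step,
        # so it is produced by the weights as they are before this step's update.
        pending_batch_version = policy_version
        staleness = policy_version - batch_version
        schedule.append((step, policy_version, batch_version, staleness))
        policy_version += 1  # training step finishes and publishes new weights
    return schedule


for row in one_step_off_policy_schedule(4):
    print(row)
```

Under these assumptions the staleness is 0 for the first step and exactly 1 afterwards, which is what bounds how far off-policy the training data can be.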

However, we would like to support more advanced schemes such as interruptible trajectories, also called partial rollout, where a single trajectory can be completed by multiple model versions (AReal paper, figure 3).
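As a rough illustration of partial rollout, the toy sketch below resumes the same trajectory after each weight update and records which weight version produced each token. `Trajectory`, `ToyEngine`, and `rollout_with_interrupts` are hypothetical names invented for this example, not SkyRL classes, and the fixed "interrupt every N tokens" rule is an assumption standing in for a real trainer publishing weights at arbitrary times.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """Toy trajectory that records which weight version produced each token."""
    prompt: str
    tokens: list = field(default_factory=list)
    versions: list = field(default_factory=list)  # one entry per generated token

    def done(self, max_tokens: int) -> bool:
        return len(self.tokens) >= max_tokens


class ToyEngine:
    """Hypothetical stand-in for an inference engine."""

    def __init__(self) -> None:
        self.weight_version = 0

    def update_weights(self, new_version: int) -> None:
        # In-flight update: swap weights without discarding unfinished trajectories.
        self.weight_version = new_version

    def generate(self, traj: Trajectory, budget: int, max_tokens: int) -> None:
        # Generate at most `budget` tokens, then return control (the "interrupt").
        for _ in range(budget):
            if traj.done(max_tokens):
                break
            traj.tokens.append(f"tok{len(traj.tokens)}")
            traj.versions.append(self.weight_version)


def rollout_with_interrupts(engine, traj, max_tokens, tokens_per_update):
    """Resume the same trajectory after every weight update until it finishes."""
    while not traj.done(max_tokens):
        engine.generate(traj, budget=tokens_per_update, max_tokens=max_tokens)
        if not traj.done(max_tokens):
            # New weights arrive mid-trajectory; the partial trajectory is kept.
            engine.update_weights(engine.weight_version + 1)
    return traj


traj = rollout_with_interrupts(
    ToyEngine(), Trajectory("hello"), max_tokens=5, tokens_per_update=2
)
print(traj.versions)  # [0, 0, 1, 1, 2]
```

Keeping a per-token version list is one way a trainer could later account for the mixed-policy data, for example when computing importance weights.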

We aim to support this feature for every CustomGenerator; that is, it should work out of the box for all agent harnesses (e.g. MiniSWEAgent, Terminus).

We plan to achieve this in the following steps.

TODO:

References:

