[NeMo RL] v0.7.0 Release Roadmap #2591

@anwithk

NeMo RL v0.7.0 Roadmap

ETA: June 30, 2026

This is a community-facing snapshot of the work targeted for the NeMo RL v0.7.0 release.


1. Training - AutoModel backend

| Feature | Status | Link |
| --- | --- | --- |
| MiniMax-M2.7 support | WIP | #2251 |
| DeepSeek V4 Flash support | WIP | #2331 |
| Gemma 4 AutoModel support | WIP | #2212, PR #2224 |
| Nemotron Nano v3 Omni AutoModel support | WIP | #2361, PR NVIDIA-NeMo/RL#2362 |
| Mistral 3.5 AutoModel support | WIP | #2542 |

2. Training - Megatron backend

| Feature | Status | Link |
| --- | --- | --- |
| GLM 5.1 GRPO support | WIP | #2377, PR #2489 |
| Kimi K2.6 support | WIP | issue #2412 |
| ModelOpt low-precision QAT / quantized checkpoint training | WIP | #1099 |
| E2E FP8 / long-context FP8 benchmark | WIP | |

3. Inference - vLLM backend

| Feature | Status | Link |
| --- | --- | --- |
| W4A16 with QAT | WIP | |
| vLLM 0.19.2 update to match TRT-LLM performance | WIP | |
| Router Replay Rollouts (R3) | WIP | |

4. Inference - Megatron backend

| Feature | Status | Link |
| --- | --- | --- |
| Numerics and speed features for dense models | WIP | |

5. Algorithm & Dataset

| Feature | Status | Link |
| --- | --- | --- |
| PPO with MCore | WIP | #2048, PR #2530 |
| PPO with DTensor | WIP | #2046, #2047, #2048, draft PR NVIDIA-NeMo/RL#2027, PR #2530 |
| Multi-teacher off-policy distillation | WIP | #1700 |
| Cross-tokenizer distillation | WIP | issue #1827, PR #2508 |
| Hybrid reward with RLVR and reward model | WIP | |

6. General infrastructure improvements

| Feature | Status | Link |
| --- | --- | --- |
| Control and data plane separation | WIP | #2414 |
| Generation trajectory checkpointing | WIP | #2415 |
| RDMA refit and delta refit | WIP | |
| NCCL reshard into NeMo RL | WIP | #2413 |

7. Performance improvements

| Feature | Status | Link |
| --- | --- | --- |
| SWE async RL benchmark with Qwen3.5 | WIP | |
| RL Training Perf features | WIP | |
