NeMo RL v0.7.0 Roadmap
ETA: June 30, 2026
This is a community-facing snapshot of the work targeted for the NeMo RL v0.7.0 release.
1. Training - AutoModel backend
| Feature | Status | Link |
| --- | --- | --- |
| MiniMax-M2.7 support | WIP | #2251 |
| DeepSeek V4 Flash support | WIP | #2331 |
| Gemma 4 AutoModel support | WIP | #2212, PR #2224 |
| Nemotron Nano v3 Omni AutoModel support | WIP | #2361, PR NVIDIA-NeMo/RL#2362 |
| Mistral 3.5 AutoModel support | WIP | #2542 |
2. Training - Megatron backend
| Feature | Status | Link |
| --- | --- | --- |
| GLM 5.1 GRPO support | WIP | #2377, PR #2489 |
| Kimi K2.6 support | WIP | issue #2412 |
| ModelOpt low-precision QAT / quantized checkpoint training | WIP | #1099 |
| E2E FP8 / long-context FP8 benchmark | WIP | |
3. Inference - vLLM backend
| Feature | Status | Link |
| --- | --- | --- |
| W4A16 with QAT | WIP | |
| vLLM 0.19.2 update to match TRT-LLM performance | WIP | |
| Router Replay Rollouts (R3) | WIP | |
4. Inference - Megatron backend
| Feature | Status | Link |
| --- | --- | --- |
| Numerics and speed features for dense models | WIP | |
5. Algorithm & Dataset
6. General Infra improvement
| Feature | Status | Link |
| --- | --- | --- |
| Control and data plane separation | WIP | #2414 |
| Generation trajectory checkpointing | WIP | #2415 |
| RDMA refit and delta refit | WIP | |
| NCCL reshard into NeMo RL | WIP | #2413 |
7. Performance improvements
| Feature | Status | Link |
| --- | --- | --- |
| SWE async RL benchmark with Qwen3.5 | WIP | |
| RL Training Perf features | WIP | |