Road to v1 #4374

This issue tracks the tasks that must be completed before we reach v1. The list is not final: it is updated as discussions and progress continue.

## Documentation

- [x] Remove `how_to_train.md` #4267 
- [x] Remove `using_llama_models.md` #4268
- [x] Remove `logging.md` #4269
- [x] #4376
- [x] #4375 
- [x] #4377 
- [x] #4378
- [x] #4379 
- [x] #4381
- [x] #4382
- [x] #4383
- [x] #4384
- [x] #4385
- [x] #4386
- [x] #4396
- [x] #4397
- [x] #4407

## Examples

- [x] #4399
- [x] #4404

## Tests

- [x] #4401

## Main codebase

- [x] Add accuracy reward to the `trl.rewards` module https://github.com/huggingface/trl/pull/4270
- [x] Add an option (defaulting to `True`) to use `RichProgressCallback` in the scripts (`trl.scripts`).
- [x] #4398
- [x] Remove `log_example_reports.py` #4241
- [x] Remove `commands` directory #4258 
- [x] Remove `examples/research_projects` #4258
- [x] Remove `trl.extra.dataset_formatting` #4242 #4651
- [ ] #4387 (we may drop support for FSDP1 post v1)
- [x] Remove `BestOfNSampler`. #4291 #4301
- [x] #4380
- [x] #4403
- [x] Refactor DPO to align implementation with SFT (WIP in #3906)
- [x] Tool calling for GRPO/RLOO (WIP in #4300)
- [x] ~Async generation for Online methods~
- [x] ~Bump transformers to v5~ (we will keep v4 compatibility)
- [ ] #4402 (we may do this post v1)

## Moving experimental features to experimental submodule

Discussed in #4223 for trainers

- [x] Move BCO to experimental submodule #4312
- [x] Move KTO to experimental submodule #4575
- [x] Move Nash-MD to experimental submodule #4477
- [x] Move ORPO to experimental submodule #4480
- [x] #4472
- [x] Move PPO to experimental submodule #4482 
- [x] Move PRM to experimental submodule #4483 
- [x] Move XPO to experimental submodule #4485 
- [x] #4395
- [x] #4400
- [x] Move Winrate callback to experimental #4558 



