Simplify get_train_dataloader in GRPO and RLOO by albertvillanova · Pull Request #5276 · huggingface/trl

albertvillanova · 2026-03-12T10:13:32Z

Simplify get_train_dataloader in GRPO and RLOO.

This PR refactors the data loader creation logic in both the GRPOTrainer and RLOOTrainer classes to improve maintainability and reduce code duplication. The changes simplify the get_train_dataloader methods by delegating to a shared _get_dataloader function, and remove unused imports related to datasets and worker seeding.

Note that transformers.Trainer.get_train_dataloader has been refactored since v4.54.1, and we support transformers >=4.56.2, so the refactored version is always available:

Refactor get_XXX_dataloader from Trainer transformers#38090

Refactoring and maintainability improvements:

Simplified get_train_dataloader methods in both GRPOTrainer and RLOOTrainer by replacing custom logic with a call to the shared _get_dataloader function, making the code easier to maintain and reducing duplication.
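The delegation pattern described above can be sketched as follows. This is a minimal illustration and not the actual TRL or transformers code: `ToyBaseTrainer`, `ToyGenerationTrainer`, and the three-argument `_get_dataloader` are hypothetical stand-ins, and the "dataloader" is just a list of chunks. The one detail taken from this PR is that the subclass keeps its custom generation batch size (`_train_batch_size * steps_per_generation`) and delegates everything else.

```python
class ToyBaseTrainer:
    """Hypothetical stand-in for a base trainer that owns the shared dataloader logic."""

    def __init__(self, train_dataset, train_batch_size):
        self.train_dataset = train_dataset
        self._train_batch_size = train_batch_size

    def _get_dataloader(self, dataset, description, batch_size):
        # Assumed, simplified helper: a real implementation would also set up
        # the sampler, collator, and worker seeding. Here we only chunk a list.
        return [dataset[i : i + batch_size] for i in range(0, len(dataset), batch_size)]

    def get_train_dataloader(self):
        return self._get_dataloader(self.train_dataset, "Training", self._train_batch_size)


class ToyGenerationTrainer(ToyBaseTrainer):
    """Hypothetical stand-in for a trainer that generates once per several steps."""

    def __init__(self, train_dataset, train_batch_size, steps_per_generation):
        super().__init__(train_dataset, train_batch_size)
        self.steps_per_generation = steps_per_generation

    def get_train_dataloader(self):
        # Only the batch size differs from the base class; the construction
        # logic is no longer copy-pasted into the subclass.
        return self._get_dataloader(
            self.train_dataset,
            "Training",
            self._train_batch_size * self.steps_per_generation,
        )


trainer = ToyGenerationTrainer(list(range(16)), train_batch_size=2, steps_per_generation=4)
generation_batches = trainer.get_train_dataloader()
```

With a per-step batch size of 2 and 4 steps per generation, each generation batch in this sketch holds 8 samples, whereas the base class would yield batches of 2.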

Import cleanup:

Removed unused imports of datasets, DataLoader, partial, and seed_worker from both trainer files, as well as the is_datasets_available utility, since these are no longer needed after the refactor.

Note

Medium Risk
Touches the training input pipeline for GRPOTrainer and RLOOTrainer; while intended to be behavior-preserving, any mismatch with prior DataLoader params (sampler/drop_last/worker seeding) could subtly affect batching or reproducibility.
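To make the drop_last concern concrete, here is a small pure-Python sketch. It does not use a real DataLoader; `batch_indices` is a hypothetical helper that only mimics how a trailing partial batch is kept or dropped, under the assumption of sequential sampling.

```python
def batch_indices(n_samples, batch_size, drop_last):
    """Group sample indices 0..n_samples-1 into consecutive batches."""
    batches = [
        list(range(start, min(start + batch_size, n_samples)))
        for start in range(0, n_samples, batch_size)
    ]
    # Discard the trailing batch if it is incomplete and drop_last is set.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


kept = batch_indices(10, 4, drop_last=False)    # last batch is [8, 9]
dropped = batch_indices(10, 4, drop_last=True)  # partial batch removed
```

A silent change in such a flag would alter the number of batches per epoch, which is the kind of subtle difference the note warns about.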

Overview
Refactors GRPOTrainer and RLOOTrainer get_train_dataloader to delegate dataloader construction to the shared Trainer._get_dataloader, keeping the custom generation batch sizing (_train_batch_size * steps_per_generation) but removing the copy-pasted DataLoader setup.

Cleans up now-unused imports (datasets, DataLoader, seed_worker, partial, is_datasets_available) and trims the associated maintenance note text.

Written by Cursor Bugbot for commit 01b88cd.

HuggingFaceDocBuilderDev · 2026-03-12T10:16:26Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec

Great!

Simplify get_train_dataloader in GRPO and RLOO

01b88cd

qgallouedec approved these changes Mar 12, 2026

albertvillanova merged commit 556646a into huggingface:main Mar 13, 2026
12 checks passed

albertvillanova mentioned this pull request Mar 16, 2026

Remove custom get_train/eval_dataloader from OnlineDPO #5291

Merged

songhappy pushed a commit to songhappy/trl that referenced this pull request Apr 20, 2026

Simplify get_train_dataloader in GRPO and RLOO (huggingface#5276)

29ef8cb

albertvillanova merged 1 commit into huggingface:main from albertvillanova:simplify-get-train-dataloader
