Simplify get_train_dataloader in GRPO and RLOO#5276
Merged
albertvillanova merged 1 commit intoMar 13, 2026
Merged
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
songhappy
pushed a commit
to songhappy/trl
that referenced
this pull request
Apr 20, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Simplify
get_train_dataloaderin GRPO and RLOO.This PR refactors the data loader creation logic in both the
GRPOTrainerandRLOOTrainerclasses to improve maintainability and reduce code duplication. The changes simplify theget_train_dataloadermethods by delegating to a shared_get_dataloaderfunction, and remove unused imports related to datasets and worker seeding.Note that
transformers.Trainer.get_train_dataloaderis refactored since v4.54.1 and we support >=4.56.2:get_XXX_dataloaderfrom Trainer transformers#38090Refactoring and maintainability improvements:
get_train_dataloadermethods in bothGRPOTrainerandRLOOTrainerby replacing custom logic with a call to the shared_get_dataloaderfunction, making the code easier to maintain and reducing duplication.Import cleanup:
datasets,DataLoader,partial, andseed_workerfrom both trainer files, as well as theis_datasets_availableutility, since these are no longer needed after the refactor.Note
Medium Risk
Touches the training input pipeline for
GRPOTrainerandRLOOTrainer; while intended to be behavior-preserving, any mismatch with priorDataLoaderparams (sampler/drop_last/worker seeding) could subtly affect batching or reproducibility.Overview
Refactors
GRPOTrainerandRLOOTrainerget_train_dataloaderto delegate dataloader construction to the sharedTrainer._get_dataloader, keeping the custom generation batch sizing (_train_batch_size * steps_per_generation) but removing the copy-pastedDataLoadersetup.Cleans up now-unused imports (
datasets,DataLoader,seed_worker,partial,is_datasets_available) and trims the associated maintenance note text.Written by Cursor Bugbot for commit 01b88cd. This will update automatically on new commits. Configure here.