Test script to verify new models compatibility with RL pipeline #2

@parthchadha

Description

Is your feature request related to a problem? Please describe.

Using different libraries or environments for training and generation can cause unexpected errors that are hard to debug. RL is robust enough that rewards can still look reasonable despite such mismatches, so observing rewards alone is not a sufficient check.

Describe the solution you'd like

We have already added an "Adding New Models" guide that explains the importance of tracking logprob errors. The guide should be accompanied by a script that users are expected to run whenever they bring a new model into the RL pipeline. The script should test whether the training and inference frameworks are compatible for that model.
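As a rough illustration of what such a script might check, here is a minimal sketch that compares per-token logprobs for the same sequence as reported by the training framework and by the inference framework. Everything in it is an assumption made for illustration: the function names `logprob_errors` and `check_compatible`, the choice of metrics, and the `1.05` threshold are hypothetical and are not taken from the guide or from any existing script. Obtaining the logprobs from each framework is left out.

```python
import math
from typing import Dict, Sequence


def logprob_errors(
    train_logprobs: Sequence[float],
    gen_logprobs: Sequence[float],
) -> Dict[str, float]:
    """Summarize the disagreement between two sets of per-token logprobs.

    Both inputs are assumed to cover the same tokens in the same order.
    """
    if len(train_logprobs) != len(gen_logprobs):
        raise ValueError("logprob sequences must have the same length")
    if len(train_logprobs) == 0:
        raise ValueError("logprob sequences must not be empty")

    abs_diffs = [abs(t - g) for t, g in zip(train_logprobs, gen_logprobs)]
    n = len(abs_diffs)
    return {
        # Worst single-token disagreement, in nats.
        "max_abs_diff": max(abs_diffs),
        # Average disagreement, in nats.
        "mean_abs_diff": sum(abs_diffs) / n,
        # Average ratio between the two probabilities; 1.0 means identical.
        "mean_mult_prob_error": sum(math.exp(d) for d in abs_diffs) / n,
    }


def check_compatible(
    train_logprobs: Sequence[float],
    gen_logprobs: Sequence[float],
    max_mult_error: float = 1.05,  # hypothetical threshold, tune per model
) -> bool:
    """Return True if the mean multiplicative error is within the threshold."""
    errors = logprob_errors(train_logprobs, gen_logprobs)
    return errors["mean_mult_prob_error"] <= max_mult_error
```

A real script would feed the same prompts and sampled tokens through both frameworks, collect the two logprob lists, and fail loudly when `check_compatible` returns `False`, rather than relying on reward curves.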

Labels

evaluation (Related to Evaluation), inference (Inference Related)
