Test script to verify new models' compatibility with the RL pipeline #2

**Is your feature request related to a problem? Please describe.**

Using different libraries or environments for training and generation can introduce unexpected errors that are hard to debug. RL is robust enough to mask such mismatches, so observing rewards alone is not a sufficient check.

**Describe the solution you'd like**

We have already added an "Adding New Models" guide that explains the importance of tracking logprob errors. The guide should be accompanied by a script that users are expected to run every time they bring a new model into the RL pipeline. The script should test whether the training and inference frameworks are compatible for that model.
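As a rough illustration of what such a check could compute, here is a minimal sketch. It assumes the script has already collected per-token logprobs for the same token sequence from both the training framework and the inference framework. The names `compare_logprobs` and `LogprobReport`, and the default threshold of `0.05`, are hypothetical and chosen only for this example; they are not taken from the guide or from any existing code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LogprobReport:
    """Summary of the disagreement between two sets of per-token logprobs."""
    mean_abs_error: float
    max_abs_error: float
    passed: bool


def compare_logprobs(
    train_logprobs: Sequence[float],
    gen_logprobs: Sequence[float],
    max_mean_abs_error: float = 0.05,  # hypothetical threshold, tune per model
) -> LogprobReport:
    """Compare logprobs of the same tokens as scored by two frameworks.

    Both sequences must be aligned token by token: entry i in each list
    is the logprob assigned to the same token in the same context.
    """
    if len(train_logprobs) != len(gen_logprobs):
        raise ValueError("logprob sequences must have the same length")
    if len(train_logprobs) == 0:
        raise ValueError("logprob sequences must not be empty")

    # Absolute per-token difference between the two frameworks.
    abs_errors = [abs(t - g) for t, g in zip(train_logprobs, gen_logprobs)]
    mean_abs = sum(abs_errors) / len(abs_errors)
    max_abs = max(abs_errors)

    return LogprobReport(
        mean_abs_error=mean_abs,
        max_abs_error=max_abs,
        passed=mean_abs <= max_mean_abs_error,
    )


if __name__ == "__main__":
    # Made-up numbers for illustration only, not real model output.
    train = [-1.20, -0.35, -2.10]
    gen = [-1.21, -0.34, -2.12]
    print(compare_logprobs(train, gen))
```

A real script would also need to load the model in both frameworks and generate the logprobs on a fixed set of prompts. That part depends on the specific frameworks and is left out here.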

