Hi,
I recently came across this really interesting blog post, "Putting RL back in RLHF".
It looks like unsloth supports many of the Hugging Face Trainer APIs, so I was wondering whether it supports, or plans to support, the new RLOOTrainer? (Possibly related: #392)
Many thanks for any help, and this amazing lib!!