Does unsloth support/plan to support `RLOOTrainer`? #663

Hi, 

I recently came across this really interesting blog post, [Putting RL back in RLHF](https://huggingface.co/blog/putting_rl_back_in_rlhf_with_rloo).

It looks like unsloth [supports](https://huggingface.co/docs/trl/en/sft_trainer#accelerate-fine-tuning-2x-using-unsloth) many of the Hugging Face `Trainer` APIs. Does it support, or plan to support, the new [RLOOTrainer](https://huggingface.co/docs/trl/main/en/rloo_trainer)? (Possibly related: https://github.com/unslothai/unsloth/issues/392)

Many thanks for any help, and this amazing lib!!
