Support for LLaVA-like models in SFTTrainer / PPOTrainer / DPOTrainer

TRL's SFTTrainer supports LLaVA (Large Language and Vision Assistant), as described in [Vision Language Models Explained](https://huggingface.co/blog/vlms).

Are there any plans to add LLaVA support to PPOTrainer and DPOTrainer? If not, could someone explain the concerns about implementing it in those trainers, or suggest any alternatives? Thanks!