Integrate SGLang into OpenRLHF. Non-Hybrid Engine Only (#661)
zhaochenyang20 wants to merge 19 commits into OpenRLHF:main
Conversation
Since @zhaochenyang20 asked me on Slack to review this, I very briefly glanced at the code (picked a file at random, experience_maker.py) but was too tired to continue today.
Anyway, there is a small nit below, and also a small proposal, sgl-project/sglang#2818, that will hopefully be a bit helpful.
```python
all_outputs = sum(ray.get(all_output_refs), [])
assert len(all_outputs) == len(all_prompts)
pad_token_id, eos_token_id = self.tokenizer.pad_token_id, self.tokenizer.eos_token_id
try:
```
nit: it may be a bit better not to use try-except here because
- if sglang/vllm's output format changes in the future, a non-error may become an error and vice versa, and the behavior would change silently
- `except` without a specified exception type catches everything here, including real bugs
I also do not like try-except, but do I have any other choice? Like checking the type of the first output?
I guess one way might be to check `config.engine == 'sglang'` (and pass the config around to here), or maybe `inference_engine.get_mode() == 'sglang'` (adding a method on `inference_engine` that reports its mode; this may have a larger overhead if the engine is on another Ray actor).
Yes, this part should be refactored.
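The refactor discussed in this thread could be sketched as follows. This is a hypothetical illustration, not OpenRLHF's actual API: the `backend` string and the output shapes assumed for each engine are placeholders for this sketch.

```python
# Hypothetical sketch: branch on a configured backend name instead of
# wrapping output parsing in a bare try/except. The output shapes below
# (dicts for SGLang, RequestOutput-like objects for vLLM) are assumptions.

def extract_texts(all_outputs, backend):
    if backend == "sglang":
        # assume SGLang returns plain dicts with a "text" field
        return [o["text"] for o in all_outputs]
    if backend == "vllm":
        # assume vLLM-style objects exposing .outputs[0].text
        return [o.outputs[0].text for o in all_outputs]
    raise ValueError(f"unknown inference backend: {backend!r}")
```

With an explicit branch, a future format change in either engine fails loudly at the mismatched accessor instead of being silently swallowed by a bare `except`.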
great job!
fix: streamline the OpenRLHF-SGLang integration
openrlhf/cli/batch_inference.py
```python
dummy_strategy.print = print
dummy_strategy.is_rank_0 = lambda: True
dummy_strategy.args = args
strategy = Empty()
```
Just set `strategy` to `None` for vLLM and SGLang.
By the way, for the reward model please use the DeepSpeed strategy.
I just copied this from main.
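The suggestion above could be sketched like this. It is a minimal illustration, assuming a `pick_strategy` helper and a caller-supplied factory; neither name is part of OpenRLHF.

```python
# Hypothetical sketch of the reviewer's suggestion: pure-inference
# backends (vLLM/SGLang) get no training strategy at all, while
# reward-model scoring uses a DeepSpeed strategy built by a factory
# the caller supplies. Names here are illustrative.

def pick_strategy(backend, deepspeed_factory):
    if backend in ("vllm", "sglang"):
        return None  # generation engines manage their own runtime
    return deepspeed_factory()  # e.g. a DeepSpeed strategy for the reward model
```

This avoids the `Empty()` dummy object entirely for the generation paths.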
delete this print
split sglang and vllm into two files and provide different
delete this
…e_llm_ray_actor function.
…ainer/ppo_utils/experience_maker.py
Split LLM ray actor
@xiaoxigua999 @hijkzzz This PR has been fully verified regarding accuracy and speed. Is it possible to merge this?
This PR supports SGLang as an inference engine for the PPO actor and makes relevant changes. Detailed usage can be found in:
https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/OpenRLHF/openrlhf-sglang.md