Integrate SGLang into OpenRLHF. Non-Hybrid Engine Only #661

Open
zhaochenyang20 wants to merge 19 commits into OpenRLHF:main from zhaochenyang20:dev_pr

Conversation

@zhaochenyang20

This PR supports SGLang as an inference engine for the PPO actor and makes relevant changes. Detailed usage can be found in:

https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial/blob/main/rlhf/OpenRLHF/openrlhf-sglang.md


@fzyzcjy left a comment


Since @zhaochenyang20 asked me on Slack to review this, I very briefly glanced at the code (I just picked a random file, experience_maker.py) but was too tired to continue today.

Anyway, there is a small nit below, and also a small proposal, sgl-project/sglang#2818, that will hopefully be a little helpful.

all_outputs = sum(ray.get(all_output_refs), [])
assert len(all_outputs) == len(all_prompts)
pad_token_id, eos_token_id = self.tokenizer.pad_token_id, self.tokenizer.eos_token_id
try:

@fzyzcjy Jan 9, 2025


nit: it may be a bit better not to use try-except here, because:

  • if SGLang's or vLLM's output format changes in the future, a non-error may become an error and vice versa, and the behavior would change silently
  • a bare except (with no exception type specified) catches everything here, including real bugs

@zhaochenyang20 (Author)


I also do not like try-except, but do I have other choices? Like checking the type of the first output?


@fzyzcjy Jan 9, 2025


I guess one way would be to check whether config.engine == 'sglang' (and pass the config through to here), or maybe whether inference_engine.get_mode() == 'sglang' (and add a method on inference_engine that reports its mode; but that way may have a larger overhead if the engine lives on another Ray actor).
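
A minimal sketch of the first option, dispatching on an explicit backend flag instead of a bare try-except. The flag name and the per-engine output fields below are assumptions for illustration, not the actual OpenRLHF/vLLM/SGLang APIs:

def extract_token_ids(output, backend: str):
    # Hypothetical accessors; the real field names depend on the vLLM/SGLang versions in use.
    if backend == "vllm":
        return output.outputs[0].token_ids
    if backend == "sglang":
        return output["output_ids"]
    raise ValueError(f"unknown inference backend: {backend}")

This keeps format mismatches loud (an unknown backend raises immediately) instead of being silently swallowed by an except branch.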

Collaborator


Yes, this part should be refactored.

@catqaq
Collaborator

catqaq commented Jan 12, 2025

great job!

dummy_strategy.print = print
dummy_strategy.is_rank_0 = lambda: True
dummy_strategy.args = args
strategy = Empty()
Collaborator


Just set strategy to None for vLLM and SGLang.
Btw, for the reward model, please use the DeepSpeed strategy.
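
A minimal sketch of that suggestion, assuming the backend name is carried on args (the args.engine field is illustrative; get_strategy is the helper OpenRLHF's training scripts use, assuming it is importable here):

from openrlhf.utils import get_strategy

if args.engine in ("vllm", "sglang"):
    strategy = None  # inference-only engines don't need a training strategy
else:
    strategy = get_strategy(args)  # e.g. the DeepSpeed strategy, as used for the reward model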

@zhaochenyang20 (Author)


I just copied this from main.

@zhaochenyang20 (Author)

delete this print: print("os.environ['LOCAL_RANK']", os.environ["LOCAL_RANK"])

@zhaochenyang20 (Author)

Split SGLang and vLLM into two files and provide separate create_inference_engines functions.
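
A rough sketch of that layout, with one factory per backend module; the SGLang module name and the exact signatures are assumptions for illustration, not necessarily what this PR ends up with:

# vllm_engine.py -- factory for vLLM-backed Ray actors (illustrative signature)
def create_vllm_engines(num_engines: int, tensor_parallel_size: int, pretrain: str, seed: int):
    """Create num_engines Ray actors, each wrapping a vLLM engine."""
    ...

# sglang_engine.py -- factory for SGLang-backed Ray actors (illustrative signature)
def create_sglang_engines(num_engines: int, tensor_parallel_size: int, pretrain: str, seed: int):
    """Create num_engines Ray actors, each wrapping an SGLang engine."""
    ...

The caller then imports whichever factory matches the configured backend, which also gives a natural place to record the engine type for the dispatch discussed above.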

@zhaochenyang20 (Author)

delete this torch.cuda.synchronize()

@zhaochenyang20 changed the title from "[WIP] Integrate SGLang into OpenRLHF" to "Integrate SGLang into OpenRLHF. Non-Hybrid Engine Only" on Jan 28, 2025
@merrymercy

@xiaoxigua999 @hijkzzz This PR has been fully verified regarding accuracy and speed. Is it possible to merge this?
