Add async tool-enabled vLLM server for GRPO training via OpenAI-compatible interface #3469
BjarniHaukur wants to merge 68 commits into
Conversation
thank you for this PR!!
Hi @BjarniHaukur, thank you for the PR! We're now looking to integrate environments in TRL, so would you like to rebase your branch on
Hey @BjarniHaukur, thanks for opening this and for the early push toward async, OpenAI-compatible, tool-enabled rollouts. Most of what this PR set out to do has landed independently, just in a different shape:
Going to close this PR (and #3284) as superseded, but want to be clear that "superseded" here means your proposal was correct and the rest of the project caught up to it, not that the idea was wrong. Thanks for being early on this! :)
What does this PR do?
This PR adds a new `vllm_serve_async.py` script to TRL. It:
- … `vllm_serve.py`
- … `vllm.entrypoints.openai.api_server`
- … `rollout_func` interface that lets users define custom input/output structures and tool definitions to forward into reward functions

Fixes #3284
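
To make the `rollout_func` idea more concrete, here is a minimal sketch of how a user-defined rollout function could attach custom fields and tool definitions to its outputs and forward them to a reward function. Everything in it is an assumption for illustration: `Rollout`, `fake_generate`, `tool_use_reward`, and the signature of `rollout_func` itself are hypothetical and are not taken from this PR's diff or from TRL. The generation step is a local stub standing in for a request to an OpenAI-compatible server, so no network call is made.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Rollout:
    """Hypothetical container: one prompt, its completion, and custom extras."""
    prompt: str
    completion: str
    extra: dict[str, Any] = field(default_factory=dict)  # forwarded to rewards


# Tool definition in the OpenAI "tools" JSON-schema style.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two integers.",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
    },
}]


def fake_generate(prompt: str, tools: list[dict]) -> dict[str, Any]:
    """Stub for a request to an OpenAI-compatible server (assumption)."""
    used = [tools[0]["function"]["name"]] if "+" in prompt else []
    return {"text": f"echo: {prompt}", "tool_calls": used}


def rollout_func(
    prompts: list[str],
    generate: Callable[[str, list[dict]], dict[str, Any]] = fake_generate,
) -> list[Rollout]:
    """Hypothetical rollout function: generate, then keep custom fields."""
    rollouts = []
    for prompt in prompts:
        out = generate(prompt, TOOLS)
        rollouts.append(
            Rollout(prompt, out["text"], {"tool_calls": out["tool_calls"]})
        )
    return rollouts


def tool_use_reward(rollouts: list[Rollout]) -> list[float]:
    """Toy reward: 1.0 if the rollout recorded any tool call, else 0.0."""
    return [1.0 if r.extra["tool_calls"] else 0.0 for r in rollouts]


rollouts = rollout_func(["2 + 3", "hello"])
rewards = tool_use_reward(rollouts)
```

In a real setup the stub would presumably be replaced by an async client call against the served model, and the reward function would read whatever custom fields the rollout function chose to record.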
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.