[Feature][Responses API] Support MCP tools with streaming mode + background mode #23927
Signed-off-by: wuhang <wuhang6@huawei.com>
cc @heheda12345
heheda12345
left a comment
Thanks for your contribution. I left some comments. And can you add a test in https://github.com/vllm-project/vllm/blob/main/tests/entrypoints/openai/test_response_api_with_harmony.py?
/gemini review
Code Review
This pull request adds support for MCP tools with streaming and background modes in the Responses API. The changes involve updating the retrieve_responses API endpoint to handle streaming and introducing new logic to manage background streaming requests.
My review focuses on two main points:
- A potential deadlock in the new background stream generator due to a race condition. I've provided a suggestion to fix this.
- A memory leak in the new event_store, which could be problematic in production.
Overall, the changes look good and align with the feature goal, but these two issues should be addressed.
heheda12345
left a comment
LGTM! Thanks for the great job!
…round mode (vllm-project#23927) Signed-off-by: wuhang <wuhang6@huawei.com>
Purpose
Fixes #23295, especially for streaming responses.
Test Plan
Test Result
streaming response