[model-gateway] Add e2e tests of streaming events and tool choice for response api #13880
Summary of Changes
Hello @XinyueZhang369, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the robustness of the …
Code Review
This pull request adds valuable end-to-end tests for streaming events and the tool_choice parameter in the response API. The tests are well-structured and cover a good range of scenarios, including different backends and edge cases like mixed tool types. My main feedback is focused on improving the maintainability of the new test file test_tool_choice.py by refactoring duplicated and inconsistent tool definitions into shared constants. This will make the tests cleaner and easier to manage in the future.
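One way the suggested refactor could look is sketched below. It assumes the tests pass tools as plain dicts in the Responses API function-tool shape. The constant and helper names (GET_WEATHER_TOOL, CALCULATE_TOOL, tools_for_test, forced_tool_choice) and the schemas are hypothetical and not taken from test_tool_choice.py.

```python
import copy

# Hypothetical shared tool definitions, declared once at module level so every
# test refers to the same schema. Names and fields are illustrative only.
GET_WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

CALCULATE_TOOL = {
    "type": "function",
    "name": "calculate",
    "description": "Evaluate an arithmetic expression.",
    "parameters": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
}

ALL_FUNCTION_TOOLS = [GET_WEATHER_TOOL, CALCULATE_TOOL]


def tools_for_test(*tools):
    """Return deep copies so a test cannot mutate the shared constants."""
    return [copy.deepcopy(tool) for tool in tools]


def forced_tool_choice(tool):
    """Build a tool_choice value that forces the given function tool."""
    return {"type": "function", "name": tool["name"]}
```

A test could then call tools_for_test(*ALL_FUNCTION_TOOLS) and forced_tool_choice(GET_WEATHER_TOOL) instead of repeating the dict literals inline.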
key4ng left a comment
Overall LGTM. I noticed the CI running time increased to around 8 min. Currently, does the backend have to restart every time we add a new test class?
Sadly yes. I merged the MCP, function call, and tool choice tests into one test class to save some time.
I'm also noticing that some tests, like test_basic_function_call, can be a bit flaky. I'm thinking about adding a retry for all responses e2e tests. What do you think?
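A retry for flaky tests could be sketched roughly as below. This is a plain-Python decorator, not the mechanism used in this repo. The name retry and its parameters are hypothetical, and a pytest plugin or an existing CI helper might be a better fit.

```python
import functools
import time


def retry(times=3, delay_s=0.0, exceptions=(AssertionError,)):
    """Re-run a test function up to `times` attempts on the given exceptions."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(1, times + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions as exc:
                    last_exc = exc
                    # Pause between attempts, but not after the final one.
                    if attempt < times and delay_s:
                        time.sleep(delay_s)
            # Every attempt failed: surface the last failure to the runner.
            raise last_exc

        return wrapper

    return decorator
```

Applied as @retry(times=3) on a test such as test_basic_function_call, a failure would only be reported if all three attempts fail. The trade-off is that retries can mask real regressions, so keeping the attempt count small seems prudent.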
There is a CI workflow change. It may need @slin1237's approval.
… response api (sgl-project#13880) Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
Motivation
This PR adds more end-to-end integration test cases for the Responses API on the gRPC backend.