Conversation
There was a problem hiding this comment.
Hello @CatherineSue, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello, I'm Gemini, your friendly code assistant. I've reviewed this pull request by CatherineSue which focuses on improving the tool call detection logic, specifically for the Pythonic format detector. The primary goals are to correctly handle the tool_index when parsing Pythonic tool calls and to fix an issue where the detector failed to find tool calls if they were embedded within other text (mixed output) in non-streaming scenarios. The changes involve modifying the PythonicDetector to use regex search instead of match, allowing it to find tool calls anywhere in the input string, and adjusting how the tool_index is determined for Pythonic calls. Additionally, a helper method in the base detector class was made private.
Highlights
- Pythonic Detector Improvements: The
PythonicDetector'sdetect_and_parsemethod is updated to correctly extract tool calls even when they are surrounded by other text. This is achieved by switching fromregex.match(which only matches from the start of the string) toregex.search(which finds the pattern anywhere). - Tool Index Calculation: For Pythonic tool calls, the
tool_indexin the resultingToolCallItemis now set to the index of the call within the parsed list of calls, rather than attempting to map the function name to an index from the provided tools list. This aligns with how Pythonic tool calls are typically structured as a list of function calls. - Undefined Function Handling: A check is added in the
PythonicDetectorto log a warning if the model attempts to call a function name that is not present in the list of provided tools. - Internal Method Visibility: A helper method
ends_with_partial_tokenin the base detector class is renamed to_ends_with_partial_tokento indicate it's intended for internal use.
Changelog
Click here to see the changelog
- python/sglang/srt/function_call/base_format_detector.py
- Renamed
ends_with_partial_tokenmethod to_ends_with_partial_token(line 75). - Updated calls to the renamed method
_ends_with_partial_token(line 111).
- Renamed
- python/sglang/srt/function_call/pythonic_detector.py
- Changed
regex.matchtoregex.searchinhas_tool_callto find pattern anywhere (line 36). - Refactored
detect_and_parseto useregex.searchto find the tool call part within the text (line 42). - Added logic to extract and preserve text before and after the found tool call part (lines 44-52).
- Modified
ast.parseto parse only the extracted tool call text (line 55). - Added a check to ensure the called function name exists in the provided tools (lines 73-78).
- Updated
ToolCallItemcreation to use the call's index within the list fortool_index(line 84). - Ensured the combined normal text (before and after the tool call) is returned (line 90).
- Changed
- python/sglang/srt/function_call/qwen25_detector.py
- Updated call to the renamed method
_ends_with_partial_token(line 89).
- Updated call to the renamed method
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A list of calls,
In Python's guise,
Parsed from text,
Before your eyes.
No longer matched, but searched.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Code Review
This pull request aims to fix two main issues: the tool_index in PythonicDetector for streaming mode and the handling of mixed text/tool_call outputs in detect_and_parse for non-streaming scenarios. The changes in PythonicDetector are significant and generally move in the right direction, especially the use of regex.search for more flexible tool call detection and the updated logic for tool_index.
I've identified a critical issue regarding a potential AttributeError if the regex search doesn't find a match, and a couple of medium-severity suggestions related to error handling and test coverage. Addressing the critical issue is essential before merging.
Summary of Findings
- Potential AttributeError in PythonicDetector: A critical issue was found in
python/sglang/srt/function_call/pythonic_detector.pywherematch.start()andmatch.end()could be called on aNoneobject ifself.tool_call_regex.search(text)doesn't find a match. This needs to be addressed to prevent runtime errors. - Normal Text Handling in Error Cases: In
PythonicDetector.detect_and_parse, when a tool call is matched by regex but subsequent AST parsing or other processing fails, the method currently returns the original input string asnormal_text. It's suggested to consider returning the already-parsednormal_text(text surrounding the attempted tool call) to better reflect the parsing progress. - Test Coverage for
detect_and_parse: The PR description mentions fixing issues with mixed outputs indetect_and_parse. While new tests are added, dedicated unit tests forPythonicDetector.detect_and_parsespecifically covering various mixed-content scenarios (e.g., text before tool call, text after, text both before and after, text only, tool call only) would be beneficial to directly validate this aspect of the fix. - Method Renaming for Clarity: The renaming of
ends_with_partial_tokento_ends_with_partial_tokeninbase_format_detector.pyis a good practice, aligning with PEP 8 conventions for internal helper methods. This was a low severity finding and not commented on directly due to review settings. - Tool Index Logic: The change in
PythonicDetectorto setToolCallItem.tool_indextocall_index(the index of the call within the response's list of calls) is a clear and intentional change, aligning with the PR's goal. This was a low severity finding and not commented on directly due to review settings.
Merge Readiness
This pull request makes important improvements to tool call parsing. However, there is a critical issue identified (potential AttributeError) that must be resolved before merging. Additionally, there are a couple of medium-severity suggestions regarding error handling and test coverage that would improve the robustness and maintainability of the code. I recommend addressing these points, especially the critical one. As an AI, I am not authorized to approve pull requests; please ensure further review and approval by team members once the changes are made.
| # Not a pythonic tool call format | ||
| return StreamingParseResult(normal_text=text, calls=[]) | ||
|
|
||
| match = self.tool_call_regex.search(text) |
There was a problem hiding this comment.
The match object, resulting from self.tool_call_regex.search(text), could be None if no tool call pattern is found in the input text. However, the subsequent lines (44-45) attempt to call match.start() and match.end() without first checking if match is None. This will lead to an AttributeError if no match is found.
Could we add a check for match is None immediately after this line and handle the no-match case (e.g., by returning StreamingParseResult(normal_text=text, calls=[])) before attempting to use match?
There was a problem hiding this comment.
Fixed: added a check for match is None
…detector.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…detector.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…xed output in non-streaming (sgl-project#6678)
…xed output in non-streaming (sgl-project#6678)




Motivation
This PR fixes the above issues.
Modifications
tool_indexin theToolCallItemgenerated inPythonicDetectordetect_and_parseChecklist