Improve Qwen3CoderDetector with schema-aware parameter type conversion#13411
Improve Qwen3CoderDetector with schema-aware parameter type conversion#1341100INDEX wants to merge 1 commit into
Conversation
Summary of ChangesHello @00INDEX, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant improvement to the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a schema-aware parameter type conversion function, _convert_param_value, to replace the more simplistic _safe_val. This is a significant improvement for correctly handling parameter types from Qwen3-Coder's function call format. The new implementation is more robust, and the accompanying tests are well-designed to cover important edge cases. My review includes a couple of suggestions to further enhance the robustness and correctness of the new conversion logic, making it even more aligned with its schema-aware objective.
| else: | ||
| if ( | ||
| param_type in ["object", "array", "arr"] | ||
| or param_type.startswith("dict") | ||
| or param_type.startswith("list") | ||
| ): | ||
| try: | ||
| param_value = json.loads(param_value) | ||
| return param_value | ||
| except (json.JSONDecodeError, TypeError, ValueError): | ||
| logger.warning( | ||
| "Parsed value '%s' of parameter '%s' cannot be " | ||
| "parsed with json.loads in tool '%s', will try " | ||
| "other methods to parse it.", | ||
| param_value, | ||
| param_name, | ||
| func_name, | ||
| ) | ||
| try: | ||
| return ast.literal_eval(raw) | ||
| except Exception: | ||
| return raw | ||
| param_value = ast.literal_eval(param_value) # safer | ||
| except (ValueError, SyntaxError, TypeError): | ||
| logger.warning( | ||
| "Parsed value '%s' of parameter '%s' cannot be " | ||
| "converted via Python `ast.literal_eval()` in tool " | ||
| "'%s', degenerating to string.", | ||
| param_value, | ||
| param_name, | ||
| func_name, | ||
| ) | ||
| return param_value |
There was a problem hiding this comment.
The current logic for handling unknown parameter types falls back to using ast.literal_eval. This can lead to unexpected type conversions, which goes against the goal of this PR to be more schema-aware. For instance, a parameter with an unknown type and a string value of "123" would be incorrectly converted to an integer. It would be safer to only return the value as a string for any unrecognized types. The ast.literal_eval fallback should be reserved for types explicitly declared as object or array when json.loads fails.
elif param_type in ["object", "array", "arr"] or param_type.startswith(("dict", "list")):
try:
return json.loads(param_value)
except (json.JSONDecodeError, TypeError, ValueError):
logger.warning(
"Parsed value '%s' of parameter '%s' cannot be "
"parsed with json.loads in tool '%s', will try "
"other methods to parse it.",
param_value,
param_name,
func_name,
)
try:
return ast.literal_eval(param_value)
except (ValueError, SyntaxError, TypeError):
logger.warning(
"Parsed value '%s' of parameter '%s' cannot be "
"converted via Python `ast.literal_eval()` in tool "
"'%s', degenerating to string.",
param_value,
param_name,
func_name,
)
return param_value
else:
return param_valueThere was a problem hiding this comment.
can we take a look at this comment?
There was a problem hiding this comment.
This maintains the original fallback logic in _safe_val version: it attempts to parse using json.loads first, falls back to ast.literal_eval on failure, and returns the raw string if both attempts fail.
There was a problem hiding this comment.
Other than that, it follows the standard vLLM and Qwen3 Coder logic, so I don't think we need to add anything else.
| return ( | ||
| float_param_value | ||
| if float_param_value - int(float_param_value) != 0 | ||
| else int(float_param_value) | ||
| ) |
There was a problem hiding this comment.
The current logic to check if a float can be converted to an integer, float_param_value - int(float_param_value) != 0, works for many cases but is not the most idiomatic or robust approach. Using the float.is_integer() method is cleaner, more readable, and safer against potential floating-point precision issues.
if float_param_value.is_integer():
return int(float_param_value)
return float_param_value
CatherineSue
left a comment
There was a problem hiding this comment.
overall I think this is a great improvement, thanks!
| else: | ||
| if ( | ||
| param_type in ["object", "array", "arr"] | ||
| or param_type.startswith("dict") | ||
| or param_type.startswith("list") | ||
| ): | ||
| try: | ||
| param_value = json.loads(param_value) | ||
| return param_value | ||
| except (json.JSONDecodeError, TypeError, ValueError): | ||
| logger.warning( | ||
| "Parsed value '%s' of parameter '%s' cannot be " | ||
| "parsed with json.loads in tool '%s', will try " | ||
| "other methods to parse it.", | ||
| param_value, | ||
| param_name, | ||
| func_name, | ||
| ) | ||
| try: | ||
| return ast.literal_eval(raw) | ||
| except Exception: | ||
| return raw | ||
| param_value = ast.literal_eval(param_value) # safer | ||
| except (ValueError, SyntaxError, TypeError): | ||
| logger.warning( | ||
| "Parsed value '%s' of parameter '%s' cannot be " | ||
| "converted via Python `ast.literal_eval()` in tool " | ||
| "'%s', degenerating to string.", | ||
| param_value, | ||
| param_name, | ||
| func_name, | ||
| ) | ||
| return param_value |
There was a problem hiding this comment.
can we take a look at this comment?
|
it seems the commit history is messed up? |
Thanks for catching that. I'll sort it out. |
c05117f to
9427d99
Compare
|
Hi @CatherineSue, is this PR good to go? |
|
Hi @00INDEX, thanks for the contribution. Schema-aware parameter type conversion for Qwen3CoderDetector has since landed on main via #16744 (see |
Motivation
In the original implementation of Qwen3CoderDetector, there is a function called
_safe_valto help convert the parameter type of function calls:However, due to the unique structure of Qwen3-Coder’s function call format, it is challenging to accurately infer a parameter’s type without introducing explicit tool schemas:
This imposes an additional burden on tool design: although the parameter types are correctly defined, the inference engine often produces completely incorrect type outputs.
Modifications
Therefore, I used
_convert_param_value(adapted from vllm-project/vllm) instead of the simplistic_safe_val, introducing a utility schema that enables more precise parameter type conversion:As you can see, besides the
param_valueargument (the raw input used in_safe_val), the function now also acceptsparam_configto determine the parameter type. To make this work, I added a_tool_parameter_configsvariable in two parts of the code — a dictionary mapping each tool name to its parameter schema.Accuracy Tests
No changes to models outputs
Benchmarking and Profiling
No impacts on inference speed
Checklist