Skip to content

Improve Qwen3CoderDetector with schema-aware parameter type conversion#13411

Closed
00INDEX wants to merge 1 commit into
sgl-project:mainfrom
00INDEX:improve_qwen3_coder_detector_type_conversion
Closed

Improve Qwen3CoderDetector with schema-aware parameter type conversion#13411
00INDEX wants to merge 1 commit into
sgl-project:mainfrom
00INDEX:improve_qwen3_coder_detector_type_conversion

Conversation

@00INDEX

@00INDEX 00INDEX commented Nov 17, 2025

Copy link
Copy Markdown

Motivation

In the original implementation of Qwen3CoderDetector, there is a function called _safe_val to help convert the parameter type of function calls:

def _safe_val(raw: str) -> Any:
    raw = html.unescape(raw.strip())
    try:
        return json.loads(raw)
    except Exception:
        try:
            return ast.literal_eval(raw)
        except Exception:
            return raw

However, due to the unique structure of Qwen3-Coder’s function call format, it is challenging to accurately infer a parameter’s type without introducing explicit tool schemas:

<tool_call>
<function=Personal_Information_Registration>
<parameter=name>
Cooper Freeman     # definitely a string
</parameter>
<parameter=age>
32                 # definitely an integer
</parameter>
<parameter=zip_code>
03106              # a string, or an integer?
</parameter>
</function>
</tool_call>

This imposes an additional burden on tool design: although the parameter types are correctly defined, the inference engine often produces completely incorrect type outputs.

Modifications

Therefore, I used _convert_param_value (adapted from vllm-project/vllm) instead of the simplistic _safe_val, introducing a utility schema that enables more precise parameter type conversion:

def _convert_param_value(
    param_value: str, param_name: str, param_config: dict, func_name: str
) -> Any:
    ...

As you can see, besides the param_value argument (the raw input used in _safe_val), the function now also accepts param_config to determine the parameter type. To make this work, I added a _tool_parameter_configs variable in two parts of the code — a dictionary mapping each tool name to its parameter schema.

_tool_parameter_configs = {
    "tool_name": {
        "param1": {
            "type": "string",
            ...
        },
        "param2": {
            "type": "int",
            ...
        }
    }
}

Accuracy Tests

No changes to models outputs

Benchmarking and Profiling

No impacts on inference speed

Checklist

@gemini-code-assist

Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @00INDEX, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant improvement to the Qwen3CoderDetector by implementing schema-aware parameter type conversion. The previous method often struggled with correctly inferring parameter types from Qwen3-Coder's unique format, leading to potential misinterpretations. The new approach integrates explicit tool schemas to precisely guide the conversion of parameter values, ensuring that data types are accurately preserved as defined in the tool's specification. This enhancement boosts the reliability and correctness of function call parsing within the system.

Highlights

  • Enhanced Parameter Type Conversion: Replaced the generic _safe_val function with a more sophisticated _convert_param_value function. This new function leverages explicit schema information to accurately convert parameter values, addressing ambiguities in Qwen3-Coder's function call format.
  • Schema Integration: Introduced _tool_parameter_configs to store the schema for each tool's parameters. This configuration is passed to _convert_param_value to guide the type conversion process, ensuring values like "03106" are correctly identified as strings or integers based on the schema.
  • Improved Robustness: The new conversion logic handles various types (string, integer, float, boolean, object/array) and includes logging for cases where type conversion fails or a parameter is not defined in the schema, gracefully falling back to string representation.
  • Updated Unit Tests: Added comprehensive unit tests to verify the correct behavior of the schema-aware type conversion, specifically for cases where string parameters might contain values that could be misinterpreted as other types (e.g., "42" as a string vs. an integer).
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a schema-aware parameter type conversion function, _convert_param_value, to replace the more simplistic _safe_val. This is a significant improvement for correctly handling parameter types from Qwen3-Coder's function call format. The new implementation is more robust, and the accompanying tests are well-designed to cover important edge cases. My review includes a couple of suggestions to further enhance the robustness and correctness of the new conversion logic, making it even more aligned with its schema-aware objective.

Comment on lines +102 to +131
else:
if (
param_type in ["object", "array", "arr"]
or param_type.startswith("dict")
or param_type.startswith("list")
):
try:
param_value = json.loads(param_value)
return param_value
except (json.JSONDecodeError, TypeError, ValueError):
logger.warning(
"Parsed value '%s' of parameter '%s' cannot be "
"parsed with json.loads in tool '%s', will try "
"other methods to parse it.",
param_value,
param_name,
func_name,
)
try:
return ast.literal_eval(raw)
except Exception:
return raw
param_value = ast.literal_eval(param_value) # safer
except (ValueError, SyntaxError, TypeError):
logger.warning(
"Parsed value '%s' of parameter '%s' cannot be "
"converted via Python `ast.literal_eval()` in tool "
"'%s', degenerating to string.",
param_value,
param_name,
func_name,
)
return param_value

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current logic for handling unknown parameter types falls back to using ast.literal_eval. This can lead to unexpected type conversions, which goes against the goal of this PR to be more schema-aware. For instance, a parameter with an unknown type and a string value of "123" would be incorrectly converted to an integer. It would be safer to only return the value as a string for any unrecognized types. The ast.literal_eval fallback should be reserved for types explicitly declared as object or array when json.loads fails.

    elif param_type in ["object", "array", "arr"] or param_type.startswith(("dict", "list")):
        try:
            return json.loads(param_value)
        except (json.JSONDecodeError, TypeError, ValueError):
            logger.warning(
                "Parsed value '%s' of parameter '%s' cannot be "
                "parsed with json.loads in tool '%s', will try "
                "other methods to parse it.",
                param_value,
                param_name,
                func_name,
            )
        try:
            return ast.literal_eval(param_value)
        except (ValueError, SyntaxError, TypeError):
            logger.warning(
                "Parsed value '%s' of parameter '%s' cannot be "
                "converted via Python `ast.literal_eval()` in tool "
                "'%s', degenerating to string.",
                param_value,
                param_name,
                func_name,
            )
            return param_value
    else:
        return param_value

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we take a look at this comment?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This maintains the original fallback logic in _safe_val version: it attempts to parse using json.loads first, falls back to ast.literal_eval on failure, and returns the raw string if both attempts fail.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than that, it follows the standard vLLM and Qwen3 Coder logic, so I don't think we need to add anything else.

Comment on lines +76 to +80
return (
float_param_value
if float_param_value - int(float_param_value) != 0
else int(float_param_value)
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current logic to check if a float can be converted to an integer, float_param_value - int(float_param_value) != 0, works for many cases but is not the most idiomatic or robust approach. Using the float.is_integer() method is cleaner, more readable, and safer against potential floating-point precision issues.

            if float_param_value.is_integer():
                return int(float_param_value)
            return float_param_value

@CatherineSue CatherineSue left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall I think this is a great improvement, thanks!

Comment on lines +102 to +131
else:
if (
param_type in ["object", "array", "arr"]
or param_type.startswith("dict")
or param_type.startswith("list")
):
try:
param_value = json.loads(param_value)
return param_value
except (json.JSONDecodeError, TypeError, ValueError):
logger.warning(
"Parsed value '%s' of parameter '%s' cannot be "
"parsed with json.loads in tool '%s', will try "
"other methods to parse it.",
param_value,
param_name,
func_name,
)
try:
return ast.literal_eval(raw)
except Exception:
return raw
param_value = ast.literal_eval(param_value) # safer
except (ValueError, SyntaxError, TypeError):
logger.warning(
"Parsed value '%s' of parameter '%s' cannot be "
"converted via Python `ast.literal_eval()` in tool "
"'%s', degenerating to string.",
param_value,
param_name,
func_name,
)
return param_value

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we take a look at this comment?

@CatherineSue

Copy link
Copy Markdown
Collaborator

it seems the commit history is messed up?

@00INDEX

00INDEX commented Nov 19, 2025

Copy link
Copy Markdown
Author

it seems the commit history is messed up?

Thanks for catching that. I'll sort it out.

@00INDEX 00INDEX force-pushed the improve_qwen3_coder_detector_type_conversion branch from c05117f to 9427d99 Compare November 19, 2025 01:51
@00INDEX

00INDEX commented Nov 23, 2025

Copy link
Copy Markdown
Author

Hi @CatherineSue, is this PR good to go?

@hnyls2002

Copy link
Copy Markdown
Collaborator

Hi @00INDEX, thanks for the contribution. Schema-aware parameter type conversion for Qwen3CoderDetector has since landed on main via #16744 (see _convert_param_value in python/sglang/srt/function_call/qwen3_coder_detector.py), so this PR is superseded. Closing as part of backlog cleanup - please reopen if I've missed something.

@hnyls2002 hnyls2002 closed this Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants