fix: preserve native tool call ID in multi-turn tool calling#32768
Conversation
There was a problem hiding this comment.
Code Review
The pull request successfully addresses the issue of preserving native tool call IDs in multi-turn tool calling scenarios, particularly for models like Kimi K2. The addition of the id field to FunctionCall and its preservation in _parse_tool_calls_from_content() are crucial improvements. The .strip() calls in kimi_k2_tool_parser.py also contribute to cleaner data handling. However, there's a potential issue in chat_completion_full_generator() where make_tool_call_id() might not generate IDs in the expected format for Kimi K2 models if the original tc.id is None.
2ba5175 to
c21c48b
Compare
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
c21c48b to
f5c8c24
Compare
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
f5c8c24 to
268ff59
Compare
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
|
Note: The mypy error at line 1736 is a pre-existing issue in the main branch (line 1712 in main). This PR does not introduce any new type errors. |
daniel-salib
left a comment
There was a problem hiding this comment.
thanks for the fix! I tried the fix locally and it fixes the issue I've been having with Kimi k2 tool parsing
| elif ( | ||
| request.tool_choice | ||
| and type(request.tool_choice) is ChatCompletionNamedToolChoiceParam | ||
| ): | ||
| assert tool_calls is not None and len(tool_calls) > 0 | ||
| tool_call_class_items = [] | ||
| for tc in tool_calls: | ||
| tool_call_class_items.append( |
There was a problem hiding this comment.
would we need to apply the same pattern in responses/serving.py to support responses API?
| tool_call_class_items.append( | ||
| tool_call_class( | ||
| id=tc.id | ||
| if tc.id |
There was a problem hiding this comment.
if tc.id is an empty string do we still want o call make_tool_call_id?
There was a problem hiding this comment.
Thanks, applying the same pattern to responses/serving.py makes sense for consistency. And the if tc.id check handles empty strings correctly by falling back to make_tool_call_id.
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
qandrew
left a comment
There was a problem hiding this comment.
thanks for putting this together! can you write a unit test to preserve behavior?
I considered adding unit tests but realized the cost-benefit doesn't favor it here: |
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
7009a7a to
17c8fb7
Compare
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
17c8fb7 to
b3435a8
Compare
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
1 similar comment
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
b1cab5a to
5508de7
Compare
|
Hi @wangln19, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn>
Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Roger Wang <hey@rogerw.io>
|
@cursor review |
| tool_call_class_items.append( | ||
| tool_call_class(id=generated_id, function=tc) | ||
| ) | ||
| history_tool_call_cnt += 1 |
There was a problem hiding this comment.
Double-counting causes non-sequential tool call indices
Medium Severity
The tool call index calculation uses history_tool_call_cnt + idx where idx comes from enumerate(), but history_tool_call_cnt is also incremented inside the loop. This causes indices to skip values. For example, with 3 tool calls starting at history count 5, the indices would be 5, 7, 9 instead of the expected 5, 6, 7. For Kimi K2, this produces IDs like functions.get_weather:5, functions.get_weather:7, functions.get_weather:9 breaking the sequential indexing the model expects.
Additional Locations (2)
Co-authored-by: Isotr0py <2037008807@qq.com> Signed-off-by: Roger Wang <hey@rogerw.io>
|
@cursor review |
…oject#32768) Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Isotr0py <2037008807@qq.com>
Audited recent tool parser bug-fix PRs and found that several landed without corresponding test coverage. Added unit tests for each fix to prevent regressions. - Mistral: fast detokenization text detection (PR vllm-project#37209) - Qwen3Coder: malformed XML crash, anyOf double-encoding, speculative decode streaming (PRs vllm-project#36774, vllm-project#36032, vllm-project#35615) - DeepSeekV32: delimiter preservation with fast detokenization, skip_special_tokens adjustment (PR vllm-project#33964) - GLM-4 MoE: zero-argument tool calls, transformers 5.x delimiter handling, Unicode character preservation (PRs vllm-project#32321, vllm-project#31622, vllm-project#30920) - MiniMax M2: anyOf nullable parameter handling for non-null and null values (PR vllm-project#32342) - Step3p5: MTP-style variable-chunk and multi-token streaming (PR vllm-project#33690) - Kimi K2: native tool call ID extraction and multi-turn ID continuity (PR vllm-project#32768) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Ben Browning <bbrownin@redhat.com>
…oject#32768) Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Isotr0py <2037008807@qq.com>
…oject#32768) Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Isotr0py <2037008807@qq.com>
…oject#32768) Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Isotr0py <2037008807@qq.com>
…oject#32768) Signed-off-by: wanglinian <wanglinian@stu.pku.edu.cn> Signed-off-by: wangln19 <96399074+wangln19@users.noreply.github.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io> Co-authored-by: Isotr0py <2037008807@qq.com>
In multi-turn tool calling scenarios, models like Kimi K2 generate tool calls with specific ID formats (e.g., 'functions.get_weather:0'). The model expects to see these IDs in subsequent tool results to match them correctly.
Previously, _parse_tool_calls_from_content() was discarding the native tool call ID parsed by the tool parser, and the serving layer was generating random IDs instead. This broke multi-turn tool calling for models that rely on consistent tool call IDs.
This fix:
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.