Skip to content

[Refactor] Consolidate required/named tool_choice streaming into DelegatingParser#41876

Merged
vllm-bot merged 1 commit into
vllm-project:mainfrom
sfeng33:migrate-chat
May 7, 2026
Merged

[Refactor] Consolidate required/named tool_choice streaming into DelegatingParser#41876
vllm-bot merged 1 commit into
vllm-project:mainfrom
sfeng33:migrate-chat

Conversation

@sfeng33

@sfeng33 sfeng33 commented May 6, 2026

Copy link
Copy Markdown
Collaborator

Purpose

For chat completion api, migrate required/named tool_choice streaming to use parse_delta in the unified parser.

Test Plan

Scenario tool_choice Result
Required streaming "required" Name + id emitted once, args streamed incrementally, finish_reason="tool_calls"
Named streaming {"type":"function","function":{"name":"get_weather"}} Name + id emitted once on first tool_calls chunk, args streamed incrementally
Auto streaming (regression) "auto" Tool call streamed normally via existing auto path
Server:
vllm serve Qwen/Qwen3-0.6B \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --chat-template examples/tool_chat_template_hermes.jinja

Sample request body:
{
  "model": "Qwen/Qwen3-0.6B",
  "stream": true,
  "tool_choice": "required",
  "messages": [{"role":"user","content":"What is the weather in Paris?"}],
  "tools": [{"type":"function","function":{
    "name":"get_weather","description":"Get weather",
    "parameters":{"type":"object",
      "properties":{"city":{"type":"string","description":"city name"}},
      "required":["city"]}
  }}]
}

Signed-off-by: sfeng33 <4florafeng@gmail.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the tool call streaming logic by consolidating the handling of 'required' and 'named' tool choices into the abstract parser. It removes redundant extraction logic from the OpenAIServingChat class and ensures that the parser state is correctly initialized with tool call metadata. A potential issue was identified in how the tool call counter is incremented, as the current implementation only checks the first tool call in a delta message, which could lead to inaccuracies if multiple tool calls are initiated simultaneously.

Comment thread vllm/parser/abstract_parser.py
@sfeng33 sfeng33 marked this pull request as ready for review May 6, 2026 22:41

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@chaunceyjiang chaunceyjiang left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

One small question: why does your test command explicitly specify an additional chat template?

@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label May 7, 2026
@sfeng33

sfeng33 commented May 7, 2026

Copy link
Copy Markdown
Collaborator Author

One small question: why does your test command explicitly specify an additional chat template?

Oh it works without specifying the chat template, the agent picked it up this time randomly

@vllm-bot vllm-bot merged commit 8eb4011 into vllm-project:main May 7, 2026
53 of 56 checks passed
@sfeng33 sfeng33 deleted the migrate-chat branch May 7, 2026 16:52
alexagriffith pushed a commit to alexagriffith/vllm that referenced this pull request May 7, 2026
…gatingParser (vllm-project#41876)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: alexagriffith <agriffith96@gmail.com>
libinta pushed a commit to libinta/vllm that referenced this pull request May 8, 2026
…gatingParser (vllm-project#41876)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: Libin Tang <libin.tang@intel.com>
ExtReMLapin pushed a commit to ExtReMLapin/vllm that referenced this pull request May 11, 2026
Resolve conflict in vllm/parser/abstract_parser.py: keep both adjacent
blocks added after the tool_call streaming extraction. The reasoning
preservation block (HEAD) and the history_tool_call_cnt increment block
(main, PR vllm-project#41876) operate on orthogonal fields of delta_message/state
and are safe to apply in sequence.

Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
weifang231 pushed a commit to weifang231/eb-vllm that referenced this pull request May 13, 2026
…gatingParser (vllm-project#41876)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
…gatingParser (vllm-project#41876)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
…gatingParser (vllm-project#41876)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…gatingParser (vllm-project#41876)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
…gatingParser (vllm-project#41876)

Signed-off-by: sfeng33 <4florafeng@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

frontend ready ONLY add when PR is ready to merge/full CI is needed tool-calling

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants