
[Bugfix][Frontend] Fix Gemma4 streaming HTML duplication after tool calls #38909

Merged
chaunceyjiang merged 2 commits into vllm-project:main from yoke233:fix/gemma4-streaming-html-duplication
Apr 8, 2026

Conversation

@yoke233

@yoke233 yoke233 commented Apr 3, 2026

Contributor

Summary

Fix a Gemma4 streaming bug where buffered deltas are stitched back into current_text, which can corrupt normal text after or inside tool calls.

This showed up most clearly with HTML content in tool arguments such as write_file(content=...), where tags like <html> and <meta> could be streamed into malformed output such as <<htmlhtml or <<metameta.

Fixes #38910.

Root cause

Gemma4ToolParser.extract_tool_calls_streaming() buffered delta_text to avoid leaking partial special tokens and then rebuilt current_text using:

current_text = previous_text + delta_text

That mixes buffered output state with the original accumulated model text. When a buffered < is replayed into current_text, the parser can produce invalid intermediate states like <<div>, <<html, and <<meta. Those bad intermediate states then interact with the argument diff logic and can duplicate tag names in the final streamed tool arguments.
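
The failure mode can be reproduced with a toy model. This is only a sketch under stated assumptions, not the vLLM code: `hold_back_partial` and `buggy_views` are hypothetical names, and the rule "hold back a trailing `<`" is a simplified stand-in for whatever partial-token check the real parser uses.

```python
def hold_back_partial(pending: str, delta: str) -> tuple[str, str]:
    """Hypothetical buffering rule: withhold a trailing '<' until the next step.

    Returns (text_safe_to_emit, new_pending).
    """
    text = pending + delta
    if text.endswith("<"):
        return text[:-1], "<"
    return text, ""


def buggy_views(chunks: list[str]) -> list[str]:
    """Return the parser's view of current_text after each streaming step."""
    views = []
    previous_text = ""  # upstream accumulated model text (already contains any '<')
    pending = ""
    for chunk in chunks:
        buffered_delta, pending = hold_back_partial(pending, chunk)
        # Pattern described in the root cause: rebuild from the buffered delta.
        current_text = previous_text + buffered_delta
        views.append(current_text)
        previous_text = previous_text + chunk
    return views


print(buggy_views(["Done. <", "div>", "ok"]))
# ['Done. ', 'Done. <<div>', 'Done. <div>ok']
```

The second step shows the problem: `previous_text` already ends with `<`, and the replayed buffer contributes the same `<` again.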

Fix

Keep current_text from the upstream streaming state and use buffered delta_text only for emission. Do not reconstruct current_text from the buffered delta.
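
A minimal sketch of that separation, again using a toy model rather than the actual parser: `hold_back_partial` and `stream_fixed` are hypothetical names, and the trailing-`<` rule is an assumed simplification of the real partial-token check.

```python
def hold_back_partial(pending: str, delta: str) -> tuple[str, str]:
    """Hypothetical buffering rule: withhold a trailing '<' until the next step."""
    text = pending + delta
    if text.endswith("<"):
        return text[:-1], "<"
    return text, ""


def stream_fixed(chunks: list[str]) -> tuple[list[str], list[str]]:
    """Return (emitted_parts, current_text_views) for a stream of model chunks."""
    emitted_parts = []
    views = []
    previous_text = ""
    pending = ""
    for chunk in chunks:
        # current_text comes from the upstream state and is never rebuilt
        # from the buffered delta.
        current_text = previous_text + chunk
        # The buffered delta is used only to decide what to emit.
        to_emit, pending = hold_back_partial(pending, chunk)
        if to_emit:
            emitted_parts.append(to_emit)
        views.append(current_text)
        previous_text = current_text
    if pending:
        emitted_parts.append(pending)  # flush whatever is still held back
    return emitted_parts, views


parts, views = stream_fixed(["<", "html><", "meta>"])
print("".join(parts))  # <html><meta>
print(views)           # ['<', '<html><', '<html><meta>']
```

Emission is still delayed by the buffer, but the text the parser reasons about always matches what the model produced.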

Tests

Why this is not duplicating an existing PR:

  • I searched open PRs/issues for Gemma4 streaming duplication/buffering regressions and did not find an existing open PR covering this fix.

Validation run:

  • Targeted parser verification via uv run --no-project python -
  • Verified standard Gemma4 streaming tool-call argument parsing still produces valid JSON
  • Verified a regression case for plain text after a tool call keeps <div> intact instead of producing <<div> internally
  • Verified a regression case for HTML content inside tool arguments keeps <html> / <meta> intact instead of producing duplicated prefixes such as <<htmlhtml or <<metameta
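
The checks listed above could be expressed roughly as follows. This is an illustrative helper, not the test added in this PR: `check_streamed_arguments` is a hypothetical name, and it assumes the streamed argument fragments have already been collected as a list of strings.

```python
import json


def check_streamed_arguments(fragments: list[str], expected_content: str) -> list[str]:
    """Return a list of problems found in streamed tool-call argument fragments."""
    joined = "".join(fragments)
    try:
        args = json.loads(joined)
    except json.JSONDecodeError as exc:
        return [f"arguments are not valid JSON: {exc}"]
    if not isinstance(args, dict):
        return ["arguments are not a JSON object"]

    problems = []
    content = args.get("content")
    if content != expected_content:
        problems.append(f"content mismatch: {content!r}")
    if isinstance(content, str) and "<<" in content:
        problems.append("doubled '<' in content")
    return problems


expected = "<html><meta></html>"
good = ['{"content": "', "<html>", "<meta>", '</html>"}']
bad = ['{"content": "', "<<htmlhtml>", "<<metameta>", '</html>"}']

print(check_streamed_arguments(good, expected))       # []
print(len(check_streamed_arguments(bad, expected)))   # 2
```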

Note:

  • I also added regression coverage in tests/tool_parsers/test_gemma4_tool_parser.py.
  • I did not run the full pytest target in this workspace because the local test environment is not fully bootstrapped here.

AI disclosure

This PR includes AI-assisted code generation and analysis. I reviewed the changes, reproduced the bug path, and validated the fix myself.

@mergify mergify Bot added the tool-calling and bug (Something isn't working) labels Apr 3, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request fixes a duplication bug in the Gemma4ToolParser by removing the manual reconstruction of current_text from previous_text and buffered delta_text. This change ensures that characters are not duplicated when a tool call ends or when parsing HTML content within tool arguments. Corresponding unit tests have been added to verify the fix for both plain text following a tool call and HTML arguments. I have no feedback to provide.

@mergify

mergify Bot commented Apr 5, 2026

Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @yoke233.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@epheien

epheien commented Apr 6, 2026


@yoke233 I ran a quick test of this PR, and it does solve the problem. You need to resolve the merge conflicts before it can be merged.

@sfeng33 sfeng33 left a comment

Collaborator

Thank you!

@sfeng33

sfeng33 commented Apr 6, 2026

Collaborator

Can you rebase and resolve the conflict?

…alls

Stop rebuilding current_text from buffered streaming deltas in the Gemma4 tool parser and add regression coverage for plain text and HTML content after tool calls.

Co-authored-by: OpenAI Codex
Signed-off-by: yoke233 <yoke2012@gmail.com>
@yoke233 yoke233 force-pushed the fix/gemma4-streaming-html-duplication branch from 37c7891 to 0cd2c0a Compare April 7, 2026 02:27
@yoke233

yoke233 commented Apr 7, 2026

Contributor Author

Rebased onto the latest main and resolved the merge conflict.

I kept this PR scoped to the original #38909 fix:

  • preserve upstream current_text in Gemma4ToolParser
  • keep the regression tests for plain text / HTML duplication
  • retain the already-landed split-delimiter regression coverage from main

I did not fold in other related fixes here.

@mergify mergify Bot removed the needs-rebase label Apr 7, 2026
@mergify

mergify Bot commented Apr 7, 2026

Contributor

Hi @yoke233, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

Signed-off-by: yoke233 <yoke2012@gmail.com>

@chaunceyjiang chaunceyjiang left a comment

Collaborator

LGTM.

/cc @lucianommartins PTAL.

@lucianommartins

Contributor

Thanks @chaunceyjiang. LGTM; if all tests pass, it is good to go. Thanks @yoke233!

@sfeng33

sfeng33 commented Apr 8, 2026

Collaborator

@chaunceyjiang Can we get this merged, please?

@chaunceyjiang chaunceyjiang merged commit d734445 into vllm-project:main Apr 8, 2026
47 checks passed
mtparet pushed a commit to blackfuel-ai/vllm that referenced this pull request Apr 9, 2026
khluu pushed a commit that referenced this pull request Apr 10, 2026
…alls (#38909)

Signed-off-by: yoke233 <yoke2012@gmail.com>
(cherry picked from commit d734445)
khluu pushed a commit that referenced this pull request Apr 16, 2026
…alls (#38909)

Signed-off-by: yoke233 <yoke2012@gmail.com>
(cherry picked from commit d734445)
(cherry picked from commit bae948a)
greg1232 pushed a commit to supermassive-intelligence/vllm-fork that referenced this pull request Apr 22, 2026
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
…alls (vllm-project#38909)

Signed-off-by: yoke233 <yoke2012@gmail.com>
(cherry picked from commit 1cddaca)
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
…alls (vllm-project#38909)

Signed-off-by: yoke233 <yoke2012@gmail.com>
(cherry picked from commit 1cddaca)
(cherry picked from commit b6e6392)
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
…alls (vllm-project#38909)

Signed-off-by: yoke233 <yoke2012@gmail.com>
(cherry picked from commit 87efc0f)
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
…alls (vllm-project#38909)

Signed-off-by: yoke233 <yoke2012@gmail.com>
(cherry picked from commit 87efc0f)
(cherry picked from commit d47d6c6)
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…alls (vllm-project#38909)

Signed-off-by: yoke233 <yoke2012@gmail.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>

Labels

  • bug: Something isn't working
  • ready: ONLY add when PR is ready to merge/full CI is needed
  • tool-calling

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: Gemma4 tool parser duplicates HTML tag prefixes in streamed tool arguments

6 participants