fix(llm): strip dangling XML tool call closing tags from text content #27984

Open
sghaskell wants to merge 1 commit into anomalyco:dev from sghaskell:fix/strip-dangling-xml-tags

Conversation

@sghaskell

sghaskell commented May 17, 2026

Issue for this PR

Closes #24316

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Qwen3 via vLLM/llama.cpp with the hermes tool call parser occasionally appends raw XML tool call closing tags (</parameter>, </function>, </parameter, ) as a response termination artifact after natural language text. These leak into the assistant message content and appear as naked XML in the TUI, halting progress.

This adds a stripDanglingXmlArtifacts() helper that strips trailing XML artifacts from streaming text deltas at the step() level in openai-chat.ts before content reaches the consumer. Only trailing artifacts are stripped; mid-text occurrences are preserved.

I understand why this works: the artifacts always appear at the end of a streaming delta chunk, so iterating and stripping trailing matches is sufficient without risking damage to legitimate content.
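The PR's diff is not shown in this thread, so the following is only a minimal sketch of how a trailing-only strip along these lines could look. The regular expression, the loop, and the exact tag list are assumptions based on the description above (`</parameter>`, `</function>`, and the unterminated `</parameter`); only the function name comes from the PR text.

```typescript
// Hypothetical sketch, not the PR's actual implementation.
// Matches ONE trailing artifact: optional whitespace, a closing
// </parameter> or </function> tag (the final ">" may be missing),
// optional whitespace, then end of string.
const TRAILING_ARTIFACT = /\s*<\/(?:parameter|function)>?\s*$/;

function stripDanglingXmlArtifacts(text: string): string {
  let result = text;
  // Iterate so stacked artifacts such as "</parameter></function>"
  // are removed one at a time, starting from the end.
  while (TRAILING_ARTIFACT.test(result)) {
    result = result.replace(TRAILING_ARTIFACT, "");
  }
  return result;
}

// Trailing artifacts are removed.
console.log(stripDanglingXmlArtifacts("Done.\n</parameter></function>")); // "Done."
// An unterminated trailing tag is removed as well.
console.log(stripDanglingXmlArtifacts("Done.</parameter")); // "Done."
// Mid-text occurrences are left untouched.
console.log(stripDanglingXmlArtifacts("Use </function> to close the block."));
```

Because the pattern is anchored with `$`, a tag that appears in the middle of legitimate content never matches. One limitation of this sketch: it works on a single string, so a tag that happens to be split across two streaming deltas would not be caught unless the caller buffers the tail of the previous delta.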

How did you verify your code works?

  • bun test in packages/llm — 16/16 pass
  • bun typecheck — clean
  • Stress tested with multi-tool-call prompts (5-10 iterations each) against Qwen3/vLLM — no XML leakage observed

Screenshots / recordings

N/A — this is a backend streaming fix.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

github-actions bot added and then removed the needs:compliance label (label description: "This means the issue will auto-close after 2 hours.") on May 17, 2026
@github-actions

Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@sghaskell

Author

PR updated to follow the required template. Body now includes proper issue reference, checkboxes, and explanation of why the fix works.

@nunodonato

THANK YOU

@pebeto

pebeto commented May 21, 2026

Tested against a highly complex web-based project, with 103,037 tokens used on Qwen 3.6 27B NVFP4, without any pause. Thanks for the fix.

@danielcherubini

Works. Please, opencode team, pull this in.

@amahteletrac

This did not fix the issue for me:
[image attachment]

@pebeto

pebeto commented Jun 3, 2026

After some testing, I'm reaching the same point as @amahteletrac. I'm proposing a proper working solution in #30633.

@amahteletrac

I am not able to reproduce with gemma4:31b-nvfp4 by the way, which makes me think there is no "proper solution" here. It's caused by the model.

@amahteletrac

A tad slower, but in my personal experience an incredibly capable local model.

Development

Successfully merging this pull request may close these issues.

Progress halts with qwen 3.6 35b-a3b with naked tool call in the console

5 participants