Name and Version
Current head in repo
d8794ee (HEAD -> master, origin/master, origin/HEAD) examples: refactor diffusion generation (#22590)
./build/bin/llama-server --version
ggml_cuda_init: found 1 CUDA devices (Total VRAM: 32109 MiB):
Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes, VRAM: 32109 MiB
version: 9022 (d8794ee)
built with GNU 13.3.0 for Linux x86_64
Operating systems
Linux
GGML backends
CUDA
Hardware
GPU : GeForce RTX 5090
CPU : AMD EPYC 9J14
Models
Qwen3.5-35B-A3B-Q6_K.gguf (https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF)
Problem description & steps to reproduce
When Qwen models are used with GitHub Copilot, Copilot sometimes fails to process the response.
Recent PRs fixed quite a few issues, but this one still occurs. A few sample requests from a network capture are attached.
Common pattern seems to be:
The model generates a Qwen-style <tool_call>...</tool_call> block, but it is streamed inside delta.reasoning_content instead of being converted to delta.tool_calls.
- the stream ends with data: [DONE]
- delta.content is empty
- delta.tool_calls is absent
- delta.reasoning_content contains a complete <tool_call>...</tool_call>
- the final finish_reason is "stop"
Expected response:
- structured delta.tool_calls
- the final finish_reason should be "tool_calls"
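For reference, here is a sketch of what the expected chunks might look like, assuming the server follows the usual OpenAI-compatible streaming layout for delta.tool_calls. The id value "call_0" is made up for illustration, and the arguments are taken from the read_file call in the sample output. Nothing here is captured from a real run.

```python
import json

# Hypothetical illustration: the id is invented, and the arguments mirror
# the read_file call seen in the sample output of this report.
tool_call_chunk = {
    "choices": [{
        "finish_reason": None,
        "index": 0,
        "delta": {
            "tool_calls": [{
                "index": 0,
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "read_file",
                    # In the OpenAI-style format, arguments is a JSON-encoded string.
                    "arguments": json.dumps({
                        "filePath": "/tmp/example-workspace/example-project/backend/requirements.txt"
                    }),
                },
            }]
        },
    }]
}

# The last chunk carries an empty delta and the "tool_calls" finish reason.
final_chunk = {
    "choices": [{"finish_reason": "tool_calls", "index": 0, "delta": {}}]
}

for chunk in (tool_call_chunk, final_chunk):
    print("data: " + json.dumps(chunk))
print("data: [DONE]")
```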
row_1142_read_file_requirements.request.sanitized.json
row_1230_run_terminal_backend.request.sanitized.json
row_1256_read_file_logger_after_compaction.request.sanitized.json
First Bad Commit
Does not seem to be a regression
Relevant log output
No errors in the logs. Sample output that causes Copilot to fail:
data: {"choices":[{"finish_reason":null,"index":0,"delta":{"reasoning_content":"<tool_call>"}}],...}
data: {"choices":[{"finish_reason":null,"index":0,"delta":{"reasoning_content":"<function=read_file>"}}],...}
data: {"choices":[{"finish_reason":null,"index":0,"delta":{"reasoning_content":"<parameter=filePath>"}}],...}
data: {"choices":[{"finish_reason":null,"index":0,"delta":{"reasoning_content":"/tmp/example-workspace/example-project/backend/requirements.txt"}}],...}
data: {"choices":[{"finish_reason":null,"index":0,"delta":{"reasoning_content":"</tool_call>"}}],...}
data: {"choices":[{"finish_reason":"stop","index":0,"delta":{}}],...}
data: [DONE]
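The pattern can be checked mechanically on a captured stream. Below is a minimal sketch, not part of llama.cpp: check_stream and make_chunk are hypothetical helpers, and the sample chunks mirror the output above with the elided fields dropped.

```python
import json


def check_stream(sse_lines):
    """Summarize one streamed chat completion given its raw SSE lines."""
    reasoning, content = [], []
    saw_tool_calls = False
    finish_reason = None
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            reasoning.append(delta.get("reasoning_content") or "")
            content.append(delta.get("content") or "")
            if delta.get("tool_calls"):
                saw_tool_calls = True
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    reasoning_text = "".join(reasoning)
    # The failure described here: a complete <tool_call> block in the
    # reasoning text while no structured tool call was ever emitted.
    leaked = (
        "<tool_call>" in reasoning_text
        and "</tool_call>" in reasoning_text
        and not saw_tool_calls
    )
    return {
        "leaked_tool_call": leaked,
        "saw_tool_calls": saw_tool_calls,
        "content": "".join(content),
        "finish_reason": finish_reason,
    }


def make_chunk(delta, finish_reason=None):
    """Build one SSE data line shaped like the sample output (hypothetical helper)."""
    body = {"choices": [{"finish_reason": finish_reason, "index": 0, "delta": delta}]}
    return "data: " + json.dumps(body)


pieces = [
    "<tool_call>",
    "<function=read_file>",
    "<parameter=filePath>",
    "/tmp/example-workspace/example-project/backend/requirements.txt",
    "</tool_call>",
]
sample = [make_chunk({"reasoning_content": p}) for p in pieces]
sample.append(make_chunk({}, "stop"))
sample.append("data: [DONE]")

result = check_stream(sample)
print(result)
```

On the sample above, the summary should flag a leaked tool call with finish_reason "stop". A correctly parsed response would instead report saw_tool_calls as true and finish_reason "tool_calls".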