Skip to content

fix: don't drop tool call when finish_reason shares the chunk#2992

Merged
dgageot merged 1 commit into
mainfrom
fix/streaming-toolcall-stop-same-chunk
Jun 3, 2026
Merged

fix: don't drop tool call when finish_reason shares the chunk#2992
dgageot merged 1 commit into
mainfrom
fix/streaming-toolcall-stop-same-chunk

Conversation

@Sayt-0

@Sayt-0 Sayt-0 commented Jun 3, 2026

Copy link
Copy Markdown
Member

Fixes #1544

Problem

With OpenAI-compatible proxies like LiteLLM (e.g. in front of Gemini), a subagent's valid tool_calls response is dropped and the host agent reports "no response from subagent" without executing the tool.

Root cause

handleStream (pkg/runtime/streaming.go) checked the terminal finish_reason before accumulating the current chunk's tool call deltas. The OpenAI-native protocol sends them in separate chunks, but LiteLLM/Gemini packs the tool call and the terminal finish_reason into a single chunk. The early return then saw len(toolCalls) == 0 and dropped the call, leaving an empty assistant message.

Fix

Accumulate tool call deltas before evaluating the finish reason; only short-circuit the iteration when the chunk has no terminal finish reason.

Tests

  • TestHandleStream_ToolCallAndStopInSameChunk — regression test (red before, green after).
  • TestHandleStream_ToolCallThenSeparateStop — guards the OpenAI-native path.

Some OpenAI-compatible providers (e.g. LiteLLM in front of Gemini) pack a
complete tool call and a terminal finish_reason ("stop") into the same
streaming chunk. handleStream evaluated the finish reason before accumulating
the chunk's tool call deltas, so the early return dropped the call, surfacing
upstream as "no response from subagent".

Accumulate tool call deltas before evaluating the finish reason; only
short-circuit the iteration when the chunk carries no terminal finish reason.

Fixes #1544
@Sayt-0 Sayt-0 requested a review from a team as a code owner June 3, 2026 14:06

@docker-agent docker-agent left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

The fix is logically correct: accumulating tool-call deltas before evaluating finish_reason directly addresses the LiteLLM/Gemini same-chunk bug. The two new tests cover both the combined-chunk (regression) and the OpenAI-native (guard) paths. One low-severity edge case is noted inline.

Comment thread pkg/runtime/streaming.go
// Short-circuit only when the chunk carried nothing but tool call
// deltas. If it also carries a terminal finish reason, fall through
// so the early return below sees the freshly accumulated call.
if choice.FinishReason == "" {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Unrecognized finish_reason with tool-call deltas silently falls through

When a provider sends a non-empty finish_reason that is neither stop, length, nor tool_calls (e.g. "content_filter", "max_tokens") in the same chunk as tool-call deltas, the new control flow neither continues (because FinishReason != "") nor hits the early return (because it's not stop/length). The chunk falls through to the providerFinishReason tracking block and then continues into the content-append branches, after which the loop goes on to the next iteration. At end-of-stream the post-loop logic picks up providerFinishReason = "content_filter", which the switch doesn't recognise, likely yielding an incorrect or empty finish reason. The accumulated tool calls would also be lost because the post-loop path for content_filter doesn't inject FinishReasonToolCalls.

This is a low-probability edge case (requires an exotic finish reason and a same-chunk tool call), but the safe pattern would be an else if or a return/continue after the FinishReasonStop/FinishReasonLength block to prevent fall-through for any other non-empty finish reason when tool calls are present.

@dgageot dgageot merged commit fefa34b into main Jun 3, 2026
13 checks passed
@Sayt-0 Sayt-0 deleted the fix/streaming-toolcall-stop-same-chunk branch June 3, 2026 16:32
@aheritier aheritier added area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) kind/fix PR fixes a bug (maps to fix: commit prefix) labels Jun 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) kind/fix PR fixes a bug (maps to fix: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Subagent tool_calls not handled when using LiteLLM proxy

4 participants