fix: don't drop tool call when finish_reason shares the chunk#2992
Conversation
Some OpenAI-compatible providers (e.g. LiteLLM in front of Gemini) pack a
complete tool call and a terminal finish_reason ("stop") into the same
streaming chunk. handleStream evaluated the finish reason before accumulating
the chunk's tool call deltas, so the early return dropped the call, surfacing
upstream as "no response from subagent".
Accumulate tool call deltas before evaluating the finish reason; only
short-circuit the iteration when the chunk carries no terminal finish reason.
Fixes #1544
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
The fix is logically correct: accumulating tool-call deltas before evaluating finish_reason directly addresses the LiteLLM/Gemini same-chunk bug. The two new tests cover both the combined-chunk (regression) and the OpenAI-native (guard) paths. One low-severity edge case is noted inline.
| // Short-circuit only when the chunk carried nothing but tool call | ||
| // deltas. If it also carries a terminal finish reason, fall through | ||
| // so the early return below sees the freshly accumulated call. | ||
| if choice.FinishReason == "" { |
There was a problem hiding this comment.
[LOW] Unrecognized finish_reason with tool-call deltas silently falls through
When a provider sends a non-empty finish_reason that is neither stop, length, nor tool_calls (e.g. "content_filter", "max_tokens") in the same chunk as tool-call deltas, the new control flow neither continues (because FinishReason != "") nor hits the early return (because it's not stop/length). The chunk falls through to the providerFinishReason tracking block and then continues into the content-append branches, after which the loop goes on to the next iteration. At end-of-stream the post-loop logic picks up providerFinishReason = "content_filter", which the switch doesn't recognise, likely yielding an incorrect or empty finish reason. The accumulated tool calls would also be lost because the post-loop path for content_filter doesn't inject FinishReasonToolCalls.
This is a low-probability edge case (requires an exotic finish reason and a same-chunk tool call), but the safe pattern would be an else if or a return/continue after the FinishReasonStop/FinishReasonLength block to prevent fall-through for any other non-empty finish reason when tool calls are present.
Fixes #1544
Problem
With OpenAI-compatible proxies like LiteLLM (e.g. in front of Gemini), a subagent's valid
tool_callsresponse is dropped and the host agent reports "no response from subagent" without executing the tool.Root cause
handleStream(pkg/runtime/streaming.go) checked the terminalfinish_reasonbefore accumulating the current chunk's tool call deltas. The OpenAI-native protocol sends them in separate chunks, but LiteLLM/Gemini packs the tool call and the terminalfinish_reasoninto a single chunk. The early return then sawlen(toolCalls) == 0and dropped the call, leaving an empty assistant message.Fix
Accumulate tool call deltas before evaluating the finish reason; only short-circuit the iteration when the chunk has no terminal finish reason.
Tests
TestHandleStream_ToolCallAndStopInSameChunk— regression test (red before, green after).TestHandleStream_ToolCallThenSeparateStop— guards the OpenAI-native path.