fix: prevent tool call loss from late-arriving names and duplicate finish chunks#2404
fix: prevent tool call loss from late-arriving names and duplicate finish chunks#2404drewd789 wants to merge 2 commits into
Conversation
e7bf0c4 to
55dd9ac
Compare
|
Thanks for the contribution! |
|
@drewd789 Thanks for your contribution! Regarding the issue #2402 , we favor PR #2403 as it is a targeted fix and we can easily verify. Though we agree that the changes in this implementation is more complete with better test coverage. As to other changes, we need an issue so that we can reproduce it to verify. Again, we are hesitate to take any PR targeting core module without a reproducible issue, for stability consideration. Thanks for your understanding. |
Tests for all three tool call loss fixes: **streamingToolCallParser.ts:** - should keep name at correct index when it arrives after JSON - should not reassign index when name arrives after JSON - should handle multiple tool calls with late names **converter.ts:** - should emit tool call immediately when JSON completes during streaming - should not emit duplicate tool calls at finish_reason - should handle multiple finish_reason chunks without duplicates **pipeline.ts:** - should skip finish chunk without tool calls when pending finish has tool calls These tests verify the fixes for: 1. Parser index reassignment when name arrives late 2. Late emission allowing parser state changes to lose tool calls 3. Finish chunk overwriting when API sends finish_reason twice
…nish chunks Fixes three complementary bugs that cause tool calls to disappear during streaming: **streamingToolCallParser.ts:** - Check meta.name when determining if tool call is complete - Prevents index reassignment when name arrives after JSON completion **converter.ts:** - Emit tool calls immediately during streaming when complete + has name - Track emitted tool call IDs in emittedToolCallIds Set - Skip already-emitted IDs at finish_reason to prevent duplicates - Clear emittedToolCallIds in resetStreamingToolCalls() **pipeline.ts:** - Skip finish chunks without tool calls when pending finish already has them - Prevents second finish chunk from overwriting first chunk's tool calls Root causes: 1. Parser considered tool calls "complete" without checking for meta.name 2. Converter only emitted at finish_reason, allowing parser state changes to lose tool calls 3. Pipeline unconditionally stored every finish chunk, overwriting previous ones
55dd9ac to
59048af
Compare
| openaiResponse: OpenAI.Chat.ChatCompletion, | ||
| ): GenerateContentResponse { | ||
| const choice = openaiResponse.choices?.[0]; | ||
| const choice = openaiResponse.choices[0]; |
There was a problem hiding this comment.
[Critical] convertOpenAIResponseToGemini() now unconditionally dereferences openaiResponse.choices[0]. The previous code handled an empty/missing first choice and returned an empty Gemini response, so an OpenAI-compatible response with choices: [] will now throw before usage metadata can be converted.
| const choice = openaiResponse.choices[0]; | |
| const choice = openaiResponse.choices?.[0]; | |
| const response = new GenerateContentResponse(); | |
| if (!choice) { | |
| response.candidates = []; | |
| response.responseId = openaiResponse.id; | |
| response.createTime = openaiResponse.created | |
| ? openaiResponse.created.toString() | |
| : new Date().getTime().toString(); | |
| response.modelVersion = this.model; | |
| response.promptFeedback = { safetyRatings: [] }; | |
| return response; | |
| } |
— gpt-5.5 via Qwen Code /review
| return false; | ||
| } | ||
| collectedGeminiResponses.push(response); | ||
| setPendingFinish(response); |
There was a problem hiding this comment.
[Critical] This branch preserves the earlier finish chunk with tool calls, but it also drops response.usageMetadata from the later duplicate finish chunk. That breaks providers that send tool calls on the first finish chunk and token usage on a later finish/usage chunk.
| setPendingFinish(response); | |
| if (hasPendingFinish && pendingToolCallCount > 0) { | |
| if (response.usageMetadata) { | |
| lastCollected.usageMetadata = response.usageMetadata; | |
| setPendingFinish(lastCollected); | |
| } | |
| return false; | |
| } |
— gpt-5.5 via Qwen Code /review
| if (meta.name) { | ||
| const id = | ||
| meta.id || | ||
| `call_${Date.now()}_${Math.random().toString(36).substring(2, 9)}`; |
There was a problem hiding this comment.
[Critical] addChunk() can remap the provider-supplied index to a different internal parser index, but early emission still reads metadata with the original index. If a provider reuses index 0, this can pair the new arguments with stale metadata, skip the completed call, or emit it again at finish. Please have addChunk() return the resolved internal index or metadata, then use that same key for early emission and duplicate tracking.
— gpt-5.5 via Qwen Code /review
| // Skip if already emitted during streaming (prevents duplicates) | ||
| if (toolCall.id && this.emittedToolCallIds.has(toolCall.id)) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
[Critical] Duplicate suppression only tracks provider IDs, but tool-call IDs are optional in this path. For an id-less call emitted early, no stable emitted key is recorded; when finish_reason arrives, getCompletedToolCalls() can emit the same parser entry again with a different synthetic ID. Track emitted parser/internal indices in addition to provider IDs, or expose a stable parser key for each completed call.
— gpt-5.5 via Qwen Code /review
| if (!buffer.trim() || depth > 0 || !meta?.id) { | ||
| // Tool call is incomplete if: JSON open, buffer empty, missing ID, or missing name | ||
| // Any condition true → index available for reuse | ||
| if (!buffer.trim() || depth > 0 || !meta?.id || !meta.name) { |
There was a problem hiding this comment.
[Critical] Treating a slot with complete JSON and an ID as reusable just because the function name has not arrived yet can corrupt multi-tool streams. If call_1 sends complete arguments before its name, then call_2 starts on the same provider index, findNextAvailableIndex() can return the occupied slot and append/overwrite the first call’s state. Keep such slots occupied for collision allocation; late-name routing should be handled separately.
| if (!buffer.trim() || depth > 0 || !meta?.id || !meta.name) { | |
| if (!buffer.trim() || depth > 0 || !meta?.id) { | |
| return this.nextAvailableIndex; | |
| } |
— gpt-5.5 via Qwen Code /review
Summary
Fixes three complementary bugs that cause tool calls to disappear during streaming:
Root Causes
Bug 1: Parser Index Reassignment (
streamingToolCallParser.ts)findMostRecentIncompleteIndex()andfindNextAvailableIndex()considered tool calls "complete" based on JSON state alone (depth=0, non-empty buffer, has ID), without checking ifmeta.nameexists. When name arrived in a later chunk, parser reassigned it to a new index, leaving JSON at the old index →getCompletedToolCalls()returned empty.Bug 2: Late Emission (
converter.ts)Tool calls were only emitted at
finish_reason, not during streaming. This allowed parser state changes between JSON completion and name arrival to cause tool call loss.Bug 3: Finish Chunk Overwriting (
pipeline.ts)handleChunkMerging()unconditionally stored every finish chunk. Some providers send finish_reason twice (first with tool calls, second with usage metadata only). Second chunk overwrote the first → tool calls lost.Changes
streamingToolCallParser.tsmeta.namewhen determining if tool call is complete infindMostRecentIncompleteIndex()meta.namewhen determining if index is available infindNextAvailableIndex()converter.tsresult.complete && result.value && meta.nameemittedToolCallIdsSetfinish_reasonto prevent duplicatesemittedToolCallIdsinresetStreamingToolCalls()pipeline.tsTesting
Unit tests added for all three fixes:
Parser tests:
Converter tests:
Pipeline tests:
All tests pass, confirming the fixes work correctly.
Related
These fixes work together for defense-in-depth:
Complements existing streaming handling. Works with providers that send chunks in non-standard order or send finish_reason multiple times.