fix: stream tool call arguments incrementally in Response API #13506
@nekomeowww - This is a backend API fix for Response API streaming (tool call arguments). Please take a look.
Codecov Report
All modified and coverable lines are covered by tests.

Coverage Diff

| Metric | canary | #13506 | +/- |
| --- | --- | --- | --- |
| Coverage | 66.43% | 66.43% | |
| Files | 1976 | 1976 | |
| Lines | 163601 | 163601 | |
| Branches | 18709 | 19473 | +764 |
| Hits | 108695 | 108695 | |
| Misses | 54784 | 54784 | |
| Partials | 122 | 122 | |
Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 179a8eae8e
The tool_calling stream chunks contain accumulated arguments (not deltas), but the Response API was treating each chunk as a complete independent output_item, creating a new lifecycle (added → delta → done) per token and incrementing output_index to 90+.

Fix: track active tool calls by call_id and compute true incremental deltas by slicing off previously-seen content. Each tool call now gets a single stable output_item with proper streaming deltas, finalized only when the stream ends or tool execution begins.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When call_llm retries after a failed attempt, activeToolCalls may contain entries from the failed stream that never received a tool_end. Without clearing, finishActiveToolCalls would emit phantom function_call done events and misalign output_index for the successful attempt. Reset the map on stream_retry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
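The two commit messages above can be read together as a small state machine. The following is a minimal sketch of that idea, not the code from this PR: the class name `ToolCallDeltaTracker`, the method names, the `fc_` id prefix, and the flat `StreamEvent` shape are all hypothetical, and the event type strings use the abbreviated names from the PR description rather than the full wire names. Only the map keyed by call id, the slice-based delta, and the reset on retry come from the text.

```typescript
// Hypothetical event shape for illustration; the real Response API events carry more fields.
interface StreamEvent {
  type: string;
  itemId: string;
  outputIndex: number;
  delta?: string;
  args?: string;
}

// Per-call state, mirroring the Map value described in the PR.
interface ActiveToolCall {
  fcItemId: string;
  outputIndex: number;
  prevArguments: string;
}

class ToolCallDeltaTracker {
  private activeToolCalls = new Map<string, ActiveToolCall>();
  private nextOutputIndex = 0;

  // Called with the accumulated argument string for a call, not a delta.
  onChunk(callId: string, accumulatedArgs: string): StreamEvent[] {
    const events: StreamEvent[] = [];
    let call = this.activeToolCalls.get(callId);
    if (!call) {
      // First chunk for this call: allocate one stable item and index.
      call = { fcItemId: `fc_${callId}`, outputIndex: this.nextOutputIndex++, prevArguments: '' };
      this.activeToolCalls.set(callId, call);
      events.push({ type: 'output_item.added', itemId: call.fcItemId, outputIndex: call.outputIndex });
    }
    // True delta: whatever follows the previously seen prefix.
    const delta = accumulatedArgs.slice(call.prevArguments.length);
    if (delta.length > 0) {
      events.push({ type: 'arguments.delta', itemId: call.fcItemId, outputIndex: call.outputIndex, delta });
      call.prevArguments = accumulatedArgs;
    }
    return events;
  }

  // Finalize one call (e.g. on tool_end) or every active call (stream end).
  finish(callId?: string): StreamEvent[] {
    const ids = callId === undefined ? Array.from(this.activeToolCalls.keys()) : [callId];
    const events: StreamEvent[] = [];
    for (const id of ids) {
      const call = this.activeToolCalls.get(id);
      if (!call) continue;
      events.push({ type: 'arguments.done', itemId: call.fcItemId, outputIndex: call.outputIndex, args: call.prevArguments });
      events.push({ type: 'output_item.done', itemId: call.fcItemId, outputIndex: call.outputIndex });
      this.activeToolCalls.delete(id);
    }
    return events;
  }

  // On stream_retry: drop entries left over from the failed attempt.
  // Assumption: only the map is cleared, as the commit message states; how
  // output_index is reconciled after a retry is not described in the source.
  resetForRetry(): void {
    this.activeToolCalls.clear();
  }
}
```

Feeding `'{"q":'` and then `'{"q":"hi"}'` for the same call id yields one `output_item.added`, a first delta of `'{"q":'`, and a second delta of `'"hi"}'`, so the item keeps a single `outputIndex` throughout.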
Force-pushed from 216a2d3 to eef3e86.
# release: 20260407

This release includes **148 commits**. Key updates are below.

- **Response API tool execution is more capable and reliable**: Added hosted builtin tools + client-side function tools and improved tool-call streaming/completion behavior. [#13406](#13406) [#13414](#13414) [#13506](#13506) [#13555](#13555)
- **Input and composition UX upgraded**: Added AI input auto-completion and multiple chat-input stability fixes. [#13458](#13458) [#13551](#13551) [#13481](#13481)
- **Model/provider compatibility improved**: Better Gemini/Google tool schema handling and additional model updates. [#13429](#13429) [#13465](#13465) [#13613](#13613)
- **Desktop and CLI reliability improved**: Gateway WebSocket support and desktop runtime upgrades. [#13608](#13608) [#13550](#13550) [#13557](#13557)
- **Security hardening continued**: Fixed auth and sanitization risks and upgraded vulnerable dependencies. [#13535](#13535) [#13529](#13529) [#13479](#13479)

### Models & Providers

- Added/updated support for `glm-5v-turbo`, GLM-5.1 updates, and qwen3.5-omni series. [#13487](#13487) [#13405](#13405) [#13422](#13422)
- Added additional ImageGen providers/models (Wanxiang 2.7 and Keling from Qwen). [#13478](#13478)
- Improved Gemini/Google tool schema and compatibility handling across runtime paths. [#13429](#13429) [#13465](#13465) [#13613](#13613)

### Response API & Runtime

- Added hosted builtin tools in Response API and client-side function tool execution support. [#13406](#13406) [#13414](#13414)
- Improved stream tool-call argument handling and `response.completed` output correctness. [#13506](#13506) [#13555](#13555)
- Improved runtime error/context handling for intervention and provider edge cases. [#13420](#13420) [#13607](#13607)

### Desktop App

- Bumped desktop dependencies and runtime integrations (`agent-browser`, `electron`). [#13550](#13550) [#13557](#13557)
- Simplified desktop release channel setup by removing nightly release flow. [#13480](#13480)

### CLI

- Added OpenClaw migration command. [#13566](#13566)
- Added local device binding support for `lh agent run`. [#13277](#13277)
- Added WebSocket gateway support and reconnect reliability improvements. [#13608](#13608) [#13418](#13418)

### Security

- Removed risky `apiKey` fallback behavior in webapi auth path to prevent bypass risk. [#13535](#13535)
- Sanitized HTML artifact rendering and iframe sandboxing to reduce XSS-to-RCE risk. [#13529](#13529)
- Upgraded nodemailer to v8 to address SMTP command injection advisory. [#13479](#13479)

### Bug Fixes

- Fixed image generation model default switch issues. [#13587](#13587)
- Fixed subtopic re-fork message scope behavior and agent panel reset edge cases. [#13606](#13606) [#13556](#13556)
- Fixed chat-input freeze on paste and mention plugin behavior. [#13551](#13551) [#13415](#13415)
- Fixed auth/social sign-in and settings UX edge cases. [#13368](#13368) [#13392](#13392) [#13338](#13338)

### Credits

Huge thanks to these contributors: @chriszf @hardy-one @Innei @lijian @neko @OctopusNote @rdmclin2 @rivertwilight @RylanCai @suyua9 @sxjeru @Tsuki @wangyk @WindSpiritSR @yizhuo @YuTengjing @hezhijie0327 @arvinxx
Change Type
Related Issue
Description of Change
Fix Response API streaming for tool calls. The internal `tools_calling` stream chunks contain accumulated arguments (the full string up to that point), but the Response API was treating each chunk as a complete independent `output_item`, emitting a full lifecycle (added → delta → done → item.done) per token delta and incrementing `output_index` from 0 to 90+.

Before: Each token delta creates a new output_item:

After: Single stable output_item with true incremental deltas:
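Implementation:

- Track active tool calls in a `Map<callId, {fcItemId, outputIndex, prevArguments}>`
- First chunk for a call: `output_item.added` + initial delta
- Later chunks: delta computed as `args.slice(prevArgs.length)`
- `arguments.done` + `output_item.done` when stream ends or `tool_end` arrives

How to Test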
Test with:

`POST /api/v1/responses` with a model that returns tool calls (e.g., code sandbox). Verify:

- `output_index` stays stable per tool call
- `delta` fields contain only incremental content
- A single `output_item.added`/`output_item.done` pair per tool call
- `response.completed` output matches the streamed tool call

Generated with Claude Code
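The first three checks in the "Verify" list could be automated roughly as follows. This is only a sketch under assumptions: `ParsedEvent` is a hypothetical, simplified shape for events already parsed from the stream, `checkToolCallStream` is an illustrative helper that is not part of the repository, and the event type strings are the abbreviated names used in this description rather than the full wire names. It does not cover the `response.completed` comparison.

```typescript
// Hypothetical, simplified shape of a parsed stream event.
interface ParsedEvent {
  type: string;
  itemId: string;
  outputIndex: number;
  delta?: string;
  args?: string;
}

interface ItemStats {
  outputIndex: number;
  added: number;
  done: number;
  joined: string;
}

// Returns a list of human-readable problems; an empty list means all checks passed.
function checkToolCallStream(events: ParsedEvent[], expectedToolCalls: number): string[] {
  const problems: string[] = [];
  const items = new Map<string, ItemStats>();

  for (const ev of events) {
    let stats = items.get(ev.itemId);
    if (!stats) {
      stats = { outputIndex: ev.outputIndex, added: 0, done: 0, joined: '' };
      items.set(ev.itemId, stats);
    }
    // output_index must stay stable for the lifetime of an item.
    if (ev.outputIndex !== stats.outputIndex) {
      problems.push(`${ev.itemId}: output_index changed from ${stats.outputIndex} to ${ev.outputIndex}`);
    }
    if (ev.type === 'output_item.added') stats.added++;
    if (ev.type === 'output_item.done') stats.done++;
    if (ev.type === 'arguments.delta') stats.joined += ev.delta ?? '';
    // Incremental deltas must concatenate to exactly the final arguments.
    if (ev.type === 'arguments.done' && ev.args !== stats.joined) {
      problems.push(`${ev.itemId}: deltas do not concatenate to the final arguments`);
    }
  }

  items.forEach((stats, id) => {
    if (stats.added !== 1 || stats.done !== 1) {
      problems.push(`${id}: expected one added/done pair, saw ${stats.added}/${stats.done}`);
    }
  });
  // One item per tool call; the old behavior produced one item per token.
  if (items.size !== expectedToolCalls) {
    problems.push(`expected ${expectedToolCalls} tool call item(s), saw ${items.size}`);
  }
  return problems;
}
```

The `expectedToolCalls` count is there because a stream with one item per token can still look self-consistent item by item; comparing the number of items against the number of tool calls the model actually made is what exposes the per-token lifecycle described above.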