fix(cli): submit fast tool results after stream end#5071
Conversation
| }); | ||
|
|
||
| await act(async () => { | ||
| await staleOnComplete?.([fastFailedTool]); |
There was a problem hiding this comment.
[Critical] TS2349: staleOnComplete is typed as ((completedTools: TrackedToolCall[]) => Promise<void>) | null, but fastFailedTool is TrackedCompletedToolCall. These types are incompatible — tsc --noEmit fails with "This expression is not callable." Tests pass at runtime because vitest doesn't typecheck.
| await staleOnComplete?.([fastFailedTool]); | |
| await (staleOnComplete as ((tools: TrackedCompletedToolCall[]) => Promise<void>) | null)?.([fastFailedTool]); |
Or update capturedOnComplete's type annotation to accept TrackedCompletedToolCall[].
— qwen3.7-max via Qwen Code /review
| @@ -2409,7 +2418,6 @@ export const useGeminiStream = ( | |||
| submitQuery(responsesToSend, SendMessageType.ToolResult, prompt_ids[0]); | |||
| }, | |||
| [ | |||
There was a problem hiding this comment.
[Critical] Missing cancellation guard: removing isResponding from the dependency array eliminates the stale-closure that accidentally protected against post-cancellation tool submissions. Now activeModelStreamsRef.current always reads 0 after stream abort, so handleCompletedTools proceeds to submitQuery at line 2418 — sending tool results (potentially containing file contents or shell output) to a new API call the user explicitly tried to cancel via Ctrl+C.
Scenario: Ctrl+C → stream aborts → counter→0 in finally → tool completes → activeModelStreamsRef.current > 0 guard passes → submitQuery fires unwanted API call. The submitQuery function resets turnCancelledRef.current = false, so the cancel state is lost.
Before this PR, the stale isResponding === true closure returned early. Now the ref is always current, making this path deterministic.
Fix: Add a cancellation check before submitQuery at line 2418:
if (turnCancelledRef.current || abortControllerRef.current?.signal.aborted) {
markToolsAsSubmitted(callIdsToMarkAsSubmitted);
return;
}— qwen3.7-max via Qwen Code /review
✅ Local verification report — PR #5071Verified locally by a maintainer on macOS — the OS this PR's matrix marked Environment
Results at a glance
1. Code reviewThe fix swaps the early-return guard in
2. Full suite on macOSMatches the author's Windows run and fills the macOS gap. 3. Test-effectiveness counter-proof — the decisive evidenceA regression test for a race is only worth as much as its ability to fail on the bug. I reverted only The single failure is the new test 4. Static checks
5. Real CLI end-to-end (tmux + mock OpenAI server, YOLO)This race lives in the React TUI path ( Observed in the TUI: Mock server log shows the two model requests, the second carrying the tool result back 145 ms after the first: The existence of turn 2 (
VerdictLGTM — recommend merge. Every claim in the PR test plan reproduces on macOS, the new regression test is proven to actually catch the race (red on the parent commit, green on the PR), and the real CLI handles the YOLO tool-error → continue path end-to-end with no regression. 🇨🇳 中文版(点击展开)✅ 本地验证报告 — PR #5071由维护者在 macOS 上本地验证 —— 正是本 PR 矩阵里标记 环境
结果速览
1. 代码审查修复把
2. macOS 全套测试与作者的 Windows 结果一致,补上了 macOS 的空缺。 3. 测试有效性反证 —— 决定性证据竞态的回归测试,其价值取决于它"在 bug 上能否变红"。我只把 唯一失败的就是新测试 4. 静态检查
5. 真实 CLI 端到端(tmux + mock OpenAI server,YOLO)这个竞态存在于 React TUI 路径( TUI 中观察到: mock server 日志显示两次模型请求,第二次在第一次之后 145 ms 把工具结果带了回来: turn 2(
结论LGTM —— 建议合并。 PR 测试计划中的每一项都在 macOS 复现;新回归测试被证明确实能抓住该竞态(在父提交上红、在 PR 上绿);真实 CLI 在 YOLO 工具出错 → 继续的路径上端到端工作,无回归。 |
8ed1615 to
1ae9e83
Compare
|
Updated the branch at 1ae9e83. This addresses both review blockers: the stale scheduler callback now checks the live cancellation state before submitting tool results, and the regression test now covers the post-cancel stale-callback path. I also fixed the test callback type issue called out in review. Verified locally: npm run test --workspace=packages/cli -- src/ui/hooks/useGeminiStream.test.tsx; npm run typecheck --workspace=packages/cli; npx prettier --check packages/cli/src/ui/hooks/useGeminiStream.ts packages/cli/src/ui/hooks/useGeminiStream.test.tsx; npx eslint packages/cli/src/ui/hooks/useGeminiStream.ts packages/cli/src/ui/hooks/useGeminiStream.test.tsx --max-warnings 0. |
🔁 Re-verification — head
|
|
@qwen-code /triage |
|
Thanks for the PR! Template looks good ✓ — all required sections present, bilingual body, test plan with regression command, linked issue. On direction: aligned. #4672 is a genuine user-facing stall in AUTO_EDIT / YOLO loops — an agent that silently stops after a read/edit error until the next prompt is exactly the kind of reliability bug that erodes trust. Fixing the tool-result handoff race sits squarely in qwen-code's core mission (reliable agent loop). The project has a track record of similar stale-closure fixes: PR #4470 (text buffer submit handler), PR #2374 (prompt cancellation and streaming), and the original "Fix race condition in submitQuery preventing tool response continuations" (#458). CHANGELOG doesn't reference #4672 yet, but the area is clearly relevant. On approach: scope feels right. Two files, one hook change plus regression coverage. The counter-based One observation for Stage 2: this race is timing-dependent at the sub-millisecond level, so a before/after terminal reproduction of the original #4672 stall isn't reliably triggerable by typing. The regression is best proven at the hook level — the new test captures the stale closure, ends the held stream, and fires a fast-failed Moving on to code review and testing. 🔍 中文说明感谢贡献! 模板完整 ✓ —— 所有必填章节齐全,双语正文,带回归命令的测试计划,关联 issue。 方向:对齐。#4672 是 AUTO_EDIT / YOLO 模式下一个真实的用户体验问题 —— read/edit 报错后 agent 悄悄停住、必须等下一条 prompt 才恢复,这类可靠性 bug 会直接损害用户对 agent loop 的信任。修复工具结果回传的竞态完全在 qwen-code 的核心职责范围内。项目里也有同类 stale-closure 修复的先例:PR #4470(文本 buffer 提交)、PR #2374(prompt 取消与流式)、以及最早的 #458。CHANGELOG 暂未引用 #4672,但该领域明显相关。 方案:范围合理。两个文件,一处 hook 修改加回归覆盖。用计数型 Stage 2 观察:这个竞态是亚毫秒级时序问题,没法靠在终端里打字稳定复现 #4672 的卡住现象。回归最好放在 hook 层证明 —— 新测试捕获旧闭包、结束 held stream、再通过旧回调触发一个快速失败的 进入代码审查和测试 🔍 — Qwen Code · qwen3.7-max |
2a. Code reviewIndependent proposal (before reading the diff): the root cause is Diff comparison: the PR does exactly this, plus it got both of the subtleties I'd expect experienced reviewers to push on:
No correctness bugs, no security holes, no regressions, no AGENTS.md violations. The implementation matches my independent proposal and is as tight as the problem allows. 2b. TestingHook-level test suite (tmux, this triage run)Before (main @ dc6edcd, PR not applied): After (main @ dc6edcd + PR 5071 commit 1ae9e83 cherry-picked): The delta is the two new regression tests added by the PR: Why no before/after TUI reproductionThe #4672 race requires the stream to end, the tool to fail within the same React microtask window, and the stale Independent maintainer verification (already on the PR)@wenshao independently verified the PR on both macOS (comment 4698512525) and Linux (comment 4699396894, head
SummaryCode review clean. Tests green at the hook level and at the CLI level (maintainer-verified on macOS + Linux). The cancellation guard closes the only remaining hole the first iteration left open. I don't see anything left to ask of the contributor. 中文说明2a. 代码审查独立方案(读 diff 之前):根因是 Diff 对比:PR 完全按这个思路做,并且把我预期会被资深 reviewer 挑出的两个细节都做对了:
没有正确性 bug,没有安全漏洞,没有回归,没有违反 AGENTS.md。实现与我的独立方案一致,并且就这个问题而言已经是最精简的。 2b. 测试Hook 层测试套件(tmux,本次 triage 运行)Before(main @ dc6edcd,未合 PR): After(main @ dc6edcd + cherry-pick PR 5071 commit 1ae9e83): 差值就是 PR 新增的两个回归测试: 为什么没有 before/after TUI 复现#4672 竞态要求流结束、工具在同一 React microtask 窗口内失败、旧 独立 maintainer 验证(已在 PR 上)@wenshao 在 macOS(comment 4698512525)和 Linux(comment 4699396894,head
小结代码审查无问题。Hook 层和 CLI 层测试全绿(maintainer 在 macOS + Linux 上独立验证)。取消 guard 堵住了第一版留下的唯一漏洞。我想不到还需要 contributor 再改什么。 — Qwen Code · qwen3.7-max |
|
Stepping back: this PR does exactly what I'd have written if asked to fix #4672 from scratch. The problem is a real, user-visible stall in the agent loop — the kind of thing that, left unfixed, teaches users to distrust AUTO_EDIT / YOLO and fall back to clicking Approve on every tool call. The fix is 12 lines of hook code plus a counter-based lifetime ref, which is about as small as a correct fix for a React closure race can be. The cancellation guard isn't scope creep — it closes a real second hole that the first iteration of the review caught and the contributor addressed cleanly. My independent proposal (counter ref + cancellation guard) matches the PR's approach. I don't see a simpler path that also handles Testing: hook suite 103/103 → 105/105 across the cherry-pick, both new tests green on first local run. Independent maintainer verification (@wenshao) already covered the parts I can't easily do in this environment — macOS verification, test-effectiveness counter-proof (revert source, keep test, only new test goes red), full CLI end-to-end in tmux showing the loop continuing past the tool error, and re-verification of both If I had to maintain this in six months: the counter semantics are clear, the cancellation guard has a comment explaining why it Am I approving because I ran out of reasons to say no, or because it's genuinely ready? Genuinely ready. The bug is real, the fix is minimal and correct, the tests prove it at the right level, and two independent maintainers have now exercised it on real hardware across both major dev platforms. Nothing left to ask of the contributor. 中文说明退一步看:这个 PR 跟让我从零修 #4672 会写的方案一模一样。问题是真实、用户可见的 agent loop 卡住 —— 这种 bug 如果不修,会教会用户不信任 AUTO_EDIT / YOLO、退回到每次工具调用都点 Approve。修法就是 hook 里 12 行代码加一个计数型生命周期 ref,对一个 React 闭包竞态的正确修复来说基本就是最小规模。取消 guard 不是范围蔓延 —— 它堵住了第一轮 review 发现的第二个真实漏洞,contributor 也干净地解决了。 我的独立方案(计数 ref + 取消 guard)与 PR 一致。我找不到既更简单又能处理 测试:hook 套件 103/103 → cherry-pick 后 105/105,两个新测试首次本地跑就绿。独立 maintainer 验证(@wenshao)已经覆盖了我这个环境里不容易做的部分 —— macOS 验证、测试有效性反证(回滚源码保留测试、只有新测试变红)、tmux 里完整 CLI 端到端显示循环在工具错误后继续、在 Linux 上对 rebased head 六个月后让我维护:计数语义清晰,取消 guard 有注释解释为什么 我是因为找不到反对理由才批,还是真的准备好了?真的准备好了。bug 真实,修法最小且正确,测试在正确层级证明它,两位独立 maintainer 已经在两大主流开发平台的真机上验证过。没什么需要 contributor 再改的了。 — Qwen Code · qwen3.7-max |
qwen-code-ci-bot
left a comment
There was a problem hiding this comment.
LGTM, looks ready to ship. ✅
What this PR does
Fixes the tool-result handoff race where a very fast tool completion could be dropped after the model stream had already ended but before React replaced
handleCompletedToolswith a fresh callback. The hook now tracks the actualsendMessageStreamlifetime in a synchronous ref and uses that for the active-stream guard, while keeping the existingisRespondingstate for UI rendering.Why it's needed
#4672 reports AUTO_EDIT / YOLO turns stalling after a read/edit error until the next user instruction. The maintainer root-cause analysis points to
handleCompletedToolsreading staleisRespondingfrom a React callback closure: the stream finishes, a tool fails immediately, and the old callback still seesisResponding === true, returns early, and never sends the tool error back to the model. AUTO_EDIT / YOLO removes the approval delay, so the race is easier to hit there.Reviewer Test Plan
How to verify
Run
npm run test --workspace=packages/cli -- src/ui/hooks/useGeminiStream.test.tsx. The new regression test issubmits a fast tool result after the stream ended but before React replaces the callback; it saves the callback captured while React state still says responding, ends the held stream, then fires a fast failedread_fileresult through that stale callback and verifies a ToolResult request is sent.Evidence (Before & After)
Before: the stale callback path returned at the
isRespondingguard and dropped the tool result, so the secondsendMessageStreamcall never happened. After: the guard reads the actual stream-lifetime ref, sees that the stream has ended, marks the tool submitted, and sends the failed tool result back to the model.Tested on
Environment (optional)
Windows 11, Node v24.15.0, npm 11.12.1.
Validation run locally:
Risk & Scope
/btw, do not clear the guard early.Linked Issues
Fixes #4672
中文说明
What this PR does
修复一个工具结果回传的竞态:模型流已经结束,但 React 还没来得及把
handleCompletedTools换成新闭包时,快速完成或快速失败的工具结果可能被旧闭包丢掉。现在 hook 用同步 ref 记录真实的sendMessageStream生命周期,并用它做 active-stream guard;原来的isRespondingstate 仍然只负责 UI 渲染语义。Why it's needed
#4672 反馈 AUTO_EDIT / YOLO 模式下读文件或编辑出错后,本轮不会继续更新文件,需要下一次用户指令才恢复。maintainer 已定位根因:
handleCompletedTools从 React callback 闭包里读到旧的isResponding。时序是模型流先结束,工具马上失败,旧 callback 仍认为isResponding === true,于是提前 return,模型收不到工具错误,agent loop 卡住。AUTO_EDIT / YOLO 没有 approval 延迟,所以更容易撞到这个窗口。Reviewer Test Plan
How to verify
运行
npm run test --workspace=packages/cli -- src/ui/hooks/useGeminiStream.test.tsx。新增回归测试submits a fast tool result after the stream ended but before React replaces the callback:测试先保存 React state 仍是 responding 时捕获的旧 callback,再结束 held stream,然后让一个快速失败的read_file结果走这个旧 callback,验证它仍会发送 ToolResult 请求。Evidence (Before & After)
修复前:旧 callback 会在
isRespondingguard 直接 return,工具结果被丢弃,第二次sendMessageStream不会发生。修复后:guard 读取真实 stream-lifetime ref,发现模型流已经结束,于是标记工具已提交,并把失败的工具结果回传给模型。Tested on
Environment (optional)
Windows 11,Node v24.15.0,npm 11.12.1。
Risk & Scope
/btw这类重叠 stream 提前清掉 guard。Linked Issues
Fixes #4672