fix(cli): prevent duplicate output on stream retry by liqiongyu · Pull Request #1704 · QwenLM/qwen-code

liqiongyu · 2026-02-03T14:35:05Z

Core emits GeminiEventType.Retry when a streaming attempt is retried due to invalid/partial output. The interactive UI previously ignored this event, so partial output from the failed attempt remained in the Ink history and the retried attempt re-streamed the same text, causing duplicated messages.

Changes:

Add removeItemsById to the history manager.
Track streamed history item ids for the current attempt and remove them on Retry.
Clear pending buffers/thought state and refresh the static UI.
Add regression tests for Retry handling.

Tests:

cd packages/cli && npx vitest run src/ui/hooks/useGeminiStream.test.tsx src/ui/hooks/useHistoryManager.test.ts src/ui/App.test.tsx

tanzhenxin · 2026-02-09T13:45:29Z

I’m not entirely sure why this duplicate stream output happens, but I have run into it a few times. I’ll try to track it down the next time it occurs. In the meantime, it might be worth adding some debug logs first so we can confirm whether this is actually the root cause.

github-actions · 2026-04-20T01:06:58Z

This pull request has had no activity for 60 days and is being marked as stale. It will be closed in another 30 days if no further activity occurs. To keep it open, push a new commit or leave a comment. Maintainers may apply pinned, status/blocked, status/on-hold, or status/ready-for-merge to exempt it from auto-close.

wenshao

No issues found. LGTM! ✅ — gpt-5.5 via Qwen Code /review

wenshao · 2026-04-24T14:43:15Z

Hi @liqiongyu, this PR currently has merge conflicts. Could you please rebase or merge the latest main and resolve them when you get a chance? Thanks!

liqiongyu · 2026-04-24T16:43:12Z

Updated this PR to address the retry cleanup review feedback.

Changes made:

Track citation and MAX_TOKENS info items as part of the discarded retry attempt, so stale citation metadata is removed with the partial response.
Scope retry discard IDs to the current stream attempt instead of shared hook state, so recursive continuations do not delete already-completed assistant output.
Added regression coverage for citation retries, concurrent /btw handling, and attempt-scoped retry cleanup.

Verification:

npm run build && npm run typecheck passed locally on Node 20.19.5.
GitHub CI is passing across Lint, CodeQL, and the macOS/Ubuntu/Windows test matrix on Node 20/22/24.

Latest head: 401e080

wenshao

No issues found. LGTM! ✅ — gpt-5.5 via Qwen Code /review

wenshao

Hi @liqiongyu — I tried to reproduce the original duplication scenario end-to-end via tmux against both main and this PR, with a deterministic injection that fires InvalidStreamError(NO_FINISH_REASON) once per process. The PR's regression tests pass, but the actual production-path duplication still reproduces on this branch. Wanted to share the trace before this lands.

How I reproduced

I patched geminiChat.ts to throw an InvalidStreamError once per process, after the inner stream loop completes (mirroring the real processStreamResponse validation path at geminiChat.ts:1111-1124):

// Inserted right after `for await (const chunk of stream) { ... }` in the
// outer retry loop, gated by attempt === 0 and a process-global flag.
if (attempt === 0 && !globalThis.__qwenRetryFired) {
  globalThis.__qwenRetryFired = true;
  throw new InvalidStreamError(
    'forced post-stream retry for visual reproduction',
    'NO_FINISH_REASON',
  );
}

Then ran npm run dev in a 200×50 tmux pane on each branch with the same prompt asking for a 3-paragraph essay separated by blank lines.

Result

main (no fix): duplication clearly visible — two ✦ markers, two complete essays. Expected.
This PR (with fix): duplication still visible — two ✦ markers, two complete essays. Not expected.

Why the regression tests pass but production doesn't

The 4 new tests in useGeminiStream.test.tsx > Retry Handling use mocked stream generators where the Retry event is yielded before any Finished event:

yield Content('abcdef');
yield Retry;            // ← retry first
yield Content('abcdef');
yield Finished(STOP);

But the real geminiChat.ts retry path looks different. The chunk loop fully drains, each chunk's finishReason causes turn.ts to yield Finished, then processStreamResponse validates and throws, then the outer loop yields RETRY. So the actual production sequence is:

Content → Content → … → Content → Finished(STOP) → Retry → Content → Finished(STOP)

Finished arrives before Retry, not after. None of the regression tests cover this ordering.

Root cause

I added trace logging at the retryDiscardHistoryItemIds.add(id) call sites and inside the Retry handler. Output during attempt 1:

[TRACE] split-flush id=…782 setSize=1            # P1 flushed to <Static>, tracked
[TRACE] split-flush id=…783 setSize=2            # P2 flushed to <Static>, tracked
[TEST PATCH] forcing post-stream NO_FINISH_REASON
[TRACE] Retry fresh-restart: size=0 ids=[]       # ← Set was already cleared

The Set is empty by the time the Retry handler runs, so the removeHistoryItemsById && retryDiscardHistoryItemIds.size > 0 guard skips the cleanup. The clear happens at useGeminiStream.ts:1271-1275 (see inline comment).

If I comment out that one .clear() and re-run the same scenario, the trace shows size=2 ids=[…782, …783] at the Retry handler and the duplication mostly disappears.

A second, related gap

After fixing the clear, one paragraph still leaks — the last paragraph of attempt 1. That comes from the new pending-flush block inside handleFinishedEvent:

if (finishReason !== FinishReason.MAX_TOKENS && pendingHistoryItemRef.current) {
  addItem(pendingHistoryItemRef.current, userMessageTimestamp);   // id discarded
  setPendingHistoryItem(null);
}

The returned id isn't captured, so this final flushed item can never be removed by a subsequent Retry. Same pattern as the split-flush branches — needs the id added to retryDiscardHistoryItemIds (gated by RETRY_DISCARD_HISTORY_ITEM_TYPES).

Suggested fix

Drop the retryDiscardHistoryItemIds.clear() on the non-MAX_TOKENS Finished branch. Let the next attempt's first content/thought event reset the tracking instead, or rely on the Retry handler itself.
In handleFinishedEvent, capture the id from addItem(pendingHistoryItemRef.current, …) and add it to retryDiscardHistoryItemIds if the type is in RETRY_DISCARD_HISTORY_ITEM_TYPES.
Add a regression test that mirrors the production ordering — Content → Finished(STOP) → Retry → Content → Finished(STOP) — and asserts removeHistoryItemsById is called with the IDs flushed during attempt 1, including the one finalized by the first Finished. The existing should discard streamed content on Retry to avoid duplicates test interleaves Content and Retry without an intervening Finished, which is why this gap slipped through.

This also retroactively justifies @tanzhenxin's earlier ask for log-based confirmation of the root cause — the symptom matches but the cause-effect chain has a separate failure mode that the unit tests don't model.

Happy to share the patch + tmux capture files if useful.

— claude-opus-4-7 via Claude Code /qreview

wenshao · 2026-04-25T04:22:36Z

+              event.value,
+              userMessageTimestamp,
+              retryDiscardHistoryItemIds,
+            );


[Critical] This clear() wipes attempt 1's tracked IDs before the subsequent Retry event can read them.

In the production path, Finished(STOP) arrives before Retry (because processStreamResponse validation in geminiChat.ts:1111-1124 runs after chunks drain and throws InvalidStreamError, which the outer retry loop then yields as RETRY). When this Retry arrives at line 1297, retryDiscardHistoryItemIds.size === 0 so the removeHistoryItemsById call is skipped — and attempt 1's flushed paragraphs stay in <Static>, producing the duplicate output the PR is supposed to fix.

Reproduced via tmux + a single-shot InvalidStreamError injection in geminiChat.ts (see the main review comment for full repro and trace). Removing this single line restores the fix's intended behavior for the most common Retry path in production.

Suggested change

);

waitingForMaxTokensRetry =

event.value.reason === FinishReason.MAX_TOKENS;

if (!waitingForMaxTokensRetry) {

geminiMessageBuffer = '';

thoughtBuffer = '';

}

— claude-opus-4-7 via Claude Code /qreview

Dismissing this approval — end-to-end tmux reproduction shows the fix does not cover the production retry path (Finished(STOP) before Retry). Details in the follow-up review: #1704 (review)

Handle GeminiEventType.Retry by discarding previously streamed partial output. Fixes QwenLM#1591

wenshao · 2026-04-26T21:44:48Z


+      if (
+        finishReason !== FinishReason.MAX_TOKENS &&
+        pendingHistoryItemRef.current


[Critical] handleFinishedEvent skips committing/tracking the pending assistant item when finishReason === FinishReason.MAX_TOKENS. The max-token escalation path can emit Finished(MAX_TOKENS) followed by a non-continuation Retry; because that pending item never receives an id in retryDiscardHistoryItemIds, the retry cleanup cannot reliably remove the stale partial output, so truncated output can remain alongside the retry output.

Please either commit and add discardable pending assistant/history items to retryDiscardHistoryItemIds for MAX_TOKENS as well, or explicitly clear/track the pending item in the non-continuation Retry path before the retry starts.

— gpt-5.5 via Qwen Code /review

liqiongyu requested review from DennisYu07, LaZzyMan, Mingholy, gwinthis, pomelo-nwu and tanzhenxin as code owners February 3, 2026 14:35

liqiongyu mentioned this pull request Feb 3, 2026

Message duplication bug #1591

Open

github-actions Bot added the status/stale No activity for extended period label Apr 20, 2026

github-actions Bot mentioned this pull request Apr 20, 2026

📊 AI CLI 工具社区动态日报 2026-04-20 gsscsd/big_model_radar#214

Open

wenshao previously approved these changes Apr 24, 2026

View reviewed changes

liqiongyu dismissed wenshao’s stale review via add2645 April 24, 2026 15:42

liqiongyu force-pushed the fix/1591-message-duplication branch 3 times, most recently from 3eda310 to 401e080 Compare April 24, 2026 16:20

github-actions Bot removed the status/stale No activity for extended period label Apr 25, 2026

wenshao previously approved these changes Apr 25, 2026

View reviewed changes

wenshao reviewed Apr 25, 2026

View reviewed changes

liqiongyu force-pushed the fix/1591-message-duplication branch from 401e080 to 28fe475 Compare April 25, 2026 06:18

fix(cli): prevent duplicate output on stream retry

9fc36b0

Handle GeminiEventType.Retry by discarding previously streamed partial output. Fixes QwenLM#1591

liqiongyu force-pushed the fix/1591-message-duplication branch from 28fe475 to 9fc36b0 Compare April 25, 2026 06:25

wenshao requested changes Apr 26, 2026

View reviewed changes

BingqingLyu mentioned this pull request Apr 27, 2026

fix(cli): prevent duplicate output on stream retry BingqingLyu/qwen-code#40

Open

qwen-code-ci-bot mentioned this pull request Jun 1, 2026

Bug Qwen3.7-max, operating on Ghostty v.1.3.1, build 15212 - Streaming/outputting repetitive garbage #4686

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cli): prevent duplicate output on stream retry#1704

fix(cli): prevent duplicate output on stream retry#1704
liqiongyu wants to merge 1 commit into
QwenLM:mainfrom
liqiongyu:fix/1591-message-duplication

liqiongyu commented Feb 3, 2026

Uh oh!

tanzhenxin commented Feb 9, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026

Uh oh!

wenshao left a comment

Uh oh!

wenshao commented Apr 24, 2026 •

edited

Loading

Uh oh!

liqiongyu commented Apr 24, 2026

Uh oh!

wenshao left a comment

Uh oh!

wenshao left a comment

Uh oh!

wenshao Apr 25, 2026

Uh oh!

wenshao Apr 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

-            );
+            waitingForMaxTokensRetry =
+              event.value.reason === FinishReason.MAX_TOKENS;
+            if (!waitingForMaxTokensRetry) {
+              geminiMessageBuffer = '';
+              thoughtBuffer = '';
+            }

Conversation

liqiongyu commented Feb 3, 2026

Uh oh!

tanzhenxin commented Feb 9, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

Uh oh!

wenshao commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

liqiongyu commented Apr 24, 2026

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

How I reproduced

Result

Why the regression tests pass but production doesn't

Root cause

A second, related gap

Suggested fix

Uh oh!

wenshao Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

wenshao Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

wenshao commented Apr 24, 2026 •

edited

Loading