fix(cli): harden TUI flicker and streaming output stability#3663
chiga0 wants to merge 17 commits into
chiga0
left a comment
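Two follow-up suggestions (they can be added as commits to this PR or split into separate small PRs). The goal is to further tighten the narrow-screen duplicated Markdown rendering experience from #3279.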
1. Widen findLastSafeSplitPoint to also recognize closed tables / lists / code blocks
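Currently, when pending content overflows, slicePendingTextForHeight downgrades Markdown rendering to a flat <Text wrap="wrap">, which loses table borders, syntax highlighting, headers, and so on. This is a common UX regression in the #3279 screenshot scenario of writing a large table or a long mermaid code block on a narrow screen: while streaming, the user sees only "... first N streaming lines hidden ..." plus unformatted raw text for a long time.
The root cause is not the slicer introduced in this PR but findLastSafeSplitPoint in packages/cli/src/ui/utils/markdownUtilities.ts, which treats only \n\n as a safe split: a table, list, or complete code block with no blank line stays entirely in pending. I suggest treating the following three cases as safe split points so that the stable prefix enters <Static> earlier:
- End of a complete code block (right after the matching closing fence)
- End of a table segment (the last line matches tableRowRegex and the next line no longer matches)
- End of a list segment (the first non-list line after consecutive list item lines)
This makes narrow-screen overflow less frequent, and when overflow does occur only the tail is raw text. Content that has already formed a complete block still appears rendered in Static, which improves the #3279 experience more directly.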
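The three extra split points above can be sketched as follows. This is a minimal illustration, not the code in markdownUtilities.ts: the function name findSafeSplitPointSketch, the regexes, and the return convention (offset where the stable prefix ends, 0 when nothing is stable yet) are all assumptions made for the example.

```typescript
// Hypothetical patterns; the real tableRowRegex and list detection may differ.
const FENCE = /^ {0,3}(`{3,}|~{3,})/;
const TABLE_ROW = /^\s*\|.*\|\s*$/;
const LIST_ITEM = /^\s*(?:[-*+]|\d+[.)])\s+\S/;
const LIST_CONTINUATION = /^\s{2,}\S/;

type LineKind = "table" | "list" | "other";

// Returns the offset where the stable prefix ends; 0 means nothing is stable yet.
function findSafeSplitPointSketch(content: string): number {
  let safe = 0;
  let offset = 0;
  let openFence: string | null = null;
  let prev: LineKind = "other";

  for (;;) {
    const newline = content.indexOf("\n", offset);
    if (newline === -1) break; // the trailing partial line may still grow
    const line = content.slice(offset, newline);
    const lineEnd = newline + 1;
    const marker = FENCE.exec(line)?.[1];

    if (openFence !== null) {
      // Inside a code block, only a matching bare fence closes it.
      if (marker !== undefined && marker.startsWith(openFence) && line.trim() === marker) {
        openFence = null;
        safe = lineEnd; // closed code block: safe to split right after it
      }
    } else {
      let kind: LineKind = "other";
      if (TABLE_ROW.test(line)) kind = "table";
      else if (LIST_ITEM.test(line)) kind = "list";
      else if (prev === "list" && LIST_CONTINUATION.test(line)) kind = "list";

      // A table or list segment ends at the first complete line that leaves it.
      if (prev !== "other" && kind !== prev) safe = offset;
      if (line.trim() === "") safe = lineEnd; // the existing blank-line rule
      if (marker !== undefined) openFence = marker;
      prev = marker !== undefined ? "other" : kind;
    }
    offset = lineEnd;
  }
  return safe;
}

// Usage: a closed mermaid block becomes stable even without a blank line.
const fence = "`".repeat(3);
const streamed = `intro\n${fence}mermaid\nflowchart TD\n${fence}\ntail`;
const stablePrefix = streamed.slice(0, findSafeSplitPointSketch(streamed));
```

Only complete lines are inspected, so a half-received table row or list item never ends its segment early. A loose list (items separated by blank lines) would be split at each blank line in this sketch, which the existing \n\n rule already does.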
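2. PENDING_PREVIEW_RESERVED_ROWS = 4 should be derived from the already-computed controlsHeight
In packages/cli/src/ui/components/messages/ConversationMessages.tsx, PENDING_PREVIEW_RESERVED_ROWS = 4 is a static constant intended to leave room for LoadingIndicator + InputPrompt + Footer + spacer. But the footer height changes with states such as showShortcuts / KeyboardShortcuts / embeddedShellFocused / agents.size > 0 (an extra tab bar), and AppContainer already measures the actual value in controlsHeight + tabBarHeight (measureElement(fullFooterRef); see the controlsHeight / availableTerminalHeight computation in AppContainer.tsx).
I suggest passing availableTerminalHeight directly as the slicer's maxHeight and setting reservedRows to 0 (or keeping a small 1-2 row safety margin for marginTop). This avoids two failure cases when the footer height changes: 4 rows not being enough (which still triggers Ink's full-screen clear), or one extra row being clipped for the hidden-line banner. The change is small and can be folded into this PR and verified with the same metrics.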
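A minimal sketch of that derivation, assuming the available height is simply the terminal height minus the measured controls and tab bar. The LayoutMeasurements shape, the pendingMaxHeight name, and the subtraction itself are assumptions for illustration; the actual computation lives in AppContainer.tsx and may differ.

```typescript
// Hypothetical shape; the field names mirror the values mentioned in the review.
interface LayoutMeasurements {
  terminalHeight: number; // rows reported by the terminal
  controlsHeight: number; // measured height of the footer/input/loading area
  tabBarHeight: number; // extra tab bar when agents are present, else 0
}

// Derive the slicer's maxHeight from measurements instead of a fixed "4".
function pendingMaxHeight(m: LayoutMeasurements, safetyRows = 1): number {
  const available = m.terminalHeight - m.controlsHeight - m.tabBarHeight;
  // Never return less than one row so the live area stays visible.
  return Math.max(1, available - safetyRows);
}

// With a taller footer (shortcuts open, tab bar shown) the budget shrinks
// automatically, where a static reserve of 4 rows would have overflowed.
const compactFooter = pendingMaxHeight({ terminalHeight: 24, controlsHeight: 4, tabBarHeight: 0 });
const tallFooter = pendingMaxHeight({ terminalHeight: 24, controlsHeight: 7, tabBarHeight: 1 });
```

The safetyRows default of 1 corresponds to the small margin suggested above for marginTop; passing 0 removes it.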
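Also: I suggest adding Closes #3279 at the top of the PR description or in the commit message so that GitHub closes the issue automatically. The current fixes for path A (resize-during-stream) and path B (pending overflow) already cover the screenshot scenarios in that issue.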
Handled the latest #3279 follow-up comments in commit 369e1ec.
- Expanded
chiga0
left a comment
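Re-reproduction on the codex/tui-streaming-clear-storm branch: the slicer still does not stop the mermaid code block scenario
Attached is the narrow-screen reproduction screenshot that the user followed up with on #3279 (with all commits from this PR cherry-picked):
- At the top is the assistant prefix ✦ Here is a comprehensive Mermaid flowchart example:
- After that, 20+ identical "1 flowchart TD" entries pile up in scrollback (dim line number 1 + highlighted flowchart TD)
- Each entry is line 1 of the same code block, which confirms that the classic log-update clamping problem is still present: on every frame the pending content overflows the viewport, the top row is pushed into scrollback, and the next frame writes again and leaks another row
Why slicePendingTextForHeight did not catch it
The slicer uses the number of visual rows of the source text, soft-wrapped at markdownWidth, as a proxy for rendered height. That is accurate for plain Markdown but systematically underestimates two kinds of MarkdownDisplay blocks: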
| Block type | Source-text visual height | Actual rendered height | Source of the gap |
|---|---|---|---|
| Fenced code block | N rows + 2 fence rows | N rows + border / paddingLeft narrows the effective width → inner lines soft-wrap a second time more easily + colorizeCode adds a line-number prefix (takes ~2 columns) | MarkdownDisplay.tsx RenderCodeBlock: paddingLeft={CODE_BLOCK_PREFIX_PADDING} + colorizeCode line-number prefix |
| Markdown table | M rows | 2M+3 rows (top/bottom/separator borders + each row wrapped in its own <Box>) | TableRenderer.tsx renderBorderLine |
| Empty line | 1 row | 1 row | None (equal) |
| List item | 1 row | 1 row + paddingLeft={indentation+1} narrows the width → wraps a second time more easily | MarkdownDisplay.tsx RenderListItem |
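The mermaid code block in the screenshot hits both the code-block and the line-number width-narrowing effects: the source text may be only 6-8 lines, so the slicer considers it within availableTerminalHeight - 4, but after rendering it is squeezed into 12-20 rows and exceeds the viewport, log-update clamps, and scrollback accumulates.
Two complementary fix directions
Direction A (recommended, most robust): skip rich Markdown rendering while streaming
When isPending === true, always route PrefixedMarkdownMessage / ContinuationMarkdownMessage through PendingTextPreview, regardless of whether hiddenLinesCount is > 0: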
if (isPending) {
// Always plain text during streaming — markdown rendering only happens
// once content is promoted into <Static>. Prevents code-block / table
// rendered height from exceeding the slicer's source-text estimate.
return <PendingTextPreview ... />;
}
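return <MarkdownDisplay ... />;
Benefits:
- The slicer's height measurement matches the actual rendering (both are plain <Text wrap="wrap">), so the "8 source rows / 18 rendered rows" mismatch no longer occurs
- React reconcile cost during streaming drops sharply (the Markdown parser + syntax highlighter no longer rerun on every chunk, which matches doc §4.1, "markdown should be cached per block/token")
- It aligns with Claude Code's streaming rendering semantics (the active region is always a plain buffer; formatted rendering happens only after commit)
Cost:
- While streaming, the user temporarily does not see table borders, code-block syntax highlighting, or headers
- Once the prefix is promoted into <Static> by findLastSafeSplitPoint (or the stream ends), the formatted version is shown in full
The reproduction path is the same streaming-clear-storm fixture described in the doc, and the expected metric is still clearTerminalPairCount === 0.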
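Direction B (fallback / complement): add structure-aware padding estimates to the slicer
If Markdown rendering must be kept while streaming, the slicer needs to anticipate the rendering overhead:
- On detecting a ``` fence: for each fence pair, reserve an extra 2 + (lines inside the fence × ceil(line content width / (markdownWidth - codeBlockPaddingLeft - lineNumberWidth))) rows
- On detecting a run of consecutive | … | table lines: reserve 2N + 3 rows
- On detecting a list item: re-estimate soft wrapping using markdownWidth - leadingWhitespace - prefixLen
This path is brittle and easy to get wrong in practice, and colorizeCode's line-number width depends on lang/setting, so I suggest it only as a fallback if A is not feasible.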
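For reference, the three estimates in direction B could be combined roughly as below. This is a sketch under stated assumptions, not the slicer's code: estimateRenderedRows, EstimateOptions, and the regexes are hypothetical, string length stands in for display width (so wide CJK characters are undercounted), and the code-block term is summed per line rather than multiplied by a single ratio.

```typescript
// Hypothetical options; the real padding and line-number widths depend on
// CODE_BLOCK_PREFIX_PADDING and on colorizeCode's lang/setting.
interface EstimateOptions {
  markdownWidth: number;
  codeBlockPaddingLeft: number;
  lineNumberWidth: number;
}

const FENCE_LINE = /^\s*(?:`{3,}|~{3,})/;
const TABLE_LINE = /^\s*\|.*\|\s*$/;
const LIST_PREFIX = /^\s*(?:[-*+]|\d+[.)])\s+/;

// Rows a line occupies when soft-wrapped at the given width (at least one).
function wrappedRows(text: string, width: number): number {
  return Math.max(1, Math.ceil(text.length / Math.max(1, width)));
}

function estimateRenderedRows(source: string, o: EstimateOptions): number {
  const codeWidth = o.markdownWidth - o.codeBlockPaddingLeft - o.lineNumberWidth;
  let rows = 0;
  let inFence = false;
  let tableRun = 0;

  for (const line of source.split("\n")) {
    const isFence = FENCE_LINE.test(line);
    const isTable = !inFence && !isFence && TABLE_LINE.test(line);
    if (!isTable && tableRun > 0) {
      rows += 2 * tableRun + 3; // a table segment just ended: borders + rows
      tableRun = 0;
    }
    if (isFence) {
      inFence = !inFence;
      rows += 1; // each fence line counts once, so a pair adds 2
    } else if (inFence) {
      rows += wrappedRows(line, codeWidth); // narrower effective width
    } else if (isTable) {
      tableRun += 1;
    } else {
      // List items lose their indentation and marker from the wrap width.
      const prefix = LIST_PREFIX.exec(line)?.[0] ?? "";
      rows += wrappedRows(line.slice(prefix.length), o.markdownWidth - prefix.length);
    }
  }
  if (tableRun > 0) rows += 2 * tableRun + 3;
  return rows;
}
```

On a 40-column terminal with 2 columns each of padding and line numbers, a fenced block holding one 38-character line is estimated at 4 rows here, where a plain source-text count would give 3. The number of hard-coded widths in this sketch is a fair illustration of why the review calls the approach brittle.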
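Direction C (longer term, doc §4.1 "cache markdown per block"): let closed code block / table / list segments go straight into findLastSafeSplitPoint
The split-point widening from my previous comment also stops the case in the current screenshot: once the mermaid code block closes it goes into <Static> immediately and is no longer re-rendered on every frame, so pending holds only the plain tail text after the fence. Only with direction A stacked on top does narrow-screen streaming truly have no duplicated output.
I suggest combining A with the split-point widening from my previous comment as the last mile for Closes #3279. For the reproduction fixture, use the prompt from the screenshot directly (have the model generate a comprehensive Markdown example with mermaid + table + task list + LaTeX) on a 40×24 terminal.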
#3279)
The slicer measured source-text height, but MarkdownDisplay's code blocks (line-number prefix + paddingLeft-narrowed wrap), tables, and list items all render taller than their source. On narrow terminals that gap let pending content exceed the viewport, so Ink's log-update under-cleared and leaked the topmost row into scrollback every frame.
Render the still-streaming tail through PendingTextPreview unconditionally so slicer measurement and rendered height stay aligned; once a stable prefix is promoted to <Static>, the committed message still gets full markdown formatting.
Generated with AI
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Followed up on the remaining #3279 screenshot issue in commit 992de1c.
The prior follow-up made pending Markdown plain, but it still trusted the pre-sliced source-height estimate. The screenshots show the remaining failure mode: on very narrow panes, actual Ink wrapping can still exceed the estimate and leak the top rendered row into scrollback repeatedly.
Change made:
- Wrap the pre-sliced pending assistant/thought tail in
The latest screenshots exposed one more #3279 failure mode: even with a hard
Follow-up after comparing with the local Claude Code source and the latest narrow Markdown screenshots.
What changed:
Latest fixed-branch validation:
{
"streamPayload": "markdown",
"framesCaptured": 93,
"clearTerminalPairCount": 0,
"clearScreenCodeCount": 0,
"finalDoneCount": 1,
"hiddenMarkerCount": 0,
"rawMermaidFenceCount": 0,
"maxFrameHiddenMarkerCount": 0,
"maxFrameRawMermaidFenceCount": 0,
"pass": true
}
Verification run:
cbb46bf to 2a9882b
2a9882b to 6c6bcf5
Summary
Closes #3279.
This PR consolidates the metric-backed TUI flicker, narrow-output, streaming blank-tail, OpenAI-compatible cumulative-delta, shell-output, and tool/detail stability fixes into one focused PR.
It closes the locally reproduced classes below and ties each claim to source changes plus deterministic metrics:
- […] MaxSizedBox, then caps it to a small live viewport;
- […] ```mermaid;
- […] delta.content chunks to suffixes before they reach the Gemini stream pipeline, preventing normal streaming table/Markdown sections from being appended repeatedly;
- […] QWEN_STREAM_DEBUG=1 stream-delta metrics (rawDeltaBytes, emittedDeltaBytes, suppressedBytes, prefix overlap, cumulative/exact-repeat counters) to make provider/model/UI attribution easier;
- […] MarkdownDisplay;
- […] refreshStatic() paths from full-screen clearTerminal to targeted viewport repaint (cursorTo(0, 0) + eraseDown) plus static-history remount;
- […] /clear on its own terminal reset path and avoids the historical second clearTerminal write;
Problem Definition And Metrics
Raw full-screen clear signal:
Evidence levels:
[Table not recovered; the only surviving cells read origin/main.]
Shell live-output reflow uses a different metric because the bug is an extra live-output event after resize-only soft-wrap changes.
[Table not recovered; the only surviving cell reads origin/main.]
Narrow Markdown/Mermaid scrollback evidence checks every captured live frame, not just the final screen:
Blank-tail evidence covers the manual issue where useful content is followed by a long empty streaming tail before the done event. The payload contains exactly six sentinel labels (QWEN_A1 through QWEN_F1). A duplicated screen/scrollback shows an amplification ratio above 1.
Cumulative-delta evidence covers OpenAI-compatible upstreams that send accumulated full text in delta.content instead of incremental suffixes. The payload contains exactly eight table sentinel labels (QWEN_TABLE_01 through QWEN_TABLE_08). A duplicated stream shows an amplification ratio above 1.
Claude Code Cross-Check
Local Claude Code source confirms this is not solved there by a bigger truncate threshold:
- src/screens/REPL.tsx exposes only complete streaming lines to the renderer (visibleStreamingText is truncated to the last newline).
- src/components/Markdown.tsx uses StreamingMarkdown, lexing from a monotonic stable boundary. Unclosed code fences remain Markdown structure, not ordinary raw text rows.
- src/ink/log-update.ts detects diffs that would touch rows already in scrollback and falls back to a reset instead of patching unreachable rows.
- src/ink/renderer.ts and src/ink/ink.tsx keep alt-screen frames fixed-height, clamp the cursor inside the viewport, and anchor the cursor before diffs.
Copying Claude's modified Ink fork wholesale would import a large renderer/terminal lifecycle surface. This PR ports the relevant low-risk principles into Qwen's current architecture: bounded live viewport, no synthetic marker/fence rows in live pending scrollback, blank-tail suppression in the live viewport, committed Markdown fidelity through Static, cumulative-delta normalization before UI rendering, and frame-level evidence.
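The cumulative-delta normalization named in that list can be illustrated with a small sketch. It is not the converter's implementation: DeltaNormalizerSketch, the minOverlap guard, and the exact counting rules are assumptions, and only the metric field names are taken from this PR.

```typescript
// Field names follow the PR's stream-delta metrics; the semantics are assumed.
interface DeltaMetrics {
  rawDeltaBytes: number;
  emittedDeltaBytes: number;
  suppressedBytes: number;
  cumulativeDeltaCount: number;
  exactRepeatCount: number;
}

// UTF-8 byte length without relying on Buffer or TextEncoder.
function utf8Bytes(s: string): number {
  let n = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) n += 1;
    else if (c < 0x800) n += 2;
    else if (c >= 0xd800 && c <= 0xdbff) { n += 4; i++; } // surrogate pair
    else n += 3;
  }
  return n;
}

class DeltaNormalizerSketch {
  private emitted = "";
  readonly metrics: DeltaMetrics = {
    rawDeltaBytes: 0, emittedDeltaBytes: 0, suppressedBytes: 0,
    cumulativeDeltaCount: 0, exactRepeatCount: 0,
  };

  // minOverlap is a hypothetical guard against short coincidental matches.
  constructor(private readonly minOverlap = 8) {}

  // Takes one provider chunk and returns only the text not yet emitted.
  push(delta: string): string {
    let out = delta;
    if (this.emitted.length >= this.minOverlap) {
      if (delta === this.emitted) {
        this.metrics.exactRepeatCount += 1;
        out = "";
      } else if (delta.startsWith(this.emitted)) {
        // The chunk repeats everything sent so far: keep only the new suffix.
        this.metrics.cumulativeDeltaCount += 1;
        out = delta.slice(this.emitted.length);
      }
    }
    const raw = utf8Bytes(delta);
    const kept = utf8Bytes(out);
    this.metrics.rawDeltaBytes += raw;
    this.metrics.emittedDeltaBytes += kept;
    this.metrics.suppressedBytes += raw - kept;
    this.emitted += out;
    return out;
  }
}
```

Feeding "| QWEN_TABLE_01 |" and then the accumulated "| QWEN_TABLE_01 |\n| QWEN_TABLE_02 |" yields the first row once and then only the second row, so the UI never sees the first row twice. A plain prefix check like this can misfire when a model legitimately repeats its own opening text, which is why the sketch carries a minimum-overlap guard.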
Diagnostics
Set QWEN_STREAM_DEBUG=1 to write stream_delta_metrics records to the normal debug log. The useful fields are:
- rawDeltaBytes: bytes received from the provider for this chunk;
- emittedDeltaBytes: bytes forwarded after normalization;
- suppressedBytes: repeated cumulative-prefix bytes removed;
- prefixOverlapBytes: bytes shared with previously emitted text;
- cumulativeDeltaCount and exactRepeatCount: stream-level counters.
How to read them:
- […] suppressedBytes with cumulativeDeltaCount > 0 means provider cumulative delta behavior;
- […] suppressedBytes points more toward model-generated repetition;
UI Evidence
Please upload comparison GIFs and replace these placeholders with GitHub attachment URLs:
- streaming-clear-storm-before-after.gif
- narrow-streaming-resize-before-after.gif
- resize-clear-regression-before-after.gif
- shell-reflow-regression.gif
- narrow-markdown-blank-tail.gif
- narrow-markdown-table-cumulative-delta.gif
What the current metric evidence proves:
- […] delta.content chunks no longer duplicate normal Markdown/table output during streaming.
What it does not claim:
Verification
- cd packages/core && npx vitest run src/core/openaiContentGenerator/converter.test.ts
- cd packages/cli && npx vitest run src/ui/utils/markdownUtilities.test.ts src/ui/components/messages/ConversationMessages.test.tsx src/ui/components/messages/ToolMessage.test.tsx src/ui/AppContainer.test.tsx
- npm run build && npm run bundle
- cd integration-tests/terminal-capture && npm run capture:narrow-markdown-regression
- cd integration-tests/terminal-capture && npm run capture:narrow-markdown-stall-regression
- cd integration-tests/terminal-capture && npm run capture:narrow-markdown-blank-tail-regression
- cd integration-tests/terminal-capture && npm run capture:narrow-markdown-table-regression
- cd integration-tests/terminal-capture && npm run capture:narrow-markdown-table-cumulative-regression
- git diff --check
Latest fixed cumulative-delta E2E summary:
{
  "streamPayload": "markdown-table-cumulative",
  "framesCaptured": 103,
  "clearTerminalPairCount": 0,
  "clearScreenCodeCount": 0,
  "hiddenMarkerCount": 0,
  "rawMermaidFenceCount": 0,
  "maxFrameHiddenMarkerCount": 0,
  "maxFrameRawMermaidFenceCount": 0,
  "tableSentinelExpectedCount": 8,
  "tableSentinelOccurrenceCount": 8,
  "maxFrameTableSentinelOccurrenceCount": 8,
  "tableSentinelAmplificationRatio": 1,
  "maxFrameTableSentinelAmplificationRatio": 1,
  "pass": true
}
Latest fixed blank-tail E2E summary:
{
  "streamPayload": "markdown-blank-tail",
  "framesCaptured": 103,
  "clearTerminalPairCount": 0,
  "clearScreenCodeCount": 0,
  "hiddenMarkerCount": 0,
  "rawMermaidFenceCount": 0,
  "maxFrameHiddenMarkerCount": 0,
  "maxFrameRawMermaidFenceCount": 0,
  "stallSentinelExpectedCount": 6,
  "stallSentinelOccurrenceCount": 6,
  "maxFrameStallSentinelOccurrenceCount": 6,
  "stallSentinelAmplificationRatio": 1,
  "maxFrameStallSentinelAmplificationRatio": 1,
  "minPostDoneViewportStallSentinelOccurrenceCount": 6,
  "pass": true
}
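The amplification ratios in the summaries above are consistent with dividing the number of sentinel occurrences found in a captured screen by the number of sentinels the payload contains. The sketch below shows that reading; the function names and the exact definition are assumptions, since the capture harness's own computation is not shown here.

```typescript
// Count non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let from = 0;
  for (;;) {
    const at = haystack.indexOf(needle, from);
    if (at === -1) return count;
    count += 1;
    from = at + needle.length;
  }
}

// Assumed definition: total sentinel occurrences / expected sentinel count.
// 1 means every label appears once; above 1 means duplicated output.
function sentinelAmplificationRatio(screen: string, sentinels: string[]): number {
  if (sentinels.length === 0) return 0;
  let total = 0;
  for (const s of sentinels) total += countOccurrences(screen, s);
  return total / sentinels.length;
}

// The eight table sentinels named in the PR description.
const tableSentinels: string[] = [];
for (let i = 1; i <= 8; i++) tableSentinels.push(`QWEN_TABLE_0${i}`);

const cleanScreen = tableSentinels.join("\n");
const duplicatedScreen = cleanScreen + "\n" + cleanScreen;
const cleanRatio = sentinelAmplificationRatio(cleanScreen, tableSentinels);
const duplicatedRatio = sentinelAmplificationRatio(duplicatedScreen, tableSentinels);
```

Under this reading, a frame that renders the whole table twice would report a ratio of 2, which is the signal the maxFrame fields are meant to catch on any single captured frame.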