fix(core): bound foreground shell output capture#4524
Conversation
pomelo-nwu
left a comment
There was a problem hiding this comment.
Hi @Jerry2003826, thank you for your continued contributions — 9 PRs in a short time is impressive! 🎉
As we review your changes, we'd like to ask you to update each PR to follow the latest PR template on the main branch. The most important section is the Reviewer Test Plan, which significantly accelerates the review and merge process.
Specifically, for each PR please include:
- How to verify — clear reproduction steps so a reviewer can confirm the fix/feature
- Evidence (Before & After) — use the
tmux-real-user-testingskill (or manual tmux capture) to show before/after screenshots of the TUI behavior. Side-by-side evidence makes it much faster for maintainers to validate and merge - Tested on — fill in the OS table (macOS / Windows / Linux)
PRs with a complete Reviewer Test Plan are prioritized for review — without it, review may be delayed.
You can see the full template at: .github/pull_request_template.md
Thanks again for your effort — looking forward to getting these merged! 🚀
中文说明
你好 @Jerry2003826,感谢你的持续贡献——短时间内提交了 9 个 PR,非常高效!🎉
在 review 过程中,我们希望你能按照 main 分支上最新的 PR 模版更新每个 PR 的描述。其中最关键的部分是 Reviewer Test Plan,它能显著加速审核和合并流程。
具体来说,请为每个 PR 补充:
- How to verify — 清晰的复现步骤,让 reviewer 能确认修复/功能的效果
- Evidence (Before & After) — 使用
tmux-real-user-testingskill(或手动 tmux 截取)展示修改前后的 TUI 截图对比,前后对比能让维护者更快地验证和合并 - Tested on — 填写操作系统测试表格(macOS / Windows / Linux)
有完整 Reviewer Test Plan 的 PR 会被优先审核——缺少该部分可能会导致审核延迟。
完整模版见:.github/pull_request_template.md
再次感谢你的付出,期待尽快把这些 PR 合并!🚀
— Qwen Code
wenshao
left a comment
There was a problem hiding this comment.
[Suggestion] rawOutput JSDoc (line 129) says "The raw, unprocessed output buffer" but it's now bounded to maxBufferedOutputBytes. Consider updating the JSDoc to note that the buffer may be incomplete when the limit is exceeded, and optionally adding a rawOutputTruncated?: boolean field to ShellExecutionResult so consumers can detect truncation programmatically.
— qwen3.7-max via Qwen Code /review
| { ...shellExecutionConfig, maxBufferedOutputBytes: 10 }, | ||
| ); | ||
|
|
||
| expect(result.rawOutput.length).toBeLessThanOrEqual(10); |
There was a problem hiding this comment.
[Suggestion] toBeLessThanOrEqual(10) would pass even if 0 bytes were captured, masking a total capture failure. With 15 bytes sent and limit=10, exactly 10 bytes should be captured.
| expect(result.rawOutput.length).toBeLessThanOrEqual(10); | |
| expect(result.rawOutput.length).toBe(10); |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Updated in 9910ff4. The PTY bounded-output test now asserts rawOutput.length === 10, so a total capture failure cannot pass.
|
|
||
| const result = await handle.result; | ||
|
|
||
| expect(result.rawOutput.length).toBeLessThanOrEqual(10); |
There was a problem hiding this comment.
[Suggestion] Same weak assertion as the PTY test above — toBeLessThanOrEqual(10) should be toBe(10).
| expect(result.rawOutput.length).toBeLessThanOrEqual(10); | |
| expect(result.rawOutput.length).toBe(10); |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Updated in 9910ff4. The child_process bounded-output test now asserts rawOutput.length === 10 exactly.
| let outputCaptureLimitExceeded = false; | ||
|
|
||
| const captureOutputData = (data: Buffer): Buffer | null => { | ||
| totalOutputBytes += data.length; |
There was a problem hiding this comment.
[Suggestion] totalOutputBytes is incremented inside captureOutputData, but this closure is only called when !streamStdout. In streaming mode totalOutputBytes stays at 0 forever. The PTY path correctly tracks totalBytesReceived unconditionally in handleOutput (line 1422). Recommend hoisting totalOutputBytes += data.length out of captureOutputData and into handleOutput before the streamStdout branch, matching the PTY pattern. Also consider renaming to totalBytesReceived for consistency.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 9910ff4. child_process now increments total byte accounting before the streamStdout branch and captures a bounded raw buffer for streaming output as well. Added a regression test for streamStdout: true with maxBufferedOutputBytes: 1 that asserts the data event still streams while rawOutput is bounded and result.output includes the capture-limit notice.
| totalOutputBytes += data.length; | ||
|
|
||
| if (capturedOutputBytes >= maxBufferedOutputBytes) { | ||
| outputCaptureLimitExceeded = true; |
There was a problem hiding this comment.
[Suggestion] No debugLogger.warn() is emitted when the output capture limit is exceeded. The only signal is a bracketed notice string appended to result.output. At 3AM, truncation is invisible to log-based debugging — grepping logs for "truncat", "exceeded", or "discarded" returns nothing.
Consider adding a one-time log when outputCaptureLimitExceeded first transitions to true:
if (captured.length < data.length) {
outputCaptureLimitExceeded = true;
debugLogger.warn(
`Shell output capture exceeded maxBufferedOutputBytes ` +
`(${maxBufferedOutputBytes} bytes). Total bytes: ${totalOutputBytes}. ` +
`Discarding excess.`,
);
}— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Added in 9910ff4. Both PTY and child_process paths now emit a one-time debugLogger.warn() when output capture first exceeds maxBufferedOutputBytes, including the configured limit and total bytes observed.
| @@ -337,6 +338,25 @@ describe('ShellExecutionService', () => { | |||
| expect(result.output.trim()).toBe('你好'); | |||
| }); | |||
|
|
|||
There was a problem hiding this comment.
[Suggestion] The two new tests cover the "two chunks, second partially captured" happy path, but three important scenarios are untested:
-
Exact-boundary case: send exactly
maxBufferedOutputBytes(10) bytes, assert no truncation notice appears. Guards against off-by-one regressions in>vs>=comparisons. -
getMaxBufferedOutputBytesvalidation: 6 input classes (0,-1,NaN,Infinity,'abc',undefined) should all return the 64MB default. Zero coverage on the validation fallback path. -
appendOutputCaptureLimitNoticewith empty output: theoutput ? \${output}\n\n${notice}` : notice` branch is untested.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Added in 9910ff4. The test file now covers exact-boundary behavior, invalid maxBufferedOutputBytes fallback values (0, -1, NaN, Infinity, abc, undefined), and the empty stripped-output branch where the capture-limit notice is returned by itself.
25d47d6 to
9910ff4
Compare
|
Updated in 9910ff4. This addresses the remaining review items: exact Validation run locally on Windows: npm run test --workspace=@qwen-code/qwen-code-core -- src/services/shellExecutionService.test.ts -t "capture-limit|bounds buffered|default capture|exact"
npm run test --workspace=@qwen-code/qwen-code-core -- src/services/shellExecutionService.test.ts
npx prettier --check packages/core/src/services/shellExecutionService.ts packages/core/src/services/shellExecutionService.test.ts
npx eslint packages/core/src/services/shellExecutionService.ts packages/core/src/services/shellExecutionService.test.ts
npm run typecheck --workspace=@qwen-code/qwen-code-core
git diff --check |
|
The JSDoc part is already updated in 9910ff4: I kept |
|
Cleaned up the PR description to match the latest template, removed the stale details block, and updated the Tested on table now that GitHub Actions is green across macOS, Windows, and Linux. Code is unchanged. |
wenshao
left a comment
There was a problem hiding this comment.
All 5 Suggestions from the prior review (at 29917ead) are addressed in this revision. The bounded capture implementation is sound across all exit paths and execution modes. 274 tests pass. LGTM! ✅ — qwen3.7-max via Qwen Code /review
本地验证报告(maintainer review)把 PR head 直接 结论✅ 建议 MERGE。PR 自带的全部单测在 macOS 本机复现通过;用真实 child_process / PTY 流过 200MB 输出做了 memory-headroom 实测 —— 在 cap=1MB 配置下, 验证矩阵
真机 E2E(直接驱动
|
| # | 场景 | 期望 | 实测 |
|---|---|---|---|
| 1 | PTY + dd ...| tr '\0' 'a' 5MB(无 \n,避开 PTY \n→\r\n 扩展),cap=1MB |
rawOutput.length=1MB + capture-limit notice |
✅ rawOutput 精确 1048576 bytes;notice = Output exceeded the maximum captured size of 1.0 MB; captured the first 1.0 MB of X MB ... |
| 2 | PTY 1MB 无换行流,cap=1MB | 不附 notice(精确边界) | ✅ rawOutput=1MB,无 notice |
| 3 | child_process yes a | head -c 5MB,cap=1MB |
rawOutput 限制 + notice + 总 size 准确 | ✅ rawOutput=1MB,notice 报告 "of 5.0 MB" |
| 4 | child_process maxBufferedOutputBytes=-1(非法值),1MB 输出 |
回退默认 64MB → 不附 notice | ✅ rawOutput=1MB,无 notice |
| 5 | child_process streaming streamStdout: true,cap=1MB,5MB 输出 |
live consumer 拿到全量、rawOutput 仍受限、final result 含 notice |
✅ live 收到 ≥ 5MB-4KB;rawOutput=1MB;最终 output 含 notice |
| 6 | child_process 默认配置(不传 maxBufferedOutputBytes),1MB 输出 |
远低于 64MB 默认 → 不附 notice | ✅ |
memory-headroom.mjs(200MB 输入 → 1MB cap,4 / 4 pass)
| 指标 | 值 |
|---|---|
| 输入命令 | dd if=/dev/zero bs=1M count=200 | tr '\0' 'a'(200 MB 真实流) |
maxBufferedOutputBytes |
1 MB |
rawOutput.length |
1048576 bytes (1.00 MB) ← 严格止住 |
| capture-limit notice | Output exceeded the maximum captured size of 1.0 MB; captured the first 1.0 MB of 200.0 MB and discarded the rest... |
| heap before / peak | 57 MB / 57 MB(Δ=0 MB) |
| RSS before / peak | 158 MB / 197 MB(Δ ≈ 40 MB,多数是 dd/tr 子进程 + 临时 buffer) |
| elapsed | 9.1 s |
| exitCode | 0 |
要点:把 cap 设成 1MB 后,200MB 输入 流过来,进程依然完整 drain(exitCode=0),但 V8 堆几乎纹丝不动。这是这个 PR 价值的硬证据 —— issue #4364 的"V8 max-string blowup"路径已被切断。
代码评审要点
maxBufferedOutputBytes配置贯通完整:Config.constructor+getShellExecutionConfig两处都加了对称传递,CLI 端调setShellExecutionConfig时也能下发;179 / 179 config 测试无回归。getMaxBufferedOutputBytes()的非法值兜底很严:typeof === 'number' && Number.isFinite() && > 0同时通过才采纳,否则一律回 64MB 默认。it.each([0, -1, NaN, Infinity, 'abc', undefined])单测 + 真机-1实测都验证回退正确,避免 用户写错配置时 cap 变成 0 或 -1 导致 captureOutputData 死锁。- sniff buffer 与最终 buffer 解耦:旧实现
outputChunks同时承担 binary-sniff + 最终 result decode,导致 sniff 完后必须强制清空outputChunks.length = 0;新实现拆出sniffChunks,outputChunks单纯做 capture-budget 内 buffer。逻辑边界更干净。 - PTY background-promote 路径同步限制:
shellExecutionService.ts:1911的 background-promote snapshot 也走appendOutputCaptureLimitNotice(与正常 exit 用同一辅助函数共 4 处调用::748、:868、:1567、:1911),进程被 promote 成后台时前 1MB 仍准确;JSDoc 把rawOutput说成 "bounded prefix buffer" 与运行时一致。 - streaming text mode 的处理细腻:
streamStdout: true时 live consumer 仍拿到全量 chunk(不会因为 cap 截断 UI 流),但rawOutput与 final resultoutput仍受 cap 约束。e2e 实测确认两路都对。 - debug warning 一次性:
outputCaptureLimitWarningEmitted守卫,避免日志洪泛。 - exact-boundary 不假阳性:
captured.length < data.length判断只在确实裁剪时 mark exceeded —— 避免 1MB 数据 + cap=1MB 时错误标记。单测 + PTY/child_process 真机各覆盖一遍。
风险与遗留
- ✅ 向后兼容:
maxBufferedOutputBytes是新增的可选字段,未配置时走 64MB 默认;现有调用零行为变化(除非命令真的产出 > 64MB,那本来就有崩溃风险)。 - ✅
rawOutput语义变更已显式记入文档:JSDoc 改为"bounded by maxBufferedOutputBytes, may contain only the retained prefix when the capture limit is exceeded",调用方需要全量 bytes 时应自行 streaming,符合 PR 的 risk note。 ⚠️ 仅 macOS arm64 真机验证(Node 22.17);Linux / Windows 走 GitHub Actions 已绿。shellExecutionService在 PTY 路径有平台分叉但本 PR 改动是 buffer accounting,不涉及 syscall。⚠️ background-promote 的 capture-limit 处理走的是和正常 exit 同一套appendOutputCaptureLimitNotice,但本机未单独构造 promote 触发的 e2e(单测 + 代码 review 已覆盖)。- ✅ 没有触及
truncateToolOutputThreshold:PR 明确把"上层 tool-output truncation"和"底层 buffer cap"区分开,不重叠。
验证环境:macOS Darwin 25.4.0 arm64,Node v22.17.0;tmux 会话 pr4524;merged worktree /private/tmp/pr4524-merged(branch pr-4524-test = PR head + git merge origin/main,merge commit d60603000);driver 脚本 realio-shell-bound.mjs 17 assertion + memory-headroom.mjs 4 assertion 全绿。— wenshao
|
Hi @Jerry2003826, thanks for the fix direction — bounding foreground shell output is definitely needed for long-running sessions. To help us move this forward, could you add a concrete reproduction case? Specifically:
The code itself looks reasonable, but without repro evidence we can't validate the threshold or confirm it doesn't regress normal use cases (e.g., a legitimate 2 MB Thanks! |
|
Updated the PR description with the requested reviewer evidence:
No code changes were needed for this update; existing CI and the maintainer verification remain applicable. |
BZ-D
left a comment
There was a problem hiding this comment.
LGTM. Output capture bounding logic is clean — sniff/capture separation is correct, streaming vs buffered paths are well-handled, and the appendOutputCaptureLimitNotice helper keeps the notice consistent across PTY, child_process, and background-promote paths. Tests cover boundary conditions (exact limit, invalid config values, streaming mode) thoroughly.
| disableDynamicLineTrimming?: boolean; | ||
| } | ||
|
|
||
| function getMaxBufferedOutputBytes(config: ShellExecutionConfig): number { |
There was a problem hiding this comment.
[Suggestion] getMaxBufferedOutputBytes validates that the value is a positive finite number but imposes no upper ceiling. A user (or a corrupted settings file) setting maxBufferedOutputBytes to a value near Node.js's buffer.constants.MAX_LENGTH (~2 GB) would cause Buffer.concat(outputChunks) to throw at finalization, and values in the hundreds of MB could trigger V8 string-length overflow in the child_process path's stdout += decodedChunk concatenation. Since the PR's stated purpose is bounding memory, the configurable limit itself can defeat that purpose when misconfigured.
Consider clamping to a safe ceiling:
const SAFE_CEILING = 256 * 1024 * 1024; // 256 MB
return Math.min(Math.floor(configured), SAFE_CEILING);— qwen-latest-series-invite-beta-v38 via Qwen Code /review
There was a problem hiding this comment.
Covered in the current branch. getMaxBufferedOutputBytes() now clamps positive finite values to MAX_BUFFERED_OUTPUT_BYTES_CEILING (256 MiB), so pathological user settings cannot request multi-GB retained buffers. Verified with the targeted shellExecutionService test, Prettier, and ESLint.
| // Ignore fallback rendering errors and resolve with empty text. | ||
| } | ||
| } | ||
| fullOutput = appendOutputCaptureLimitNotice( |
There was a problem hiding this comment.
[Suggestion] In the PTY finalize path, when the primary replayTerminalOutput (line 1552) throws, the catch fallback at line 1562 calls serializeTerminalToText(headlessTerminal). The headless terminal received every byte via headlessTerminal.write(data) regardless of the capture limit, so this fallback produces unbounded output — yet the appendOutputCaptureLimitNotice call right below still appends a notice saying "captured the first 64 MB ... and discarded the rest." The notice becomes misleading when the output actually contains more data.
The same pattern exists in the background-promote drain path (line 1891 fallback + line 1908 notice).
Consider decoding finalBuffer (the bounded buffer) in the fallback instead of serializing the full headless terminal:
} catch {
try {
const fallbackEncoding = getCachedEncodingForBuffer(finalBuffer);
fullOutput = new TextDecoder(fallbackEncoding).decode(finalBuffer);
} catch {
// Ignore fallback rendering errors and resolve with empty text.
}
}— qwen-latest-series-invite-beta-v38 via Qwen Code /review
There was a problem hiding this comment.
Covered in the current branch. The PTY replay fallback no longer serializes the full headlessTerminal; it falls back to decoding the bounded captured finalBuffer, then appends the capture-limit notice. Added keeps PTY replay fallback bounded after the capture limit is exceeded. Verified with the targeted shellExecutionService test, Prettier, and ESLint.
9910ff4 to
16f2e34
Compare
wenshao
left a comment
There was a problem hiding this comment.
No issues found. LGTM! ✅ — qwen-latest-series-invite-beta-v38 via Qwen Code /review
BZ-D
left a comment
There was a problem hiding this comment.
Reviewed the diff and ran the targeted shell execution tests; no blocking issues found.
|
Resolved the merge conflict with latest Conflict resolution:
Validation run locally:
Note: after merging latest |
|
Cleaned up the remaining garbled details block in the PR description. The requested reproduction steps, before/after memory evidence, cap rationale, and OS table remain in the Reviewer Test Plan. Code unchanged; current CI is green. |
DragonnZhang
left a comment
There was a problem hiding this comment.
Reviewed the full diff (3 files, +407/-30) and the prior review thread.
Summary: Bounds foreground shell output retained in memory via maxBufferedOutputBytes (default 64 MB, ceiling 256 MB). The process is still fully drained and streaming consumers still receive the complete output; only the final retained rawOutput / decoded result is bounded. A capture-limit notice is appended when truncation occurs.
Assessment:
- Bounded capture is correctly applied across all exit paths: normal child_process exit, normal PTY exit, PTY background-promote snapshot, and PTY replay fallback.
- Sniff/capture separation is clean —
sniffChunksfor binary detection,outputChunksfor bounded capture. The old shared-array bug (recomputingsniffedBytesfromslice(0, 20)on every call) is fixed as a side-effect. getMaxBufferedOutputBytesvalidates input thoroughly: rejects 0, negative, NaN, Infinity, non-numbers; clamps to 256 MB ceiling.- One-time
debugLogger.warn()fires when the capture limit is first exceeded — good for observability. - Test coverage is comprehensive: exact-boundary behavior, invalid config fallback (6 values), empty stripped-output, streaming + capture-limit, PTY replay fallback, and child_process streaming byte accounting.
- All 7 prior inline review comments are addressed in this revision.
- Deterministic checks (tsc, eslint): 0 findings attributed to this PR's changes.
No blocking issues found.
— qwen-code via Qwen Code /review
| const configured = config.maxBufferedOutputBytes; | ||
| return typeof configured === 'number' && | ||
| Number.isFinite(configured) && | ||
| configured > 0 |
There was a problem hiding this comment.
[Suggestion] A fractional value like 0.5 passes the > 0 guard but Math.floor(0.5) produces 0, making captureOutputData immediately return null on every chunk — all output is silently discarded. This is inconsistent with 0 itself, which correctly falls back to the 64 MB default.
Floor first, then check positivity:
| configured > 0 | |
| function getMaxBufferedOutputBytes(config: ShellExecutionConfig): number { | |
| const configured = config.maxBufferedOutputBytes; | |
| const floored = | |
| typeof configured === 'number' && Number.isFinite(configured) | |
| ? Math.floor(configured) | |
| : 0; | |
| return floored > 0 | |
| ? Math.min(floored, MAX_BUFFERED_OUTPUT_BYTES_CEILING) | |
| : DEFAULT_MAX_BUFFERED_OUTPUT_BYTES; | |
| } |
— qwen3.7-plus via Qwen Code /review
There was a problem hiding this comment.
Handled in the current branch. getMaxBufferedOutputBytes() now floors finite numeric input before checking positivity, so fractional values below 1 fall back to the default capture limit instead of becoming a zero-byte limit.
Added 0.5 to the invalid maxBufferedOutputBytes regression cases.
Validated:
npm run test --workspace=packages/core -- src/services/shellExecutionService.test.ts -t "invalid maxBufferedOutputBytes"npm run test --workspace=packages/core -- src/services/shellExecutionService.test.tsnpx prettier --check packages/core/src/services/shellExecutionService.ts packages/core/src/services/shellExecutionService.test.tsnpx eslint packages/core/src/services/shellExecutionService.ts packages/core/src/services/shellExecutionService.test.ts
…ground-stdout # Conflicts: # packages/core/src/services/shellExecutionService.ts
1f03e74 to
a8cc894
Compare
DragonnZhang
left a comment
There was a problem hiding this comment.
Reviewed the full diff (3 files, +417/-30) and prior review thread.
Summary: Bounds foreground shell output retained in memory via maxBufferedOutputBytes (default 64 MB, ceiling 256 MB). The process is still fully drained and live streaming is unaffected; only the final retained rawOutput / decoded result is capped.
What looks good:
- Clean separation of sniff and capture buffers, so binary detection and bounded capture are independent concerns.
getMaxBufferedOutputByteshandles all invalid-input edge cases (0, negative, NaN, Infinity, fractional, non-number, undefined) with a single coherent validation path.- Capture-limit notice is applied consistently across all four resolution paths: child_process normal exit, child_process background-promote, PTY normal exit, and PTY background-promote.
- PTY replay fallback now decodes the bounded captured buffer instead of serializing the full headless terminal, closing the last unbounded path.
- One-time
debugLogger.warn()on capture-limit breach leaves a diagnostic trail without log spam. - Test coverage is thorough: exact-boundary, invalid-config parametrized, streaming with capture limit, empty-stripped-output, and replay-fallback scenarios.
All prior review suggestions have been addressed. CI green (24/24). LGTM.
— qwen3-coder via Qwen Code /review
|
Added the concrete reproduction details to the Reviewer Test Plan in the PR description. It now includes:
|
What this PR does
Bounds foreground shell output retained in memory and adds a capture-limit notice when the retained output is truncated.
Why it's needed
Very large foreground stdout/stderr can consume excessive memory and make the session unstable if retained unbounded. This is separate from higher-level tool-output truncation: the process still drains and streamed output still reaches live consumers, but the final retained
rawOutput/ decoded result is bounded.Reviewer Test Plan
How to verify
Run the regression checks:
Minimal manual reproduction on macOS/Linux:
Equivalent cross-platform payload for direct shell-service/manual testing:
node -e "process.stdout.write('a'.repeat(50 * 1024 * 1024))"On Windows, monitor node RSS with:
Evidence (Before & After)
Before this PR, foreground output capture appended every stdout/stderr chunk into retained output buffers. A command emitting 50 MB retained roughly 50 MB for final
rawOutput/ decoded result; longer streams continued growing toward V8 string/memory limits.After this PR, retained final output is bounded by
maxBufferedOutputBytes, while the child process is still fully drained and live streaming behavior remains unchanged.Maintainer real-I/O verification on the PR branch used a 200 MB stdout stream with
maxBufferedOutputBytes = 1 MB:rawOutput.length === 1,048,576bytesThe 2 MB normal-use case is not regressed: 2 MB is well below the 64 MB default cap, so it remains fully retained unless a user explicitly configures a smaller cap.
Cap value rationale
The default
maxBufferedOutputBytesis 64 MB because it is:git logto remain intact;Invalid values (
0, negative numbers,NaN,Infinity, non-numbers, orundefined) fall back to this default.Tested on
Environment (optional)
Local Windows/PowerShell checkout with repository npm workspaces. The memory-specific reproduction is numeric rather than TUI-screenshot based because the behavior is core shell-service retention, not a visible TUI layout change.
Risk & Scope
rawOutputJSDoc documents that it is a bounded prefix when the capture limit is exceeded.Linked Issues
Fixes #4364