feat(core): auto-dump memory diagnostics to disk on pressure detection #4651
Closed
Labels
category/performance, roadmap/context-performance, scope/memory-usage, status/ready-for-agent, type/feature-request
Parent
Part of #3000 (Memory Diagnostics roadmap). Supersedes #4181 + #4182 + #4183 for the "diagnosable after a crash" scenario.
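Problem scenario
Today, when a user hits an OOM crash, the only information a maintainer can get is the output of /doctor memory, which the user has to run manually. Once the process has crashed, that command can no longer be run. The actual bug-report flow:
FATAL ERROR: Reached heap limit
Expected behavior
When the runtime detects memory pressure at the hard/critical level, it automatically writes a diagnostics snapshot to disk. Even if the process crashes afterwards, the user can attach this file to the bug report.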
Design reference
Modeled on the design of Claude Code's heapDumpService.ts.
Concrete plan
1. Automatically write diagnostics to disk (core)
When the MemoryPressureMonitor from #4403 detects hard/critical pressure:
- collectMemoryDiagnostics() collects the data
- the result is written to .qwen/diagnostics/memory-{sessionId}-{timestamp}.json
Output example:
{
  "timestamp": "2026-05-31T12:00:00Z",
  "sessionId": "abc-123",
  "trigger": "hard",
  "memory": {
    "rss": 3200000000,
    "heapUsed": 2800000000,
    "heapLimit": 4096000000
  },
  "session": {
    "historyEntries": 6401,
    "estimatedHistoryBytes": 157000000
  },
  "risks": ["heap_used_ratio_high", "large_session_history"],
  "suggestion": "Consider running /compress or restarting with a fresh session"
}
2. TUI memory warning (auxiliary)
Show a memory warning in the status bar / notification area, so the user has a chance to act before a crash.
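The auto-dump flow from section 1 can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `PressureLevel` type (including the `soft` level), the `DiagnosticsSnapshot` shape, `collectSnapshot()` and `dumpDiagnostics()` are hypothetical stand-ins for the real `collectMemoryDiagnostics()` and `MemoryPressureMonitor`, and resolving `.qwen/diagnostics/` under the home directory by default is an assumption.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical pressure levels; the issue only names "hard" and "critical".
type PressureLevel = 'soft' | 'hard' | 'critical';

// Hypothetical, reduced snapshot shape (the real diagnostics carry more fields).
interface DiagnosticsSnapshot {
  timestamp: string;
  sessionId: string;
  trigger: PressureLevel;
  memory: { rss: number; heapUsed: number };
}

// Stand-in for collectMemoryDiagnostics(); only reads process.memoryUsage().
function collectSnapshot(sessionId: string, trigger: PressureLevel): DiagnosticsSnapshot {
  const { rss, heapUsed } = process.memoryUsage();
  return { timestamp: new Date().toISOString(), sessionId, trigger, memory: { rss, heapUsed } };
}

// Writes the snapshot synchronously so the file is complete before the call returns.
// Returns the file path, or undefined when the level is below "hard".
function dumpDiagnostics(
  sessionId: string,
  trigger: PressureLevel,
  baseDir: string = os.homedir(), // assumption: the issue does not say where .qwen lives
): string | undefined {
  if (trigger !== 'hard' && trigger !== 'critical') return undefined;
  const snapshot = collectSnapshot(sessionId, trigger);
  const dir = path.join(baseDir, '.qwen', 'diagnostics');
  fs.mkdirSync(dir, { recursive: true });
  // Replace ":" and "." so the timestamp is a valid file name on every platform.
  const stamp = snapshot.timestamp.replace(/[:.]/g, '-');
  const file = path.join(dir, `memory-${sessionId}-${stamp}.json`);
  fs.writeFileSync(file, JSON.stringify(snapshot, null, 2));
  return file;
}
```

Returning the path lets a caller such as the TUI warning mention the file location to the user; the synchronous write is chosen so that nothing is left pending if the process dies shortly afterwards.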
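3. Optional heap snapshot (advanced)
Under critical pressure, if Node was started with --expose-gc, optionally trigger v8.writeHeapSnapshot(). This runs after the diagnostics JSON has been written, so even if the snapshot itself runs out of memory, the JSON is already on disk.
Effort estimate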
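Core reason: the heavy lifting is already done (collectMemoryDiagnostics is 450 lines; the MemoryPressureMonitor from #4403 is 605 lines). What remains is wiring them together and adding the output.
Acceptance criteria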
.qwen/diagnostics/
Related
collectMemoryDiagnostics() in packages/core/src/utils/memoryDiagnostics.ts
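For reference, the ordering described in section 3 (JSON first, heap snapshot second) might look like the sketch below. `dumpOnCritical()`, its parameters, and the fixed file names are hypothetical and chosen for illustration only; `v8.writeHeapSnapshot()` is the Node.js API named in the issue, and checking for a global `gc` function is one possible way to detect `--expose-gc`.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import * as v8 from 'node:v8';

// Hypothetical helper: always writes the JSON, then optionally a heap snapshot.
function dumpOnCritical(
  dir: string,
  diagnostics: object,
  takeSnapshot: boolean,
): { json: string; heap?: string } {
  fs.mkdirSync(dir, { recursive: true });
  const json = path.join(dir, 'memory-diagnostics.json');
  // Step 1: the small JSON file is fully written before the snapshot starts.
  fs.writeFileSync(json, JSON.stringify(diagnostics, null, 2));
  if (!takeSnapshot) return { json };
  try {
    // Step 2: the snapshot is expensive and may need a lot of extra memory.
    const heap = v8.writeHeapSnapshot(path.join(dir, 'memory.heapsnapshot'));
    return { json, heap };
  } catch {
    // Only ordinary errors (for example a full disk) land here. A real OOM
    // aborts the process, which is why the JSON is written first.
    return { json };
  }
}

// With --expose-gc, Node defines a global gc() function; the issue uses that
// flag as the opt-in for taking a snapshot.
const gcExposed = typeof (globalThis as { gc?: unknown }).gc === 'function';
```

A caller could pass `gcExposed` as `takeSnapshot`, so that users who did not opt in only ever get the lightweight JSON file.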