fix(memory): prevent unbounded growth + dashboard i18n fixes#1605
fix(memory): prevent unbounded growth + dashboard i18n fixes#1605Bernardxu123 wants to merge 3 commits into
Conversation
0bcf931 to
b7dd25c
Compare
|
CI 失败原因是 dashboard i18n 键缺失(TS2345),这是 PR #1418 原有的问题,与本次内存修复无关。 本次修复仅涉及:
本地验证:npm run lint ✅ | npm run typecheck ✅ dashboard i18n 问题需要单独修复(补充缺失的翻译键到 dashboard/src/i18n/en.ts)。 |
b7dd25c to
f190648
Compare
…SessionStats, and card state Four targeted fixes for memory growth during long sessions: 1. JobRegistry: auto-evict oldest completed jobs when count > 50 2. AppendOnlyLog: auto-compact when entries > 500, keep recent 200 3. SessionStats.turns: rolling window of 200 turns, older costs aggregated into carryover 4. state.cards: trim oldest settled cards when count > 150; Terminal scrollback retains visual output, only React state freed Fixes esengine#1571
dd79050 to
01d0804
Compare
# Conflicts: # dashboard/src/i18n/en.ts
|
Thanks @Bernardxu123 — direction's right, AppendOnlyLog and SessionStats both land where we want. One real bug + one knob to tighten in JobRegistry before this is good to merge: Bug — Knob — Optional but tidier — once the cap drops to a small N, you can drop the 30s time throttle entirely and just gate on count inside Fix the init bug at minimum and I'll merge. |
…1217) The previous implementation set tab.switching=true unconditionally in session_load/new_chat handlers. When no turn was in flight (the common case), the flag stayed true and suppressed events for the first turn in the new session. Fix: only set switching=true when tab.aborter exists (live turn).
|
PR 描述已更新,详细说明了:
本地验证:lint ✅ | typecheck ✅ | test ✅ |
|
Thanks for the update — the memory-leak direction is right (AppendOnlyLog + SessionStats look good), but the branch picked up a problem since the last review and the JobRegistry blockers I flagged earlier aren't addressed yet. Three things before I can merge: 1. Branch silently reverts #1582 (the recently-merged session-switch fix) This PR forked before #1582 was merged. The branch independently added the } finally {
tab.aborter = null;
if (!tab.switching) {
// ... emit $turn_complete, emitSessions, emitBalance, $plan_cleared, QQ
}
tab.switching = false;
}The PR's finally block has none of that. The 3-way merge produces a clean tree (no conflict markers), but the semantic result is the Fix: 2. Same as the last round — the field initializer means the first 30s of process lifetime short-circuits every 3. Each retained completed job pins the 64KB output buffer + ChildProcess ref. 50 × 64KB ≈ 3.2MB held even when cleanup does fire. The model doesn't need 50 historical entries from Optional follow-up once #3 is in: with a small cap, the 30s throttle stops earning its keep — gate the cleanup on count inside Push a rebase + the two-line fix in |
|
Applied all your feedback in a clean branch (fix/memory-leaks-v2 → PR #1620):
|
|
Bumping this with urgency — the branch has drifted further since the last review and now silently reverts four PRs that landed in main in the last hour. Do not merge this as-is.
GH still says Example: the diff shows your branch removing What you need to do:
Heads up — main is moving fast this week; if the rebase takes more than a few hours, you'll likely need a second one. Push when ready and I'll review the diff again. |
|
Closing this in favor of #1620, which carries the JobRegistry portion in a clean form. Why I'm closing rather than waiting for a rebase: in its current state this branch silently reverts four merges that landed today (#1572, #1582, #1612, #1614) plus several older ones. The other two pieces of the original proposal still look good and are worth landing — please open them as separate PRs against current main:
Each one ≤ 20 lines and orthogonal to the other, so two small PRs are easier to review than a combined one. Thanks for sticking with this — splitting it up is the right shape. |
Memory Leak Fixes
Fixes #1571
Problem
Long-running sessions (especially in free mode) experience unbounded memory growth. Node.js process consumes 2GB+ and grows by ~2MB/second during task execution.
Root Cause Analysis
Four data structures grow without cleanup:
/compactsummary()callFixes
1. JobRegistry Self-Cleanup
Auto-evict oldest completed jobs when count > 50. Cleanup runs on job completion and list() calls. Retains 50 most recent completed jobs for list_jobs / job_output lookups.
2. AppendOnlyLog — Fold Trigger
In ContextManager.decideAfterUsage(), added log length check. When log exceeds 500 entries (~1MB memory), triggers existing fold mechanism. Fold compresses old messages into summary, keeping recent 200 entries.
3. SessionStats Rolling Window
Rolling window of 200 turns. Older turns dropped but costs/caches aggregated into carryover. Maintains accurate session totals while limiting memory.
4. state.cards + Static
Existing elideOldCardContent() already stubs heavy content for cards older than RECENT_CARDS_WINDOW (200). Terminal scrollback retains visual output after elision. No additional trimming needed.
Dashboard i18n Fixes
Added missing translation keys to dashboard/src/i18n/en.ts (merged from main).
Testing