perf(core): bound session-list metadata reads to head/tail 64KB; pool buffer; lazy message count#3897
Conversation
`listSessions` previously called `countSessionMessages` per file, which streamed the entire JSONL through `readline` to count unique user/assistant UUIDs. For a project with N sessions averaging M bytes each, every /resume open paid O(N · M) wall time before showing the picker — by far the dominant cost once a project accumulated many multi-MB sessions. This change: - Drops the per-file count from listSessions / findSessionsByTitle. `SessionListItem.messageCount` is now optional. Callers that need a count call the new public `SessionService.countSessionMessages (sessionId)` lazily — typically only when a SessionPreview panel is about to display the badge. - Pools a single 64KB tail-read buffer across the per-file metadata reads in listSessions / findSessionsByTitle, mirroring the pattern in claude-code's `enrichLogs`. The two helpers in `sessionStorageUtils` (`readLastJsonStringFieldSync` and `readLastJsonStringFieldsSync`) accept an optional caller-owned scratch buffer; one-off callers (rename, single-session lookup) pass nothing and keep the original alloc behaviour. - Updates SessionPicker's row metadata to omit the "N messages" segment when `messageCount` isn't available, keeping the visual layout intact. Tests: - Pin that listSessions does not populate messageCount (regression guard against silently re-introducing the per-file scan). - Smoke test the buffer-pool plumbing — same caller-owned buffer handed to two reads of different file sizes returns both correct values without state bleed. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Tightens the title-write / title-read invariant so the picker's
metadata read is bounded to a fixed 2 × 64KB per file, regardless
of session length, with no fallback that scales with file size.
Writer (`ChatRecordingService`):
- New `bytesSinceTitleAnchor` counter and `TITLE_REANCHOR_BYTES`
threshold (32 KB, half of LITE_READ_BUF_SIZE).
- `appendRecord` now updates the counter and, once it crosses the
threshold while a title is set, re-appends a fresh `custom_title`
record to EOF (`reanchorTitle`). The recursive append routes
back through the same tracking path, which sees the title
record and resets the counter to zero.
- Sessions that never set a title pay zero overhead — the early
return in `updateTitleAnchorTracking` short-circuits.
Reader (`sessionStorageUtils`):
- `readLastJsonStringFieldSync` / `readLastJsonStringFieldsSync`
now read the file's last 64 KB, fall back to the first 64 KB
on miss, and return `undefined` if neither contains the field.
The previous Phase-2 streaming full-file scan (capped at 64 MB)
is removed entirely.
- The pooled scratch buffer (already optional) now backs both the
tail and head reads — only one allocation per `listSessions`
page even with the head fallback firing.
- `MAX_FULL_SCAN_BYTES` constant deleted (unused).
The two changes are coupled: the reader's tighter bound only works
because the writer guarantees the title stays in tail. A title
buried mid-file is now intentionally `undefined` (the picker
falls back to `firstPrompt` for display) rather than triggering a
full-file scan that would freeze the UI on long agent transcripts.
Tests:
- `chatRecordingService.customTitle.test.ts` — three scenarios
pinning the re-anchor invariant: threshold trigger, no-spurious
on no-title sessions, no-trigger on small write bursts.
- `sessionStorageUtils.test.ts` — replaces Phase-2 tests with
head-window fallback tests; pins the new "buried beyond both
windows returns undefined" contract; updates buffer-pool reuse
test to cover both tail and head reads with the same scratch.
E2E coverage (35/35 scenarios on a separate harness against real
fs): R1–R10 reader contract, M1–M4 multi-field, W1–W7 writer
re-anchor, L1–L6 listing latency + regression pins, S1 stress
(200 × 3 MB), C1 concurrency, I1–I5 writer/reader integration.
Measured 2.6× speedup on listSessions(50) over 50 × 4 MB
sessions vs the legacy `countSessionMessages` baseline; speedup
scales linearly with average file size.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
wenshao
left a comment
There was a problem hiding this comment.
Additional findings (not mapped to specific diff lines):
- [Critical]
packages/core/src/services/sessionService.ts: New public APIcountSessionMessages(sessionId)has zero test coverage — no tests for valid sessionId returning correct count, invalid sessionId returning 0, or file-not-found graceful degradation. - [Suggestion]
packages/core/src/utils/sessionStorageUtils.test.ts:readLastJsonStringFieldsSyncgained ascratchBufferparameter but lacks a corresponding buffer-reuse test (the single-field variant has one). - [Suggestion]
packages/core/src/services/chatRecordingService.customTitle.test.ts: The falsytitleSourcebranch inreanchorTitleis untested — only the'manual'path is covered.
— deepseek-v4-pro via Qwen Code /review
- countSessionMessages: valid id counts unique user/assistant uuids and ignores other types/malformed lines; invalid id short-circuits without filesystem access; ENOENT degrades to 0 instead of bubbling. - readLastJsonStringFieldsSync: mirror the single-field variant's scratch-buffer reuse test across a tail-hit then head-fallback to catch any decode that ignores bytesRead. - ChatRecordingService re-anchor: legacy resumed session (source undefined) must omit titleSource on threshold-triggered re-anchor, not silently reclassify as 'manual'. Co-authored-by: Qwen-Coder <noreply@alibabacloud.com>
`String.length` undercounts the on-disk size of multi-byte payloads (CJK, emoji are 1 UTF-16 unit but 3 UTF-8 bytes), so a session full of CJK content could push >96KB of writes past the last anchor before the 32KB threshold thinks it has — silently drifting the title past the 64KB tail window the picker scans. Switch the byte counter to `Buffer.byteLength(..., 'utf8')` for parity with `jsonl.writeLine`. Also reset `bytesSinceTitleAnchor` to 0 in the reanchor catch: without it, a failing reanchor pins the counter at the threshold and turns a single transient I/O fault into a per-record retry storm. One missed anchor is the right tradeoff — finalize() will re-emit on the next lifecycle event. Co-authored-by: Qwen-Coder <noreply@alibabacloud.com>
The 64KB head-window fallback in `readLastJsonStringFieldSync` / `readLastJsonStringFieldsSync` reads a fixed slice and hands it straight to the extractor — its trailing bytes can fall mid-record. A partial line whose `customTitle` value happens to close inside the buffer but whose body extends past 64KB would otherwise win the latest-match race and surface as the picker's title. Drop everything past the last `\n` before extracting (only when the buffer is shorter than the file — a small file is necessarily whole- line). Honors the original Phase-2 contract that only complete lines get a vote, without paying for the deleted full-file scan. Co-authored-by: Qwen-Coder <noreply@alibabacloud.com>
…ession-list-perf
wenshao
left a comment
There was a problem hiding this comment.
Follow-up after the three fix commits (2f72b925 + e1f502e2 + f659cd61).
Prior findings — verified fixed
| Prior concern | Fix | How verified |
|---|---|---|
[Critical] UTF-16 vs UTF-8 byte count in bytesSinceTitleAnchor |
chatRecordingService.ts:754 now uses Buffer.byteLength(JSON.stringify(record), 'utf8') + 1 |
New CJK test ('汉'.repeat(1500) × 12) — under threshold by UTF-16, over by UTF-8, anchor fires only when bytes are counted right |
[Critical] reanchorTitle failure pinned counter at threshold → retry storm |
chatRecordingService.ts:789 resets bytesSinceTitleAnchor = 0 in catch |
New test injects a sync throw on every custom_title appendRecord and asserts exactly one failed re-anchor across 25 bulk records |
| [Suggestion] Head window could pick up a partial straddling record | sessionStorageUtils.ts:314-317 and :399-402 truncate rawHead to last newline before extraction |
New "phantom partial trailing line" test (record begins inside the head window, closing quote past 64KB) — without the trim, phantom would win over complete |
[Critical] countSessionMessages had zero coverage |
New describe('countSessionMessages', ...) block in sessionService.test.ts |
Three cases: counts uniques + ignores non-message types + tolerates malformed lines, rejects non-UUID input without touching disk, returns 0 on ENOENT |
[Suggestion] scratchBuffer reuse not pinned |
New sentinel-byte reuse tests in both readLastJsonStringFieldSync and readLastJsonStringFieldsSync blocks |
Pre-fill buffer with 0x55, drive tail-hit → head-fallback on different file sizes; a buggy decode that read past bytesRead would corrupt the result |
[Suggestion] Falsy titleSource re-anchor branch untested |
New "omits titleSource on re-anchor when source is unknown" test | Asserts titleSource key is absent (not present-and-undefined) on the re-anchored record |
Cross-cutting checks
- The recursive
reanchorTitle → appendRecord → updateTitleAnchorTracking → counter=0reset path closes correctly on both success and sync-throw paths. writeChainserialization preserves on-disk record ordering even when re-anchor is queued mid-turn.- Constructor's resume path (
finalize()at:554) re-anchors the title on every session load, so process restarts can't leave the counter in a stale state — closes the "buried-on-resume" worry implicit in the original design. custom_titlerecords becoming intermediate parents in the parentUuid chain is consistent with the pre-existingfinalizepattern, andreconstructHistoryalready filters them out via therecord.message === undefinedpath atsessionService.ts:1142and the explicitsystemskip at:1125(compressed-history branch).
One tiny nit inline (purely optional). LGTM otherwise.
…ession-list-perf # Conflicts: # packages/core/src/services/sessionService.ts
…, pin perf contract Address QwenLM#3897 follow-up review findings: - SessionPreview footer no longer drops the message count when listSessions omits it. The prop is the override path; default falls back to a unique user/assistant uuid count derived from the already-loaded conversation, matching countSessionMessages semantics with zero extra disk I/O. - countSessionMessages now scopes to the current project, mirroring the first-record cwd check in deleteSession/renameSession/loadSession. A valid sessionId from another project sharing the chats dir no longer bypasses project boundaries on lazy count. - New regression tests: - SessionPicker row renders cleanly with messageCount === undefined (no "messages"/"undefined"/dangling separator) - findSessionsByTitle perf contract: matches have messageCount undefined and fs.createReadStream is never called - countSessionMessages: cross-project sessions return 0 without streaming, empty file returns 0 - Update corruption-recovery tests to call private countSessionMessagesFromPath (the streaming entry point) instead of the public sessionId-shaped API the merge from main pointed them at. Co-Authored-By: Qwen-Coder <noreply@qwen.com>
|
Follow-up after Findings — addressed
Cross-cutting
The optional |
wenshao
left a comment
There was a problem hiding this comment.
[Suggestion] Two additional stale comments reference removed behavior:
-
sessionStorageUtils.ts:226— Section header says// File I/O — tail-first scan with full-file fallback, but the full-file fallback was replaced by a head-window. Should say// File I/O — tail-first scan with head-window fallback. -
sessionService.ts:180— JSDoc onreadSessionTitleFromFilesays"falls back to a full-file scan when the tail has no match", but the underlying function now uses a bounded head window, not a full-file scan.
[Suggestion] ACP renameSession handler bypasses ChatRecordingService. The ACP agent's rename path (acpAgent.ts:~494) creates a new SessionService and writes the title directly via jsonl.writeLineSync. If a ChatRecordingService is actively recording the same session, its currentCustomTitle is never updated, so the new re-anchor invariant re-emits the old title every 32KB of writes, silently reverting the rename. Other ACP handlers correctly look up the live session; the rename handler should route through ChatRecordingService.recordCustomTitle when a live session exists.
— glm-5.1 via Qwen Code /review
| // per-file metadata read. Without pooling, each session in the | ||
| // page allocs+GCs a fresh 64KB Buffer; for a typical page of 20 | ||
| // that's ~1.3MB churn per /resume open. Phase-2 (full-file | ||
| // fallback) inside the helper still allocs per chunk because |
There was a problem hiding this comment.
[Suggestion] This comment references Phase-2 (full-file fallback) and Phase-2 is rarely entered, but Phase-2 was replaced by a single head-window read in this PR. The helper now does one bounded read with the shared scratchBuffer, not chunk-based streaming. The last two lines of this comment should be removed.
— glm-5.1 via Qwen Code /review
The session metadata reader uses two bounded 64KB windows (tail + head) since the perf rework on this branch — never a full-file scan. Two comments still described the prior behavior. Reported in MR review. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
ACP renameSession constructed a fresh SessionService and wrote the custom_title record straight to disk. When the same session was live in this process, ChatRecordingService.currentCustomTitle stayed at the old value, so the next title re-anchor (every 32KB) or finalize() re-emitted the stale title at EOF and silently reverted the rename. Now we look up the live ChatRecordingService first and route through recordCustomTitle, which keeps the in-memory cache and the on-disk record in sync. The SessionService path remains for the non-live case (e.g., another client renaming a backgrounded session). Reported in MR review. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
wenshao
left a comment
There was a problem hiding this comment.
Additional findings from second-opinion review (glm-5.1):
-
[Suggestion]
sessionStorageUtils.test.ts— The multi-field variantreadLastJsonStringFieldsSynclacks a "buried beyond both head and tail windows" test. The single-field variant has an explicit anti-test that constructs a file >2×LITE_READ_BUF_SIZEwith the target in the middle, assertingundefined. The multi-field variant has the same head/tail windowing logic but thereturn emptyResultpath is untested. A regression in the multi-field head-window logic could silently break title+source extraction with no test failing. -
[Note — false positive in open comment (acpAgent.ts:500)] The open Critical comment claims
this.sessions.get(sessionId)?.getConfig().getChatRecordingService()throwsTypeErrorwhen the session is absent. This is incorrect — JavaScript's?.short-circuits the entire remaining chain, soundefined?.getConfig().getChatRecordingService()evaluates toundefinedwithout error. Verified by two independent review agents. No code change needed; consider resolving the comment.
| * Two bounded windows, never a full-file scan: | ||
| * 1. Scan the last LITE_READ_BUF_SIZE bytes of the file. This is the | ||
| * common path because `ChatRecordingService` re-anchors metadata | ||
| * records to EOF every {@link TITLE_REANCHOR_BYTES} (≤ tail-window |
There was a problem hiding this comment.
[Suggestion] {@link TITLE_REANCHOR_BYTES} references a constant defined in chatRecordingService.ts (a different package). The {@link} won't resolve — IDE hover/intellisense can't navigate to it. Replace with an inline description like "every 32KB (TITLE_REANCHOR_BYTES)".
— glm-5.1 via Qwen Code /review
| * | ||
| * Without this, a long agentic turn that streams >64KB of tool | ||
| * output could push the only `custom_title` record past the 64KB | ||
| * tail window, forcing the picker into a full-file fallback. |
There was a problem hiding this comment.
[Suggestion] This JSDoc says "forcing the picker into a full-file fallback," but the full-file fallback was removed by this PR — the picker now does a bounded head-window read (64KB from offset 0) or returns undefined. The stale wording could mislead a future reader into thinking full-file scanning still exists.
| * tail window, forcing the picker into a full-file fallback. | |
| forcing the picker into a head-window fallback (or returning undefined if the title is beyond both windows). |
— glm-5.1 via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
Approving against HEAD 3ca0829a.
6 prior [Critical] threads — all resolved:
- 4 fixed: UTF-8 byte counting (
e1f502e2), reanchor failure reset (e1f502e2),lastRecordUuidpreserved across reanchor (96ed42cb), preview message count via SessionPreview in-memory compute (2b7b656f). - 1 documented as bounded acceptable cost:
statSync/readSyncrace +readLatestTailIfGrownone-shot recovery, JSDoc spells out the trade-off. - 1 retracted: my mis-read of optional-chain short-circuit semantics on
acpAgent.ts:500(Session.getConfig() is non-nullable, so the chain is safe as written).
11 [Suggestion] threads — all resolved with explicit dispositions:
- 2 fixed: head-window line-truncation (
f659cd61), project-scope check oncountSessionMessages(2b7b656f). - 3 author trade-off accepted: defensive
: rawHeadternary, double-stringify cost, 64KB literal in TITLE_REANCHOR_BYTES comment. - 6 deferred to follow-up: 4 test-coverage gaps (picker undefined-messageCount render, findSessionsByTitle perf contract, ACP recording-disabled fallback, multi-field bounded read tail-coverage edge), async-write outcome surfacing in
recordCustomTitle, stalefileSizeguard at line 334 for the rare <64KB→>128KB growth race.
CI green on all four platforms. Ready to land.
Summary
/resumeopen time scales with total disk bytes today, dominated by per-filereadlineto count messages and an unbounded Phase-2 fallback when title metadata isn't in the tail window. This PR makes the read path fixed-cost in file size for both common and worst case:listSessionsandfindSessionsByTitle.SessionListItem.messageCountis now optional; a new publicSessionService.countSessionMessages(sessionId)lets the SessionPreview panel get an exact count lazily.enrichLogspattern, eliminates ~1.3 MB of per-/resumeBuffer churn.tail 64KB → head 64KB → undefined. The previous full-file scan (capped at 64 MB) is gone.ChatRecordingServiceso the title stays in the tail window between lifecycle events. Every 32 KB of non-title content triggers a freshcustom_titleappend at EOF; sessions that never set a title pay zero overhead.The reader's tighter bound and the writer's re-anchor are coupled — together they replace "scan as much as it takes" with "scan a fixed 128 KB; the writer maintains the invariant." A title that somehow ends up buried mid-file (legacy data, crashed session) returns
undefinedand the picker falls back to first-prompt for display, instead of freezing the UI on a multi-MB scan.Why this matters
For users with hundreds of sessions or large agent transcripts,
/resumewas visibly stalled at "Loading sessions…" for seconds. The full-file readline was the only step that scaled with file size rather than file count. After this PR, listing latency depends only on session count and is bounded per file.Compatibility
SessionListItem.messageCountbecomes?: number. CLI / VSCode-companion / WebUI consumers already handle the optional case (?? 0,?? messages.length,typeof === 'number'); only the row badge inSessionPickerneeded a guard, included here.countSessionMessages(sessionId)keeps the door open for any future preview-style consumer that wants the precise count without re-deriving it.custom_titlerecord +titleSourcediscriminator and both ride the re-anchor invariant — the picker's "dim auto titles" affordance stays intact across long sessions.What stays out
Test plan
Unit
tsc --noEmitclean for bothpackages/coreandpackages/clisessionService.test.ts,sessionService.rename.test.ts,sessionStorageUtils.test.ts,chatRecordingService.test.ts,chatRecordingService.customTitle.test.ts,chatRecordingService.autoTitle.test.tsmessageCount, head-window fallback, "buried beyond both windows returns undefined", scratch buffer reuse across tail+head reads, three re-anchor invariant scenariosE2E
A standalone harness against real
node:fswrites covers every contract this PR introduces:titleSourcepreservationmessageCount=undefined,findSessionsByTitle(second use site)listSessions)loadSessionend-to-endHeadline numbers
listSessions(50)on 50 × 4 MB sessionscountSessionMessagesbaselinefindSessionsByTitlesingle matchlistSessionsThe 2.6× speedup is measured on 4 MB files. The gap scales linearly with average file size — the new path reads a fixed 128 KB per file regardless. Estimated speedups by session size: ~10× at 20 MB, ~25× at 50 MB.