fix(stats): dedup usage records by sessionId and skip in-progress writes#4995
Conversation
…tes (QwenLM#4994) Opening /stats during the first-ever turn followed by /clear (or exit) used to write the same sessionId twice into ~/.qwen/usage_record.jsonl, permanently inflating every aggregate (sessions / tokens / durations / tools / heatmap / projects) 2x for that session. Closes QwenLM#4994. Defense in depth: - Read side: loadUsageHistory dedups records by sessionId (last-wins), so any duplicates already on disk from this bug stop inflating future aggregates. - Write side: rebuildFromSessionJsonl skips the in-progress session when its sessionId is passed in; statsDataService threads currentSession.sessionId through. New duplicates are no longer created at the source. Regression coverage in usageHistoryService.test.ts mirrors the exact bug sequence (open /stats during first turn -> /clear -> re-open /stats) and asserts sessionCount=1, totalTokens=1600 for a single ~1.6k-token session.
|
@wenshao @DragonnZhang help cr please |
| } | ||
|
|
||
| export async function loadUsageHistory(): Promise<UsageSummaryRecord[]> { | ||
| function dedupBySessionId(records: UsageSummaryRecord[]): UsageSummaryRecord[] { |
There was a problem hiding this comment.
[Suggestion] dedupBySessionId silently discards duplicate sessionId records with no diagnostic trace. When duplicates are present (evidence of the #4994 bug having affected a user's data), there is no way to measure how many users were affected or detect recurrence.
| function dedupBySessionId(records: UsageSummaryRecord[]): UsageSummaryRecord[] { | |
| function dedupBySessionId(records: UsageSummaryRecord[]): UsageSummaryRecord[] { | |
| const map = new Map<string, UsageSummaryRecord>(); | |
| for (const r of records) map.set(r.sessionId, r); | |
| if (map.size < records.length) { | |
| debugLogger.debug( | |
| `dedupBySessionId: removed ${records.length - map.size} duplicate record(s)`, | |
| ); | |
| } | |
| return [...map.values()]; | |
| } |
— DeepSeek/deepseek-v4-pro via Qwen Code /review
There was a problem hiding this comment.
Adopted in a7c7206 — added a debugLogger.debug line that fires only when duplicates were actually removed:
if (map.size < records.length) {
debugLogger.debug(
`dedupBySessionId: removed ${records.length - map.size} duplicate record(s)`,
);
}This gives us a way to measure #4994's blast radius on real user data without changing aggregate behavior. Thanks for the catch.
| return [...map.values()]; | ||
| } | ||
|
|
||
| export async function loadUsageHistory( |
There was a problem hiding this comment.
[Suggestion] The new currentSessionId parameter in loadUsageHistory is silently ignored on the hot read path — it only takes effect in the rebuildFromSessionJsonl fallback (when usage_record.jsonl doesn't exist or has no v1 records). On the main path, the parameter is passed but the current session is still returned. loadStatsData handles its own filtering (line 251), so correctness is fine, but the API is misleading: a future caller passing currentSessionId would reasonably expect the session to be excluded from results regardless of file state.
Consider either: (a) applying the skip consistently on both paths inside loadUsageHistory and removing the redundant filter from loadStatsData, or (b) renaming the parameter to clarify its limited scope (e.g., skipSessionInRebuild).
— DeepSeek/deepseek-v4-pro via Qwen Code /review
There was a problem hiding this comment.
Fair point — the parameter was misleading. Went with renaming over consolidating, in commit a7c7206:
currentSessionId→skipSessionInRebuildin bothloadUsageHistoryandrebuildFromSessionJsonl- Zero behavior change, zero test-assertion change
I considered option (a) — apply the skip on both paths and drop the filter in loadStatsData — but kept (b) because:
loadStatsDatadoes defense-in-depth: it strips the persisted current-session record and pushes the live in-memorycurrentSession(with up-to-date metrics) back. Even ifloadUsageHistoryfiltered the current session out on the read path, the caller would still need to push the live record. So option (a) wouldn't actually let us delete the filter, only relocate it.- The existing read-side
dedupBySessionIdalready protects against a stale persisted current-session record being double-counted (last-wins). So the file-path skip in option (a) would be redundant with dedup. - Option (b) is a pure rename with no test-assertion diff and no public-API surface change.
Let me know if you'd still prefer (a) — happy to follow up.
…ip param Why: - dedupBySessionId silently dropped duplicate records, making it impossible to observe how many users were affected by the QwenLM#4994 bug. Now logs the removed count at debug level. - The new loadUsageHistory parameter was named currentSessionId but only controls the write-side skip during rebuildFromSessionJsonl — the read path ignores it. Renaming to skipSessionInRebuild makes the limited scope explicit; statsDataService still strips/re-pushes the live current session for defense-in-depth. PR review feedback on QwenLM#4995.
Real-Run Verification Report (PR #4995, head
|
| Build | Caller | File after ① | File after ② | ③ Aggregate |
|---|---|---|---|---|
origin/main @ 9b4ba60e7 |
legacy (no param) | 1 line | 2 lines, same sessionId |
🐛 sessions=2, tokens=3200 |
| PR | new statsDataService (passes session id) |
0 lines (write-side skip) | 1 line | ✅ sessions=1, tokens=1600 |
| PR | legacy (no param) | 1 line | 2 lines (dup written) | ✅ sessions=1, tokens=1600 (read-side last-wins dedup) |
The bug reproduces on main exactly as #4994 documents, and each defense layer holds independently: the write-side skip keeps the file clean for the new caller; the read-side dedup still yields correct aggregates even when a legacy caller lets the duplicate reach the file.
2. Real TUI E2E (tmux, mock model, fresh QWEN_HOME, identical flow per build)
Flow: one chat turn → /stats (first open → rebuild path) → Esc → /clear → /stats → Activity tab.
| Checkpoint | before (origin/main) |
after (PR) |
|---|---|---|
usage_record.jsonl after first /stats |
1 line (in-progress session written by rebuild) | 0 lines |
after /clear |
2 lines, byte-identical sessionId |
1 line |
/stats → Activity tab |
Sessions 2 … Tokens 6.4k |
Sessions 1 … Tokens 3.2k |
Same prompt, same mock usage — the dashboard itself shows the 2× inflation on main and the correct numbers on the PR build. (Per-session totals are 3.2k rather than 1.6k because each turn issues two completions against the mock — the chat response plus the auto-title call — each reporting 1600; the before/after relationship is the controlled variable.)
3. Existing-victim scenario (read-side fix for already-corrupted files)
Planted the real corrupted file from the before run (2 duplicate lines) into a fresh QWEN_HOME and opened /stats on the PR build: Activity shows Sessions 1 / Tokens 3.2k immediately, and the file is left untouched (still 2 lines — no rewrite), matching the PR's stated behavior for affected users.
Static checks
usageHistoryService.test.ts13/13 (10 pre-existing + 3 new regression tests, incl. the /stats permanently double-counts a session if /stats is opened during the first-ever turn (introduced by #4779) #4994 end-to-end case) · clistatsDataService+statsCommand+clearCommand45/45.- eslint clean on all changed files; repo
tsc --noEmitclean. - Upstream CI on this head: Lint + all three OS test jobs green.
- 30 commits behind
main,MERGEABLE, and nomaincommit since the merge-base touches either changed file — head is representative of the post-merge tree.
Notes (non-blocking)
- In
rebuildFromSessionJsonlthe skip applies only to the file write; the in-progress record is still returned to the caller, andloadStatsDatacontinues to filter the live session for display before re-adding its live snapshot — the layering is consistent. - The read-side dedup also de-duplicates inside the rebuild path itself (
dedupBySessionIdwraps both returns), so even pathological chat-dir layouts that produce same-id records across project dirs aggregate correctly.
Verdict
✅ Verified — recommend merge. The main bug reproduces deterministically and is fixed at both layers; the real dashboard goes from Sessions 2 / 6.4k to Sessions 1 / 3.2k on an identical flow; already-corrupted user files aggregate correctly with no rewrite; both review points from 06-11 are addressed on the head; tests, lint, typecheck, and full CI are green. No blockers from my side.
Repro
gh pr checkout 4995 && npm install && npm run build
git worktree add /tmp/before origin/main --detach # + npm install && npm run build
# standalone (issue #4994 steps): QWEN_HOME=<fresh> node repro.mjs <core-pkg> <main|skip|noskip>
# plants one in-progress chat JSONL (qwen-code.api_response uiEvent, 1600 tokens),
# then loadUsageHistory -> persistSessionUsage(same id) -> loadUsageHistory+aggregateUsage
# TUI E2E (tmux): mock OpenAI server with fixed usage {1000,600,1600};
# QWEN_HOME=<fresh> OPENAI_API_KEY/BASE_URL/MODEL -> one turn -> /stats -> /clear -> /stats(Tab)
# inspect $QWEN_HOME/usage_record.jsonl line count + sessionIds at each step
# victim: cp the before-run's corrupted usage_record.jsonl into a fresh QWEN_HOME,
# open /stats on the PR build -> Sessions 1, file still 2 lines中文版(完整翻译)
真实运行验证报告(PR #4995,head a7c7206)
在 macOS(Darwin 25.5.0,Node v22.22.2)上做了三层验证:针对构建产物 core dist 的 issue #4994 standalone 复现(before/after、三种调用方式)、tmux 中对确定性 mock 模型服务器的真实 TUI 端到端(固定 usage 数字,QWEN_HOME 完全隔离)、以及用真实 bug 产生的脏 usage_record.jsonl 做的存量受害用户场景。head 包含今早的 a7c7206(评审回应提交);我 2026-06-11 review 的两点都按回复所述落实——去重诊断日志(仅在确实移除重复时输出 debugLogger.debug)和两处签名的 currentSessionId → skipSessionInRebuild 改名。
1. Standalone 复现(issue #4994 步骤,构建产物 packages/core/dist)
同一脚本、每轮全新 QWEN_HOME;植入一个进行中的 chat JSONL(1600 token),然后:① loadUsageHistory(首次 /stats)、② 同一 id 的 persistSessionUsage(/clear/退出)、③ loadUsageHistory + aggregateUsage('all')。
| 构建 | 调用方式 | ① 后文件 | ② 后文件 | ③ 聚合 |
|---|---|---|---|---|
origin/main @ 9b4ba60e7 |
旧调用(无参数) | 1 行 | 2 行、同一 sessionId |
🐛 sessions=2, tokens=3200 |
| PR | 新 statsDataService(传入 session id) |
0 行(写侧跳过) | 1 行 | ✅ sessions=1, tokens=1600 |
| PR | 旧调用(无参数) | 1 行 | 2 行(写入了重复) | ✅ sessions=1, tokens=1600(读侧 last-wins 去重) |
bug 在 main 上按 #4994 的描述确定性复现;两层防御各自独立成立:写侧跳过让新调用方的文件保持干净;读侧去重保证即使旧调用方让重复进了文件,聚合仍然正确。
2. 真实 TUI 端到端(tmux、mock 模型、全新 QWEN_HOME、两构建同一流程)
流程:一轮对话 → /stats(首次打开 → rebuild 路径)→ Esc → /clear → /stats → Activity 标签页。
| 检查点 | before(origin/main) |
after(PR) |
|---|---|---|
首次 /stats 后的 usage_record.jsonl |
1 行(rebuild 写入进行中会话) | 0 行 |
/clear 后 |
2 行、sessionId 完全相同 |
1 行 |
/stats → Activity 页 |
Sessions 2 … Tokens 6.4k |
Sessions 1 … Tokens 3.2k |
同样的 prompt、同样的 mock usage——main 上仪表盘本身显示 2× 膨胀,PR 构建数字正确。(单会话总量是 3.2k 而非 1.6k,是因为每轮对 mock 发起两次补全——对话响应 + 自动标题调用——各报 1600;before/after 的相对关系才是受控变量。)
3. 存量受害用户场景(读侧修复已脏文件)
把 before 运行产生的真实脏文件(2 行重复)植入全新 QWEN_HOME,用 PR 构建打开 /stats:Activity 立即显示 Sessions 1 / Tokens 3.2k,且文件保持原样(仍 2 行——不重写),与 PR 对受影响用户的声明一致。
静态检查
usageHistoryService.test.ts13/13(10 个既有 + 3 个新回归测试,含 /stats permanently double-counts a session if /stats is opened during the first-ever turn (introduced by #4779) #4994 端到端用例)· clistatsDataService+statsCommand+clearCommand45/45。- 改动文件 eslint 干净;全仓
tsc --noEmit干净。 - 该 head 的上游 CI:Lint + 三平台测试全绿。
- 落后
main30 个提交,MERGEABLE,且 merge-base 之后没有任何 main 提交触碰这两个文件——head 可代表合并后形态。
备注(非阻塞)
rebuildFromSessionJsonl的跳过只作用于文件写入;进行中的记录仍会返回给调用方,loadStatsData继续在展示前过滤活动会话并单独加入其实时快照——分层自洽。- 读侧去重同样包住了 rebuild 路径的返回值(
dedupBySessionId包裹两个 return),即使病态的 chats 目录布局在不同项目目录下产生同 id 记录,聚合也正确。
结论
✅ 验证通过——建议合并。 main 上的 bug 可确定性复现且两层修复都有效;同一流程下真实仪表盘从 Sessions 2 / 6.4k 变为 Sessions 1 / 3.2k;已脏的用户文件无需重写即可正确聚合;我 06-11 的两条评审意见已在 head 上落实;测试、lint、typecheck 与完整 CI 全绿。我这边没有阻塞项。
|
@wenshao @DragonnZhang request re-review please |
qqqys
left a comment
There was a problem hiding this comment.
Critical-only review: no critical issues found. The session usage dedup/write-skip fix is scoped to stats history handling, prior review feedback was addressed on a7c7206, and CI is green across lint, CodeQL, and all OS test jobs.
Summary
Closes #4994.
Opening
/statsduring the first-ever turn followed by/clear(or process exit) used to write the samesessionIdtwice into~/.qwen/usage_record.jsonl, permanently inflating every/statsaggregate (sessions / tokens / durations / tool counts / heatmap / projects) 2× for that session. The mechanism, deterministic repro, and code-level walkthrough are in #4994 (the bug is real onmain@ac040d0a6— repro script in that issue).This is also the same
aggregateUsage-dedup Critical raised by @wenshao in the review of #4779.Fix — defense in depth
usageHistoryService.ts—loadUsageHistory): records are deduped bysessionIdwith last-wins before being returned. Users whoseusage_record.jsonlalready contains duplicates from this bug stop seeing inflated aggregates immediately, without any file rewrite.usageHistoryService.ts—rebuildFromSessionJsonl(currentSessionId?)): when called with the in-progress session's id, that session is skipped during the file write.persistSessionUsage()will write its authoritative record at/clear/ exit, so the in-progress snapshot would only ever be a transient duplicate.statsDataService.tsthreadscurrentSession?.sessionIdthrough.Both fixes are needed: read-side alone leaves the file growing duplicates forever; write-side alone doesn't help users whose files are already corrupted by the bug.
Test plan
packages/core— new regression tests inusageHistoryService.test.ts(3 cases):currentSessionIdis passed —usage_record.jsonlstays empty untilpersistSessionUsage/statsduring first turn →/clear→/stats) assertssessionCount === 1,totalTokens === 1600usageHistoryService.test.ts— 13/13 pass (10 pre-existing + 3 new)statsDataService.test.ts— 17/17 passstatsCommand.test.ts— 13/13 passclearCommand.test.ts— 15/15 passtsc --noEmitcore + cli passprettier --write+eslinton changed files clean✅ no double count(was🐛 BUG CONFIRMED: sessions=2, tokens=3200)Notes
loadUsageHistory()is nowloadUsageHistory(currentSessionId?: string). The parameter is optional and defaults to undefined, so the only behavior change for callers that don't pass it is the dedup pass (which is what we want).aggregateUsageCritical raised on feat(stats): add interactive /stats dashboard with cross-session tracking #4779 by @wenshao without requiring any change toaggregateUsageitself.中文小结
修复:PR #4779 引入的
/stats会话双计 bug(issue #4994)。在首次 turn 期间打开过一次/stats再/clear/退出,该会话会被永久双计。两侧防御:
loadUsageHistory按sessionId做 last-wins 去重,立即修复磁盘已脏的存量用户。rebuildFromSessionJsonl(currentSessionId?)跳过进行中的会话,避免新增重复;statsDataService把当前 sessionId 透传过来。覆盖:新增 3 个回归测试还原 issue #4994 复现路径,全部相关测试 45/45 通过,typecheck/lint/format 均干净,standalone repro 脚本验证从
🐛 sessions=2, tokens=3200变为✅ no double count。