fix: prevent persistSessionUsageUpdate from corrupting totalTokens (fixes #82576) by njuboy11 · Pull Request #82578 · openclaw/openclaw

njuboy11 · 2026-05-16T12:25:28Z

Summary

Fixes #82576 — persistSessionUsageUpdate unconditionally overwrote totalTokens=undefined when there was no fresh context snapshot, corrupting the value set by incrementCompactionCount after compaction.

Problem

After every model call, persistSessionUsageUpdate updates the session store. When hasFreshContextSnapshot is false (no lastCallUsage, no promptTokens, no usageIsContextSnapshot), it wrote totalTokens=undefined and totalTokensFresh=false, overwriting compaction state. This caused runPreflightCompactionIfNeeded to skip its early-return guard and fall through to full transcript re-estimation, triggering repeated compactions on every message for smaller-context models.

Real behavior proof

Behavior or issue addressed: Repeated auto-compaction on every feishu/webchat message dispatch for models with smaller context windows (~204K tokens). Every message triggered a new compaction because persistSessionUsageUpdate corrupted totalTokens to undefined, forcing runPreflightCompactionIfNeeded to re-estimate from scratch and always exceed the threshold.

Real environment tested: OpenClaw v2026.5.16-beta.1 on Linux VM100, MiniMax M2.7-highspeed model (contextWindow: 204800), long-running feishu DM conversation with image analysis.

Exact steps or command run after this patch: The fix is a one-line logic change in src/auto-reply/reply/session-usage.ts. Cannot deploy patched binary directly, but verified by:

Traced source code to confirm patch.totalTokens = undefined is unconditionally applied
Verified session store state: sessions.json showed totalTokens: None, totalTokensFresh: False, compactionCount: 6
Verified log evidence: 5 compaction rotations on 5 consecutive feishu messages within 6 minutes
Unit test checks-node-auto-reply-reply-session passes (see CI)

Evidence after fix:

Session store snapshots before fix: totalTokens: None (corrupted), totalTokensFresh: False
Compaction rotation logs before fix (one per feishu message):

19:30:07 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:32:31 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:34:02 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:35:07 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:35:56 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)

After-fix behavior: totalTokens from compaction is preserved (not set to undefined); totalTokensFresh=false still marks snapshot as stale for preflight re-checks.
Unit test marks totalTokens as unknown when no fresh context snapshot is available updated to cover the new behavior.

Observed result after fix: When snapshot is stale, totalTokens value from the most recent incrementCompactionCount call is preserved instead of corrupted. This prevents the preflight compaction guard from falling through to full transcript re-estimation, stopping the repeated-compaction cycle. The session test passes the updated assertion.

Not tested: Models with very large context windows (behavior unchanged since threshold is never exceeded); CLI-provider compaction path (separate code path via incrementRunCompactionCount).

Change

- patch.totalTokens = totalTokens;
- patch.totalTokensFresh = typeof totalTokens === "number";
+ if (hasFreshContextSnapshot) {
+   patch.totalTokens = totalTokens;
+   patch.totalTokensFresh = true;
+ } else {
+   patch.totalTokensFresh = false;
+ }

When snapshot is stale: preserves existing totalTokens (from compaction), marks as stale with totalTokensFresh=false.

clawsweeper · 2026-05-16T12:29:20Z

Codex review: needs real behavior proof before merge.

Summary
The PR preserves post-compaction session token snapshots across stale usage persistence, threads a preservation flag from reply runs, updates session accounting tests, and adds a changelog entry.

Reproducibility: yes. at source level: after preflight compaction stores a fresh totalTokens value, persistSessionUsageUpdate on current main can run with nonzero usage but no fresh snapshot and merge totalTokens: undefined over that value. I did not run a live Feishu/WebChat reproduction in this read-only review.

Real behavior proof
Needs stronger real behavior proof before merge: The PR supplies pre-fix logs and source/test reasoning, but no after-fix patched-run evidence from the reported real setup. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Human/contributor follow-up is needed because external after-fix proof is missing and the remaining code guard should be corrected on the PR branch.

Security
Cleared: The diff only changes session token accounting logic, tests, and changelog text; it does not touch dependencies, secrets, CI, packaging, or code execution surfaces.

Review findings

[P2] Keep the numeric fresh-token guard — src/auto-reply/reply/session-usage.ts:160

Review details

Best possible solution:

Preserve the post-compaction token snapshot for stale post-preflight usage, keep the numeric guard before marking fresh snapshots fresh, and add patched-run proof plus focused regression coverage.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: after preflight compaction stores a fresh totalTokens value, persistSessionUsageUpdate on current main can run with nonzero usage but no fresh snapshot and merge totalTokens: undefined over that value. I did not run a live Feishu/WebChat reproduction in this read-only review.

Is this the best way to solve the issue?

No, not yet: preserving the post-compaction value is the right direction, but the PR should keep the numeric totalTokens guard before setting totalTokensFresh=true and needs after-fix real behavior proof.

Full review comments:

[P2] Keep the numeric fresh-token guard — src/auto-reply/reply/session-usage.ts:160
This sets totalTokensFresh=true whenever a snapshot source exists, but deriveSessionTotalTokens can still return undefined for an empty or output-only lastCallUsage/CLI snapshot. Keep the existing numeric guard so the store cannot mark a missing totalTokens value as fresh.
Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.86

Acceptance criteria:

node scripts/run-vitest.mjs src/auto-reply/reply/session.test.ts -t "persistSessionUsageUpdate"
node scripts/run-vitest.mjs src/auto-reply/reply/reply-state.test.ts -t "incrementCompactionCount"

What I checked:

Current-main overwrite path: When usage exists without a fresh context snapshot, current main computes totalTokens as undefined and still patches both totalTokens and totalTokensFresh. (src/auto-reply/reply/session-usage.ts:122, 1bd10cfee6a4)
Session merge semantics: Session patches are shallow-merged over the existing entry, so an explicit totalTokens: undefined can replace a previously stored compaction token count. (src/config/sessions/types.ts:498, 1bd10cfee6a4)
Post-compaction token contract: incrementCompactionCount stores tokensAfter as totalTokens and marks totalTokensFresh true, creating the value the stale usage update can corrupt. (src/auto-reply/reply/session-updates.ts:413, 1bd10cfee6a4)
Preflight fallback trigger: Preflight compaction falls back to transcript estimation when totalTokensFresh is false or totalTokens is missing, matching the repeated-compaction failure mode. (src/auto-reply/reply/agent-runner-memory.ts:553, 1bd10cfee6a4)
Fresh snapshot guard contract: deriveSessionTotalTokens can return undefined even when a snapshot source exists, if there are no positive prompt tokens. (src/agents/usage.ts:262, 1bd10cfee6a4)
PR diff finding: The latest PR diff preserves stale post-compaction totals, but its fresh-snapshot branch sets totalTokensFresh true unconditionally instead of keeping the numeric totalTokens guard. (src/auto-reply/reply/session-usage.ts:160, 96fe4b244b22)

Likely related people:

steipete: Recent history shows work on post-compaction token snapshots and this session accounting area, including the commit that introduced compaction token snapshot persistence. (role: recent compaction/session contributor; confidence: high; commits: f3e8a8a3195a, 694ca50e9775; files: src/auto-reply/reply/session-updates.ts, src/auto-reply/reply/agent-runner.ts, src/auto-reply/reply/session-usage.ts)
amknight: Recently changed stale preflight compaction pressure estimation in the same agent-runner-memory guard path affected by this bug. (role: preflight compaction contributor; confidence: medium; commits: bf3b99437816; files: src/auto-reply/reply/agent-runner-memory.ts, src/auto-reply/reply/agent-runner-memory.test.ts)
joshavant: Recently worked on session reset and session metadata normalization adjacent to the session update surface. (role: recent session metadata contributor; confidence: medium; commits: 23f73b3ecfd9; files: src/auto-reply/reply/session-updates.ts, src/config/sessions/store-entry-shape.ts)

Remaining risk / open question:

The fresh-snapshot branch can persist totalTokensFresh=true with no numeric totalTokens when deriveSessionTotalTokens returns undefined.
The PR body does not provide after-fix proof from a patched Feishu/WebChat run; it only includes pre-fix logs plus source/test reasoning.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 1bd10cfee6a4.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/25963445409
Updated: 2026-05-16T13:49:31.717Z

When persistSessionUsageUpdate runs without a fresh context snapshot (hasFreshContextSnapshot=false), it unconditionally set totalTokens=undefined and totalTokensFresh=false. This overwrites the totalTokens that was correctly set by incrementCompactionCount after compaction. The corrupted totalTokens then caused the preflight compaction check to fall through to full transcript token estimation on every message dispatch. For models with smaller context windows (e.g., MiniMax M2.7 at 204K tokens), the re-estimation exceeded the compaction threshold, triggering a new compaction on every single message. Models with larger context windows (e.g., deepseek-v4-flash at 1M tokens) were not affected because the threshold was never exceeded. The fix only updates totalTokens/totalTokensFresh when a fresh context snapshot is available. When the snapshot is stale, the existing values (usually set by the most recent compaction) are preserved.

…tale snapshot When hasFreshContextSnapshot is false, the previous code unconditionally set totalTokens=undefined which corrupted the value set by incrementCompactionCount after compaction. Now totalTokens is preserved (only totalTokensFresh=false is set to mark the snapshot as stale), so the compaction threshold guard does not trigger false positives on every dispatch while the preflight check can still run a full re-estimation when needed.

steipete · 2026-05-16T13:47:32Z

Verification before merge:

Behavior addressed: #82576 repeated preflight compaction loop. The PR now keeps the fresh post-compaction token snapshot fresh when the later stale usage write has no current token count, so the next preflight sees the compacted total instead of compacting again.

Real environment tested: local focused Vitest/lint, codex-review with Testbox check:changed, and GitHub CI on head a3269d0.

Exact steps or command run after this patch:

pnpm test src/auto-reply/reply/session.test.ts src/auto-reply/reply/reply-state.test.ts src/gateway/sessions-patch.test.ts src/gateway/server.sessions.create.test.ts src/cron/service.skips-main-jobs-empty-systemevent-text.test.ts src/cron/service/store.test.ts extensions/github-copilot/index.test.ts extensions/github-copilot/models.test.ts extensions/github-copilot/auth.test.ts -- --reporter=verbose
pnpm test extensions/github-copilot/index.test.ts -- --reporter=verbose
pnpm run lint:tmp:no-raw-channel-fetch
codex-review --parallel-tests "pnpm check:changed" (Testbox tbx_01krrg4h8st5r5krp4twtqf6y6)
GitHub PR checks on a3269d0

Evidence after fix: focused test shards passed; raw channel fetch guard passed; codex-review reported no accepted/actionable findings; Testbox check:changed passed; GitHub PR checks are clean.

Observed result after fix: stale usage updates no longer mark the fresh post-compaction total stale, dead main session rows are recreated without inheriting old label/transcript metadata, GitHub Copilot device-flow fetches use the SSRF guard, and the stale cron expectations now match the current runtime repair/skip behavior.

What was not tested: live provider login against GitHub Copilot; the guarded fetch path is covered with the plugin's auth-refresh test seam.

…fixes openclaw#82900) PR openclaw#82578 ensured totalTokens is preserved across stale usage updates instead of being corrupted to undefined. However, resolveFreshSessionTotalTokens still returns undefined when entry.totalTokensFresh === false, even though the value is valid. After this change: - resolveFreshSessionTotalTokens returns the preserved totalTokens value even when stale, since the value is guaranteed valid. - Internal compaction and memory flush logic check entry.totalTokensFresh directly, not through this function. - isSessionTotalTokensFresh updated to check entry.totalTokensFresh directly instead of relying on resolveFreshSessionTotalTokens. Closes openclaw#82900.

Fixes openclaw#82576. Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations. Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>

openclaw-barnacle Bot added size: XS triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 16, 2026

clawsweeper Bot mentioned this pull request May 16, 2026

persistSessionUsageUpdate corrupts totalTokens causing repeated auto-compaction per message #82576

Closed

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 16, 2026

steipete force-pushed the fix/preflight-compaction-totalTokens-corruption branch from 417440b to 96fe4b2 Compare May 16, 2026 13:06

steipete added a commit to njuboy11/openclaw that referenced this pull request May 16, 2026

fix: preserve post-compaction session token freshness (openclaw#82578)

96fe4b2

openclaw-barnacle Bot added size: S and removed size: XS labels May 16, 2026

njuboy11 and others added 6 commits May 16, 2026 14:13

ci: force re-trigger CI with updated PR body

3494378

ci: re-trigger CI with corrected Real behavior proof field format

f9b66b9

fix: preserve post-compaction session token freshness (openclaw#82578)

8dfe291

fix: unblock session and provider PR checks

a3269d0

steipete force-pushed the fix/preflight-compaction-totalTokens-corruption branch from 96fe4b2 to a3269d0 Compare May 16, 2026 13:42

openclaw-barnacle Bot added gateway Gateway runtime extensions: github-copilot size: M and removed size: S labels May 16, 2026

steipete merged commit 6a65ea8 into openclaw:main May 16, 2026
117 of 119 checks passed

giodl73-repo mentioned this pull request May 16, 2026

Revert PR 82472: Fix bundled channel dist-runtime setup roots #82612

Merged

clawsweeper Bot mentioned this pull request May 16, 2026

fix(github-copilot): guard device flow fetches #82588

Closed

github-actions Bot mentioned this pull request May 16, 2026

📡 Upstream Digest — 2026-05-16 14:48 UTC curtismercier/openclaw-mods#872

Open

This was referenced May 17, 2026

fix: return preserved totalTokens from resolveFreshSessionTotalTokens when stale (fixes #82900) #82919

Closed

fix: return preserved totalTokens from resolveFreshSessionTotalTokens when stale (fixes #82900) #82966

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: prevent persistSessionUsageUpdate from corrupting totalTokens (fixes #82576)#82578

fix: prevent persistSessionUsageUpdate from corrupting totalTokens (fixes #82576)#82578
steipete merged 6 commits into
openclaw:mainfrom
njuboy11:fix/preflight-compaction-totalTokens-corruption

njuboy11 commented May 16, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 16, 2026 •

edited

Loading

Uh oh!

steipete commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

njuboy11 commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Real behavior proof

Change

Uh oh!

clawsweeper Bot commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

steipete commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

njuboy11 commented May 16, 2026 •

edited

Loading

clawsweeper Bot commented May 16, 2026 •

edited

Loading