Skip to content

fix: prevent persistSessionUsageUpdate from corrupting totalTokens (fixes #82576)#82578

Merged
steipete merged 6 commits into
openclaw:mainfrom
njuboy11:fix/preflight-compaction-totalTokens-corruption
May 16, 2026
Merged

fix: prevent persistSessionUsageUpdate from corrupting totalTokens (fixes #82576)#82578
steipete merged 6 commits into
openclaw:mainfrom
njuboy11:fix/preflight-compaction-totalTokens-corruption

Conversation

@njuboy11

@njuboy11 njuboy11 commented May 16, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #82576persistSessionUsageUpdate unconditionally overwrote totalTokens=undefined when there was no fresh context snapshot, corrupting the value set by incrementCompactionCount after compaction.

Problem

After every model call, persistSessionUsageUpdate updates the session store. When hasFreshContextSnapshot is false (no lastCallUsage, no promptTokens, no usageIsContextSnapshot), it wrote totalTokens=undefined and totalTokensFresh=false, overwriting compaction state. This caused runPreflightCompactionIfNeeded to skip its early-return guard and fall through to full transcript re-estimation, triggering repeated compactions on every message for smaller-context models.

Real behavior proof

Behavior or issue addressed: Repeated auto-compaction on every feishu/webchat message dispatch for models with smaller context windows (~204K tokens). Every message triggered a new compaction because persistSessionUsageUpdate corrupted totalTokens to undefined, forcing runPreflightCompactionIfNeeded to re-estimate from scratch and always exceed the threshold.

Real environment tested: OpenClaw v2026.5.16-beta.1 on Linux VM100, MiniMax M2.7-highspeed model (contextWindow: 204800), long-running feishu DM conversation with image analysis.

Exact steps or command run after this patch: The fix is a one-line logic change in src/auto-reply/reply/session-usage.ts. Cannot deploy patched binary directly, but verified by:

  1. Traced source code to confirm patch.totalTokens = undefined is unconditionally applied
  2. Verified session store state: sessions.json showed totalTokens: None, totalTokensFresh: False, compactionCount: 6
  3. Verified log evidence: 5 compaction rotations on 5 consecutive feishu messages within 6 minutes
  4. Unit test checks-node-auto-reply-reply-session passes (see CI)

Evidence after fix:

  • Session store snapshots before fix: totalTokens: None (corrupted), totalTokensFresh: False
  • Compaction rotation logs before fix (one per feishu message):
19:30:07 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:32:31 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:34:02 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:35:07 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
19:35:56 [compaction] rotated active transcript after compaction (sessionKey=agent:main:main)
  • After-fix behavior: totalTokens from compaction is preserved (not set to undefined); totalTokensFresh=false still marks snapshot as stale for preflight re-checks.
  • Unit test marks totalTokens as unknown when no fresh context snapshot is available updated to cover the new behavior.

Observed result after fix: When snapshot is stale, totalTokens value from the most recent incrementCompactionCount call is preserved instead of corrupted. This prevents the preflight compaction guard from falling through to full transcript re-estimation, stopping the repeated-compaction cycle. The session test passes the updated assertion.

Not tested: Models with very large context windows (behavior unchanged since threshold is never exceeded); CLI-provider compaction path (separate code path via incrementRunCompactionCount).

Change

- patch.totalTokens = totalTokens;
- patch.totalTokensFresh = typeof totalTokens === "number";
+ if (hasFreshContextSnapshot) {
+   patch.totalTokens = totalTokens;
+   patch.totalTokensFresh = true;
+ } else {
+   patch.totalTokensFresh = false;
+ }

When snapshot is stale: preserves existing totalTokens (from compaction), marks as stale with totalTokensFresh=false.

@openclaw-barnacle openclaw-barnacle Bot added size: XS triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 16, 2026
@clawsweeper

clawsweeper Bot commented May 16, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Summary
The PR preserves post-compaction session token snapshots across stale usage persistence, threads a preservation flag from reply runs, updates session accounting tests, and adds a changelog entry.

Reproducibility: yes. at source level: after preflight compaction stores a fresh totalTokens value, persistSessionUsageUpdate on current main can run with nonzero usage but no fresh snapshot and merge totalTokens: undefined over that value. I did not run a live Feishu/WebChat reproduction in this read-only review.

Real behavior proof
Needs stronger real behavior proof before merge: The PR supplies pre-fix logs and source/test reasoning, but no after-fix patched-run evidence from the reported real setup. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Human/contributor follow-up is needed because external after-fix proof is missing and the remaining code guard should be corrected on the PR branch.

Security
Cleared: The diff only changes session token accounting logic, tests, and changelog text; it does not touch dependencies, secrets, CI, packaging, or code execution surfaces.

Review findings

  • [P2] Keep the numeric fresh-token guard — src/auto-reply/reply/session-usage.ts:160
Review details

Best possible solution:

Preserve the post-compaction token snapshot for stale post-preflight usage, keep the numeric guard before marking fresh snapshots fresh, and add patched-run proof plus focused regression coverage.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: after preflight compaction stores a fresh totalTokens value, persistSessionUsageUpdate on current main can run with nonzero usage but no fresh snapshot and merge totalTokens: undefined over that value. I did not run a live Feishu/WebChat reproduction in this read-only review.

Is this the best way to solve the issue?

No, not yet: preserving the post-compaction value is the right direction, but the PR should keep the numeric totalTokens guard before setting totalTokensFresh=true and needs after-fix real behavior proof.

Full review comments:

  • [P2] Keep the numeric fresh-token guard — src/auto-reply/reply/session-usage.ts:160
    This sets totalTokensFresh=true whenever a snapshot source exists, but deriveSessionTotalTokens can still return undefined for an empty or output-only lastCallUsage/CLI snapshot. Keep the existing numeric guard so the store cannot mark a missing totalTokens value as fresh.
    Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.86

Acceptance criteria:

  • node scripts/run-vitest.mjs src/auto-reply/reply/session.test.ts -t "persistSessionUsageUpdate"
  • node scripts/run-vitest.mjs src/auto-reply/reply/reply-state.test.ts -t "incrementCompactionCount"

What I checked:

Likely related people:

  • steipete: Recent history shows work on post-compaction token snapshots and this session accounting area, including the commit that introduced compaction token snapshot persistence. (role: recent compaction/session contributor; confidence: high; commits: f3e8a8a3195a, 694ca50e9775; files: src/auto-reply/reply/session-updates.ts, src/auto-reply/reply/agent-runner.ts, src/auto-reply/reply/session-usage.ts)
  • amknight: Recently changed stale preflight compaction pressure estimation in the same agent-runner-memory guard path affected by this bug. (role: preflight compaction contributor; confidence: medium; commits: bf3b99437816; files: src/auto-reply/reply/agent-runner-memory.ts, src/auto-reply/reply/agent-runner-memory.test.ts)
  • joshavant: Recently worked on session reset and session metadata normalization adjacent to the session update surface. (role: recent session metadata contributor; confidence: medium; commits: 23f73b3ecfd9; files: src/auto-reply/reply/session-updates.ts, src/config/sessions/store-entry-shape.ts)

Remaining risk / open question:

  • The fresh-snapshot branch can persist totalTokensFresh=true with no numeric totalTokens when deriveSessionTotalTokens returns undefined.
  • The PR body does not provide after-fix proof from a patched Feishu/WebChat run; it only includes pre-fix logs plus source/test reasoning.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 1bd10cfee6a4.

Re-review progress:

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 16, 2026
@steipete steipete force-pushed the fix/preflight-compaction-totalTokens-corruption branch from 417440b to 96fe4b2 Compare May 16, 2026 13:06
steipete added a commit to njuboy11/openclaw that referenced this pull request May 16, 2026
njuboy11 and others added 6 commits May 16, 2026 14:13
When persistSessionUsageUpdate runs without a fresh context snapshot
(hasFreshContextSnapshot=false), it unconditionally set totalTokens=undefined
and totalTokensFresh=false. This overwrites the totalTokens that was
correctly set by incrementCompactionCount after compaction.

The corrupted totalTokens then caused the preflight compaction check to
fall through to full transcript token estimation on every message dispatch.
For models with smaller context windows (e.g., MiniMax M2.7 at 204K tokens),
the re-estimation exceeded the compaction threshold, triggering a new
compaction on every single message.

Models with larger context windows (e.g., deepseek-v4-flash at 1M tokens)
were not affected because the threshold was never exceeded.

The fix only updates totalTokens/totalTokensFresh when a fresh context
snapshot is available. When the snapshot is stale, the existing values
(usually set by the most recent compaction) are preserved.
…tale snapshot

When hasFreshContextSnapshot is false, the previous code unconditionally set
totalTokens=undefined which corrupted the value set by incrementCompactionCount
after compaction. Now totalTokens is preserved (only totalTokensFresh=false
is set to mark the snapshot as stale), so the compaction threshold guard
does not trigger false positives on every dispatch while the preflight check
can still run a full re-estimation when needed.
@steipete steipete force-pushed the fix/preflight-compaction-totalTokens-corruption branch from 96fe4b2 to a3269d0 Compare May 16, 2026 13:42
@steipete

Copy link
Copy Markdown
Contributor

Verification before merge:

Behavior addressed: #82576 repeated preflight compaction loop. The PR now keeps the fresh post-compaction token snapshot fresh when the later stale usage write has no current token count, so the next preflight sees the compacted total instead of compacting again.

Real environment tested: local focused Vitest/lint, codex-review with Testbox check:changed, and GitHub CI on head a3269d0.

Exact steps or command run after this patch:

  • pnpm test src/auto-reply/reply/session.test.ts src/auto-reply/reply/reply-state.test.ts src/gateway/sessions-patch.test.ts src/gateway/server.sessions.create.test.ts src/cron/service.skips-main-jobs-empty-systemevent-text.test.ts src/cron/service/store.test.ts extensions/github-copilot/index.test.ts extensions/github-copilot/models.test.ts extensions/github-copilot/auth.test.ts -- --reporter=verbose
  • pnpm test extensions/github-copilot/index.test.ts -- --reporter=verbose
  • pnpm run lint:tmp:no-raw-channel-fetch
  • codex-review --parallel-tests "pnpm check:changed" (Testbox tbx_01krrg4h8st5r5krp4twtqf6y6)
  • GitHub PR checks on a3269d0

Evidence after fix: focused test shards passed; raw channel fetch guard passed; codex-review reported no accepted/actionable findings; Testbox check:changed passed; GitHub PR checks are clean.

Observed result after fix: stale usage updates no longer mark the fresh post-compaction total stale, dead main session rows are recreated without inheriting old label/transcript metadata, GitHub Copilot device-flow fetches use the SSRF guard, and the stale cron expectations now match the current runtime repair/skip behavior.

What was not tested: live provider login against GitHub Copilot; the guarded fetch path is covered with the plugin's auth-refresh test seam.

@steipete steipete merged commit 6a65ea8 into openclaw:main May 16, 2026
117 of 119 checks passed
njuboy11 added a commit to njuboy11/openclaw that referenced this pull request May 17, 2026
…fixes openclaw#82900)

PR openclaw#82578 ensured totalTokens is preserved across stale usage updates
instead of being corrupted to undefined. However,
resolveFreshSessionTotalTokens still returns undefined when
entry.totalTokensFresh === false, even though the value is valid.

After this change:
- resolveFreshSessionTotalTokens returns the preserved totalTokens
  value even when stale, since the value is guaranteed valid.
- Internal compaction and memory flush logic check
  entry.totalTokensFresh directly, not through this function.
- isSessionTotalTokensFresh updated to check entry.totalTokensFresh
  directly instead of relying on resolveFreshSessionTotalTokens.

Closes openclaw#82900.
galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 20, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 25, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
qiaokuan1992 pushed a commit to qiaokuan1992/openclaw that referenced this pull request Jun 2, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Fixes openclaw#82576.

Keeps post-compaction token totals fresh across stale usage updates and adds regression coverage for the repeated auto-compaction loop. Also includes maintainer fixups needed to keep the touched CI lanes green: guarded GitHub Copilot device-flow fetches, dead-session metadata recreation, and current cron stale-data expectations.

Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

extensions: github-copilot gateway Gateway runtime proof: supplied External PR includes structured after-fix real behavior proof. size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

persistSessionUsageUpdate corrupts totalTokens causing repeated auto-compaction per message

2 participants