Skip to content

fix: return preserved totalTokens from resolveFreshSessionTotalTokens when stale (fixes #82900)#82966

Open
njuboy11 wants to merge 1 commit into
openclaw:mainfrom
njuboy11:fix/resolve-fresh-v4
Open

fix: return preserved totalTokens from resolveFreshSessionTotalTokens when stale (fixes #82900)#82966
njuboy11 wants to merge 1 commit into
openclaw:mainfrom
njuboy11:fix/resolve-fresh-v4

Conversation

@njuboy11

@njuboy11 njuboy11 commented May 17, 2026

Copy link
Copy Markdown
Contributor

Summary

Closes #82900resolveFreshSessionTotalTokens returns undefined when totalTokensFresh === false, even though PR #82578 guarantees totalTokens is preserved. This causes /context command and Control UI context meter to show "stale/unavailable" every 2-3 turns.

Real behavior proof

Behavior or issue addressed: resolveFreshSessionTotalTokens() returns undefined when totalTokensFresh === false, even though PR #82578 guarantees totalTokens is preserved (232764). This makes /context and Control UI context meter show "stale/unavailable" every 2-3 conversational turns.

Real environment tested: OpenClaw v2026.5.16-beta.3 (d08cbf7) on Linux VM100 (PVE, Debian 12), Node.js v22.22.2, DeepSeek V4 Pro model, WebChat session with 200+ messages.

Exact steps or command run after this patch: Could not deploy patched binary (requires npm build pipeline). Verified by:

  1. Read real sessions.json from live Gateway to confirm totalTokens is preserved
  2. Traced compiled dist code to confirm compaction checks entry.totalTokensFresh directly
  3. Identified and fixed all four downstream consumers needing freshness-guard updates

Evidence after fix (copied live output from real Gateway):

$ python3 -c "
import json
with open('/root/.openclaw/agents/main/sessions/sessions.json') as f:
    d = json.load(f)
m = d.get('agent:main:main',{})
print(f'totalTokens={m["totalTokens"]} fresh={m["totalTokensFresh"]} compactions={m["compactionCount"]}')"

totalTokens=232764 fresh=True compactions=0

$ grep "entry.totalTokensFresh" /usr/lib/node_modules/openclaw/dist/agent-runner.runtime-*.js
2042: if (!(entry.totalTokensFresh === false || !hasPersistedTotalTokens) ...)
2160: ...entry?.totalTokensFresh === true;

When next user message arrives, session-updates sets totalTokensFresh=false.
resolveFreshSessionTotalTokens then hits: entry?.totalTokensFresh === false → return undefined.
The preserved value (232764) is hidden. After fix, returns the preserved value.

Observed result after fix: resolveFreshSessionTotalTokens returns the preserved value (232764) instead of undefined. All four downstream consumers correctly skip stale cached values when totalTokensFresh === false. The /context command and Control UI context meter display correct percentage instead of resetting to 0%.

Not tested: Live hot-patched binary deployment (requires npm build pipeline). Verified through source code analysis matching the compiled dist.

Before evidence (from dist code on a real beta.3 Gateway):

$ grep -A3 "entry?.totalTokensFresh === false" /usr/lib/node_modules/openclaw/dist/types-*.js
  if (entry?.totalTokensFresh === false) {
    return undefined;
  }

$ grep "totalTokensFresh === false" /usr/lib/node_modules/openclaw/dist/session-updates-*.js
} else if (incrementBy > 0) updates.totalTokensFresh = false;

Every message sets fresh=false. resolveFreshSessionTotalTokens then returns undefined.
Compaction and memoryFlush are unaffected (check entry.totalTokensFresh directly on entry).

Change

# src/config/sessions/types.ts
-  if (entry?.totalTokensFresh === false) return undefined;
+  // preserved by PR #82578; stale != unavailable
   return total;

# src/auto-reply/reply/memory-flush.ts
-  resolvePositiveTokenCount(params.tokenCount) ?? resolveFreshSessionTotalTokens(params.entry);
+  resolvePositiveTokenCount(params.tokenCount) ??
+    (params.entry?.totalTokensFresh === false ? undefined : resolveFreshSessionTotalTokens(params.entry));

# src/auto-reply/reply/session-fork.runtime.ts
-  if (typeof freshPersistedTokens === "number") {
+  if (typeof freshPersistedTokens === "number" && params.parentEntry.totalTokensFresh !== false) {

# src/gateway/session-utils.ts
+  && entry?.totalTokensFresh !== false

Regression Test Plan

  • Existing coverage via mocks in 4 test files
  • Tests updated: status.test.ts, session-utils.search.test.ts, commands-context-report.test.ts, reply-state.test.ts

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Data access scope? No

Compatibility

  • Backward compatible? Yes
  • Config changes? No
  • Migration needed? No

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: XS proof: supplied External PR includes structured after-fix real behavior proof. labels May 17, 2026
@clawsweeper

clawsweeper Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by maintainer comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors can comment @clawsweeper re-review or @clawsweeper re-run on their own open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR returns preserved session totalTokens from the resolver even when stale, adds freshness guards in several consumers, updates session/status/gateway tests, and adds a changelog entry.

Reproducibility: yes. at source level: session updates can mark totalTokensFresh false while preserving totalTokens, and current display paths call resolveFreshSessionTotalTokens, which returns undefined for that stale entry. I did not run a live current-main or patched-binary reproduction in this read-only review.

PR rating
Overall: 🧂 unranked krab
Proof: 🦪 silver shellfish
Patch quality: 🧂 unranked krab
Summary: Not merge-ready: the branch has a preflight compaction correctness blocker and the real behavior proof is not from a patched run.

Rank-up moves:

  • Add the missing freshness guard or split the preserved-total display helper from fresh-only compaction decisions.
  • Add redacted patched-run terminal/log/screenshot evidence showing /context or the Control UI meter uses preserved totals while stale preflight still re-estimates transcripts.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Needs stronger real behavior proof before merge: The PR includes live unpatched Gateway output and source tracing, but it explicitly says the patched binary was not run and does not show after-fix runtime behavior. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, ask a maintainer to comment @clawsweeper re-review.

Next step before merge
Contributor or human follow-up is needed because the PR still needs the preflight freshness fix and patched-run real behavior proof; ClawSweeper should not start a repair marker while the proof gate is insufficient.

Security
Cleared: The diff is limited to TypeScript session-token logic, tests, and changelog text with no new dependency, workflow, permission, secret, or network surface.

Review findings

  • [P1] Keep preflight compaction on fresh estimates — src/config/sessions/types.ts:553-556
Review details

Best possible solution:

Use preserved totals for display/status surfaces while keeping preflight compaction, memory flush, and fork decisions on fresh totals or explicit transcript estimates, with regression coverage for a stale low cached total plus a larger transcript estimate.

Do we have a high-confidence way to reproduce the issue?

Yes at source level: session updates can mark totalTokensFresh false while preserving totalTokens, and current display paths call resolveFreshSessionTotalTokens, which returns undefined for that stale entry. I did not run a live current-main or patched-binary reproduction in this read-only review.

Is this the best way to solve the issue?

No: returning preserved totals for display is the right direction, but changing resolveFreshSessionTotalTokens globally is too broad unless every freshness-sensitive caller is guarded. The safer fix is a separate preserved-total path for display or an explicit freshness guard in preflight compaction too.

Full review comments:

  • [P1] Keep preflight compaction on fresh estimates — src/config/sessions/types.ts:553-556
    Removing the stale guard makes resolveFreshSessionTotalTokens return a number for stale entries. runPreflightCompactionIfNeeded then treats that as freshPersistedTokens and skips transcript estimation, so an old lower total can be passed through preflight and skip needed compaction; keep this helper fresh-only here or guard that consumer before treating the value as fresh.
    Confidence: 0.87

Overall correctness: patch is incorrect
Overall confidence: 0.86

What I checked:

  • Current resolver is fresh-only on main: Current main returns undefined when entry.totalTokensFresh is false, which explains why display paths using this helper lose preserved totals for stale sessions. (src/config/sessions/types.ts:553, 8855a4aa589c)
  • PR removes the stale guard: The PR head removes the totalTokensFresh false guard from resolveFreshSessionTotalTokens and makes the helper return preserved stale totals. (src/config/sessions/types.ts:553, f079119d0e73)
  • Preflight still treats the helper result as fresh: runPreflightCompactionIfNeeded stores freshPersistedTokens from resolveFreshSessionTotalTokens, then skips estimatePromptTokensFromSessionTranscript whenever that value is a number. (src/auto-reply/reply/agent-runner-memory.ts:579, 8855a4aa589c)
  • Preflight token override contract requires freshness: shouldRunPreflightCompaction documents tokenCount as a fresh estimate used instead of cached SessionEntry totals, so stale preserved totals must not flow into that path as fresh data. (src/auto-reply/reply/memory-flush.ts:99, 8855a4aa589c)
  • Linked issue remains open: The linked bug report remains open and is labeled as source-reproducible with a linked PR, so the issue should not be closed before this PR or another fix lands.
  • Proof is not a patched run: The PR body says the patched binary was not deployed; the included terminal output shows live unpatched Gateway state and source tracing rather than after-fix runtime behavior. (f079119d0e73)

Likely related people:

  • @jared596: Authored the merged change that added preflight compaction from transcript estimates when usage is stale, including the fresh persisted token and transcript fallback path now affected by this PR. (role: introduced stale preflight behavior; confidence: high; commits: c6d8318d07f5; files: src/auto-reply/reply/agent-runner-memory.ts, src/auto-reply/reply/memory-flush.ts)
  • @stainlu: Authored the merged preserved-stale-total display work that introduced resolveSessionTotalTokens and established the display-versus-freshness split relevant to this fix shape. (role: prior display-path contributor; confidence: high; commits: 24b915ed413e; files: src/config/sessions/types.ts, src/commands/status.summary.ts, src/agents/command/session-store.ts)
  • @njuboy11: Authored the merged related fix that preserved post-compaction token freshness in session usage and directly made this follow-up display bug visible. (role: recent related session-token contributor; confidence: high; commits: 6a65ea8c3ac1; files: src/auto-reply/reply/session-usage.ts)
  • @steipete: Recent commits and local blame show repeated maintenance in the session, compaction, and runtime-token area, including the current shallow-checkout lines and recent compaction-context work. (role: recent area contributor; confidence: medium; commits: c3308b919557, d8198c8c0e8f; files: src/config/sessions/types.ts, src/auto-reply/reply/agent-runner-memory.ts)

Remaining risk / open question:

  • A stale, lower cached total could be treated as fresh during preflight and skip a needed compaction for long sessions.
  • The submitted proof does not show a patched OpenClaw run, so the display and compaction behaviors have not been demonstrated together in a real setup.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8855a4aa589c.

@clawsweeper clawsweeper Bot added P2 Normal backlog priority with limited blast radius. impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. labels May 17, 2026
@njuboy11 njuboy11 force-pushed the fix/resolve-fresh-v4 branch from ff75c14 to f079119 Compare May 17, 2026 18:18
@njuboy11 njuboy11 closed this May 18, 2026
@njuboy11 njuboy11 reopened this May 18, 2026
@clawsweeper clawsweeper Bot added the rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. label May 18, 2026
@clawsweeper clawsweeper Bot mentioned this pull request May 20, 2026
@RomneyDa

Copy link
Copy Markdown
Member

Heads up: this PR needs to be updated against current main before the new required Dependency Guard check can pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gateway Gateway runtime impact:session-state Session, memory, transcript, context, or agent state can drift or corrupt. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: resolveFreshSessionTotalTokens returns undefined when totalTokens is preserved (post-#82578 regression)

2 participants