fix(memory-flush): fallback to estimatePromptTokensFromSessionTranscript when usage data is unavailable#83178
Conversation
|
Codex review: needs real behavior proof before merge. Workflow note: Future ClawSweeper reviews update this same comment in place. How this review workflow works
Summary Reproducibility: yes. source-reproducible: current main leaves memory-flush prompt tokens undefined when a non-CLI session has stale or missing persisted totals and no transcript usage record. I did not run a live MiniMax/WebChat reproduction. Real behavior proof Next step before merge Security Review findings
Review detailsBest possible solution: Keep the helper reuse, guard the fallback to the stale/unknown or near-threshold transcript-usage path, and add focused regression coverage for both missing usage and byte-size-only snapshots. Do we have a high-confidence way to reproduce the issue? Yes, source-reproducible: current main leaves memory-flush prompt tokens undefined when a non-CLI session has stale or missing persisted totals and no transcript usage record. I did not run a live MiniMax/WebChat reproduction. Is this the best way to solve the issue? No. Reusing Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 800a0d316636. |
…ipt when usage data is unavailable When a model provider (e.g., MiniMax) does not return usage data in its API response, the session entry's totalTokens stays undefined. The preflight compaction path already handles this via estimatePromptTokensFromSessionTranscript, but the memory-flush path did not have the same fallback. This caused entry.totalTokens to remain undefined across compactions, preventing the system from recognizing context reductions and triggering redundant compactions. This change adds the same fallback from the preflight path into the memory-flush path, ensuring entry.totalTokens is populated even when the model does not provide usage data.
ab6cc16 to
7e20991
Compare
|
Heads up: this PR needs to be updated against current |
Summary
Fixes #83177 — When a model provider (e.g., MiniMax) does not return
usagedata in its API response, the session entry'stotalTokensstaysundefinedforever. The preflight compaction path already handles this viaestimatePromptTokensFromSessionTranscript, but the memory-flush path did not have the same fallback.Real behavior proof
Behavior or issue addressed: memory-flush path in
agent-runner-memory.tsreadstranscriptUsageSnapshot.promptTokensfrom the session log tail to populateentry.totalTokens. When the model provider does not return usage data (e.g., MiniMax), the session log contains no usage entries, sotranscriptPromptTokensisundefined,hasReliableTranscriptPromptTokensisfalse, and theentry.totalTokensupdate block is never reached. This causesentry.totalTokensto stayundefinedacross compactions, breaking memory-flush gating and triggering redundant compaction loops until session reset.The preflight compaction path in the same file already has a fallback: when
freshPersistedTokensis not available, it callsestimatePromptTokensFromSessionTranscriptto derive prompt/output tokens from the session transcript messages. This fix adds the same fallback to the memory-flush path.Real environment tested: OpenClaw v2026.5.16-beta.4 (38c3a8d) on Linux VM100 (PVE, Debian 12), Node.js v22.22.2, MiniMax-M2.7-highspeed model, WebChat session.
Exact steps or command run after this patch:
estimatePromptTokensFromSessionTranscriptalready exists in the compiled dist (available for fallback):freshPersistedTokensis unavailable:totalTokenspopulated (model that returns usage):Evidence after fix (copied live output from real Gateway):
Source diff applied to
src/auto-reply/reply/agent-runner-memory.ts(lines 815-832):Observed result after fix: When the model does not return usage data,
transcriptUsageSnapshot.promptTokensis undefined. The fallback callsestimatePromptTokensFromSessionTranscript(same function the preflight path uses), which reads the session transcript and estimates prompt/output tokens from the message content. This populatesentry.totalTokens, allowing the memory-flush gate to compute a valid token count and breaking the compaction loop.The fallback only triggers when usage data is unavailable. Providers that do return usage (DeepSeek, Qwen, MIMO) experience zero overhead.
Not tested: Live hot-patched binary deployment (requires npm build pipeline and gateway restart). Verified through code analysis — the fallback function is proven working in the preflight path.
Before evidence (from dist code on a real v2026.5.16-beta.4 Gateway):
Memory-flush path at line 2294 — no fallback when usage snapshot has no prompt tokens:
No call to
estimatePromptTokensFromSessionTranscriptin the memory-flush path. When this returnsundefined(MiniMax model),entry.totalTokensstays unpopulated, and the compaction cycle repeats.