[Bug]: repeated early preflight compactions after compaction due to stale transcript usage #81178

@blortski

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

After a large tool-output or transcript spike, OpenClaw can repeatedly compact the same session even after a successful compaction has reduced active context well below the model limit. The preflight compaction estimator appears to read stale pre-compaction usage from the transcript file and treat it as current active context, causing repeated early compactions.

Steps to reproduce

Environment
OpenClaw version: 2026.5.7 (eeef486)
OS: macOS / Darwin 25.3.0 arm64
Runtime: Pi Default
Channel/session type: Discord channel session
Primary model: openai-codex/gpt-5.5
Fallback: claude-cli/claude-opus-4-7
Compaction mode: safeguard

Configuration notes
agents.defaults.compaction.mode = safeguard
agents.defaults.compaction.memoryFlush.enabled = true
agents.defaults.compaction.truncateAfterCompaction was not configured / absent
agents.defaults.compaction.maxActiveTranscriptBytes was not configured / absent

Expected behavior

After a successful compaction, preflight compaction should base its next decision on the active post-compaction replay/context, not on stale pre-compaction usage records still present in the transcript file. A session at roughly 60k-90k of 272k tokens should not immediately compact again unless an explicit transcript-size policy is configured and exceeded.

Actual behavior

  1. A large tool result inflated the transcript/context.
  2. OpenClaw compacted.
  3. Active context after compaction was far below the model window, roughly 60k-90k of 272k.
  4. Despite that, later turns still triggered unexpected early compactions.
  5. Session compaction count increased repeatedly even though active context was not near the model limit.

OpenClaw version

2026.5.7 (eeef486)

Operating system

macOS / Darwin 25.3.0 arm64

Install method

npm global via Homebrew-managed Node environment

Model

openai-codex/gpt-5.5

Provider / routing chain

openclaw --> openai-codex

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

Why this matters:
This can create a compaction loop: one large tool-result spike poisons the transcript usage estimate, compaction runs, but subsequent turns still see stale pre-compaction usage and compact again. It burns context, disrupts continuity, and can make long-running sessions feel unstable even after compaction succeeds.

Additional information

Suspected root cause
In the preflight compaction path, estimatePromptTokensFromSessionTranscript() reads the last nonzero usage record from the transcript. When transcripts are not rotated after compaction, that usage record can belong to the pre-compaction giant prompt. The next preflight check then treats that stale high usage as current active context and triggers another compaction.
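To make the suspected failure mode concrete, here is a minimal sketch of how a "last nonzero usage record" scan can return a pre-compaction value. The `TranscriptEntry` shape and the `lastNonzeroUsage` helper are hypothetical and only illustrate the idea; I have not confirmed that OpenClaw's transcript records look like this.

```typescript
// Hypothetical transcript entry shape; the real record format may differ.
interface TranscriptEntry {
  kind: "message" | "compaction";
  usagePromptTokens?: number;
}

// Scan backwards for the most recent nonzero usage record,
// mirroring the behavior suspected in the preflight estimator.
function lastNonzeroUsage(entries: TranscriptEntry[]): number | undefined {
  for (let i = entries.length - 1; i >= 0; i--) {
    const tokens = entries[i]?.usagePromptTokens;
    if (typeof tokens === "number" && tokens > 0) return tokens;
  }
  return undefined;
}

// Transcript that was not rotated after compaction: the newest usage
// record still describes the giant pre-compaction prompt.
const transcript: TranscriptEntry[] = [
  { kind: "message", usagePromptTokens: 40000 },
  { kind: "message", usagePromptTokens: 250000 }, // large tool result
  { kind: "compaction" }, // no post-compaction usage recorded yet
];

const staleEstimate = lastNonzeroUsage(transcript);
```

With this input the scan skips the compaction entry and lands on the 250000-token record, which is the kind of value that would push a 272k-window session straight back over the preflight threshold.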

Relevant code areas
agent-runner.runtime: runPreflightCompactionIfNeeded()
agent-runner.runtime: estimatePromptTokensFromSessionTranscript()
compaction successor transcript handling / transcript rotation after compaction

Local mitigation that stopped the loop
Bound stale transcript usage against the recent replay estimate: if usagePromptTokens is much larger than estimatedMessageTokens, use the recent replay estimate instead of the stale transcript usage. Also, when the recent replay estimate is missing, do not fall back to a token estimate derived from the full transcript byte size unless an explicit transcript-size compaction policy is configured.
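The second half of that mitigation (the fallback rule) could look roughly like the sketch below. `fallbackPromptEstimate` is a hypothetical helper, treating a configured `maxActiveTranscriptBytes` as the "explicit policy" signal is my assumption, and the 4-bytes-per-token ratio is only a rough placeholder rather than anything OpenClaw is known to use.

```typescript
// Hypothetical helper illustrating the fallback rule; not OpenClaw's actual code.
function fallbackPromptEstimate(
  estimatedMessageTokens: number | undefined,
  transcriptBytes: number,
  maxActiveTranscriptBytes: number | undefined, // assumed policy signal
  bytesPerToken = 4, // rough placeholder ratio, an assumption
): number | undefined {
  // Prefer the recent replay estimate whenever it exists.
  if (typeof estimatedMessageTokens === "number") return estimatedMessageTokens;
  // No explicit transcript-size policy: do not guess from file size.
  if (typeof maxActiveTranscriptBytes !== "number") return undefined;
  // Policy configured: a byte-size-derived estimate is acceptable.
  return Math.ceil(transcriptBytes / bytesPerToken);
}
```

Returning `undefined` leaves it to the caller to skip the preflight compaction instead of compacting on a number derived from the whole transcript file.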

Pseudo-patch
Inside estimatePromptTokensFromSessionTranscript(), replace raw usagePromptTokens with a bounded value:

const boundedUsagePromptTokens =
  typeof estimatedMessageTokens === "number" &&
  usagePromptTokens > estimatedMessageTokens * 2 + 10000
    ? estimatedMessageTokens
    : usagePromptTokens;

Then use boundedUsagePromptTokens instead of raw usagePromptTokens for the preflight estimate.
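For anyone who wants to try the bound in isolation, here is the same rule as a standalone function. The name `boundUsagePromptTokens` and the `multiplier` / `slackTokens` parameters are mine; the defaults of 2 and 10000 are the values from the pseudo-patch, and they are what happened to stop the loop locally, not tuned thresholds.

```typescript
// Hypothetical standalone version of the pseudo-patch; not OpenClaw's source.
function boundUsagePromptTokens(
  usagePromptTokens: number,
  estimatedMessageTokens: number | undefined,
  multiplier = 2,
  slackTokens = 10000,
): number {
  // Without a replay estimate there is nothing to bound against.
  if (typeof estimatedMessageTokens !== "number") return usagePromptTokens;
  // Treat usage far above the replay estimate as stale.
  const ceiling = estimatedMessageTokens * multiplier + slackTokens;
  return usagePromptTokens > ceiling ? estimatedMessageTokens : usagePromptTokens;
}

// Stale 250k usage vs. a 70k replay estimate: the replay estimate is used.
const bounded = boundUsagePromptTokens(250000, 70000);
// Plausible 90k usage vs. the same estimate: the recorded usage is kept.
const unbounded = boundUsagePromptTokens(90000, 70000);
```

In the incident described here, a 70k replay estimate gives a ceiling of 150k, so the stale 250k record is replaced while ordinary usage between turns passes through unchanged.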

Related issues to reference
#73003 — Safeguard compaction / Pi SDK compaction conflict with repeated compactions
#65600 — preflight compaction token-state bug around totalTokensFresh
#11468 — gateway config.get returns full resolved config into session context
#19137 — gateway config.get leaks config into session transcript

Additional related bug from same incident
The assistant-facing gateway config.get tool accepted a path parameter but ignored it in 2026.5.7, returning the full resolved config snapshot even for narrow reads. That was one trigger for the giant context spike. This may deserve a separate linked issue if not covered by #11468/#19137.
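For reference, the behavior I expected from a path-scoped read is roughly the following. This is a generic dotted-path lookup with a hypothetical name (`readConfigPath`); I am assuming dot-separated paths such as the config keys listed above, and I have not checked how the gateway tool actually parses its path parameter.

```typescript
// Hypothetical sketch of a path-scoped config read; not the gateway's actual code.
function readConfigPath(config: unknown, path?: string): unknown {
  // No path given: return the full snapshot.
  if (!path) return config;
  let node: unknown = config;
  for (const key of path.split(".")) {
    // Stop as soon as the path leaves the object tree.
    if (typeof node !== "object" || node === null || !(key in node)) return undefined;
    node = (node as Record<string, unknown>)[key];
  }
  return node;
}

const exampleConfig = {
  agents: { defaults: { compaction: { mode: "safeguard" } } },
};

// A narrow read should return only the requested value.
const mode = readConfigPath(exampleConfig, "agents.defaults.compaction.mode");
```

A read scoped like this returns a single short value instead of the full resolved config, which is the difference that matters for the context spike.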
