Skip to content

Proactive compaction scheduler never re-fires after first checkpoint (compactionCount latched at 1) #63892

@070freebird070-ctrl

Description

@070freebird070-ctrl

Bug: Proactive compaction scheduler never re-fires after first checkpoint

Summary

After a successful compaction, the proactive compaction scheduler never triggers again for the remainder of the session. Only an overflow-retry can force a second compaction. This causes sessions to grow from ~20k tokens back to 150k+ with zero proactive intervention.

Environment

  • OpenClaw v2026.4.9
  • Model: ollama/glm-5.1:cloud (200k context)
  • compaction.mode: "default"
  • compaction.reserveTokensFloor: 80000

Reproduction

  1. Start a session that triggers compaction (either proactive or overflow-retry)
  2. After compaction succeeds (e.g., 137k→20k tokens), continue working
  3. Session grows back past the reserveTokensFloor threshold (200k - 80k = 120k)
  4. Expected: Proactive compaction fires again at ~120k tokens
  5. Actual: No proactive compaction ever fires. Session grows to 157k/200k with zero re-triggers.

Evidence from session metadata (sessions.json)

compactionCount: 1
compactionCheckpoints: [
  { checkpointId: "...", reason: "overflow-retry", tokensBefore: 137324, tokensAfter: 19985, createdAt: 1775743884194 },
  { checkpointId: "...", reason: "overflow-retry", tokensBefore: 160842, tokensAfter: 22198, createdAt: 1775757453533 }
]

Key observation: compactionCount is 1 despite 2 checkpoints in the array. The overflow-retry path creates checkpoints but does not appear to increment compactionCount correctly.

Root Cause Hypothesis

The proactive compaction scheduler checks compactionCount > 0 (or similar latch) to decide whether compaction has already been performed this session. After the first compaction sets compactionCount = 1, the scheduler considers itself "done" and never re-evaluates — even as the session grows well past the threshold again.

compactionCount should be a monotonic trigger counter that fires on each threshold crossing, not a one-shot latch capped at 1.

Suggested Fix

  1. Replace the latch check with a threshold-crossing check: fire proactive compaction whenever currentTokens > (maxTokens - reserveTokensFloor) AND no compaction has occurred since the last threshold crossing.
  2. Track lastCompactionAtTokenCount instead of (or in addition to) compactionCount, so the scheduler can detect that the session has grown past the threshold again after a successful compaction.
  3. Ensure overflow-retry compactions also increment compactionCount to keep the counter consistent with the checkpoint array.

Impact

Without this fix, long-running sessions silently grow to context overflow, causing degraded performance and eventual errors — even when reserveTokensFloor is configured to prevent exactly this.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High-priority user-facing bug, regression, or broken workflow.clawsweeper:linked-pr-openClawSweeper found an open linked pull request for this issue.clawsweeper:no-new-fix-prClawSweeper does not recommend queueing a new automated fix PR for this issue.clawsweeper:source-reproClawSweeper found a high-confidence source-level issue reproduction.impact:session-stateSession, memory, transcript, context, or agent state can drift or corrupt.issue-rating: 🦞 diamond lobsterVery strong issue quality with high-confidence source-level or clear reproduction.staleMarked as stale due to inactivity

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions