[Bug]: bootstrap/reconcile and hot-cache policy can leave deferred compaction debt stranded #67716

@shiv1213-fin

Bug type

Behavior bug (incorrect runtime state without crash)

Beta release blocker

No

Summary

Context-engine / LCM maintenance can leave deferred compaction debt stranded in at least two ways:

  1. A bootstrap/reconcile path can import missing tail messages, create deferred compaction debt, and then immediately block debt execution because the host runtime is still in foreground mode.
  2. A hot-cache path can record deferred compaction debt once rawTokensOutsideTail crosses the configured leaf threshold, but maintenance then skips compaction (the compaction decision logs reason=hot-cache-budget-headroom and maintenance logs reason=deferred compaction still needed), leaving the debt in place.

These look related to maintenance scheduling and policy, but they may be two distinct bugs.

Expected behavior

  • If bootstrap/reconcile imports messages and pushes rawTokensOutsideTail past the leaf trigger, the runtime should schedule or run a valid background follow-up that is allowed to consume deferred compaction debt.
  • If deferred compaction debt is recorded for a session, maintenance should either pay that debt down promptly or make it clear that policy intentionally superseded the debt.
  • Crossing the leaf trigger should not leave a session repeatedly logging reason=deferred compaction still needed without a subsequent successful compaction path.

Actual behavior

Case 1: bootstrap/reconcile creates debt that cannot execute

Observed on April 16, 2026 in a user-facing session:

  • 08:38:12 EDT: deferred compaction worked normally
    • compactLeafAsync start
    • LCM compaction leaf pass (normal): 112071 -> 76263
    • maintain: deferred compaction completed ... reason=compacted
  • 08:41:33 EDT: bootstrap slow-path reconcile imported missing tail messages
    • reconcileSessionTail ... missingTail=12 importedMessages=12
    • then immediately:
      • maintain: deferred compaction debt pending ... but host runtimeContext.allowDeferredCompactionExecution is disabled
      • rawTokensOutsideTail=60243
      • later 60887, 62169, 62438
      • maintain: deferred compaction skipped ... reason=deferred compaction still needed

I traced the current main code and found that background deferral only happens for reason === "turn", while allowDeferredCompactionExecution is only true when executionMode === "background":

  • src/agents/pi-embedded-runner/context-engine-maintenance.ts
    • const executionMode = params.executionMode ?? "foreground";
    • const shouldDefer = params.reason === "turn" && ... turnMaintenanceMode === "background";
    • allowDeferredCompactionExecution: params.executionMode === "background"

That appears to leave bootstrap/reconcile able to create compaction debt in foreground without scheduling the background follow-up needed to consume it.
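
To make the gap concrete, here is a minimal sketch of the gating as I read it. It is a simplified model, not the actual code: the conditions elided with "..." in shouldDefer are left out, the "bootstrap" reason value is an assumption (only "turn" appears in the source), turnMaintenanceMode is passed in as a parameter for illustration, and evaluateGate / debtCanBeConsumed are hypothetical names.

```typescript
// Hypothetical model of the gating described above. Only executionMode,
// reason, turnMaintenanceMode and allowDeferredCompactionExecution come
// from the source; every other name is illustrative.
type ExecutionMode = "foreground" | "background";
type MaintenanceReason = "turn" | "bootstrap"; // "bootstrap" is an assumed value

interface MaintenanceParams {
  reason: MaintenanceReason;
  executionMode?: ExecutionMode;
  turnMaintenanceMode: ExecutionMode; // assumed to be available to the caller
}

interface GateResult {
  shouldDefer: boolean;
  allowDeferredCompactionExecution: boolean;
  // True when this call can either run the debt itself or hands it to a
  // background follow-up that can.
  debtCanBeConsumed: boolean;
}

function evaluateGate(params: MaintenanceParams): GateResult {
  const executionMode = params.executionMode ?? "foreground";
  // The real condition has additional terms that are elided in the issue.
  const shouldDefer =
    params.reason === "turn" && params.turnMaintenanceMode === "background";
  const allowDeferredCompactionExecution = executionMode === "background";
  return {
    shouldDefer,
    allowDeferredCompactionExecution,
    debtCanBeConsumed: shouldDefer || allowDeferredCompactionExecution,
  };
}

// A foreground bootstrap/reconcile call: no deferral, no permission to execute.
const bootstrapGate = evaluateGate({
  reason: "bootstrap",
  turnMaintenanceMode: "background",
});
console.log(bootstrapGate.debtCanBeConsumed); // false

// A turn-triggered call with background turn maintenance gets a follow-up.
const turnGate = evaluateGate({ reason: "turn", turnMaintenanceMode: "background" });
console.log(turnGate.debtCanBeConsumed); // true
```

Under these assumptions, any non-turn caller that runs in the default foreground mode can create debt but has no path to consume it, which matches the 08:41:33 sequence.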

Case 2: hot-cache policy leaves debt recorded but unpaid

Observed on April 16, 2026 in a background scheduled session:

  • 10:01:44 EDT:
    • rawTokensOutsideTail=40253
    • threshold=40000
    • shouldCompact=false
    • reason=hot-cache-budget-headroom
    • deferred compaction debt recorded
  • 10:01:45 EDT:
    • background turn maintenance was queued
    • maintenance then logged:
      • maintain: deferred compaction skipped ... reason=deferred compaction still needed

So this session crossed the configured leaf trigger, recorded debt, and still did not compact because the hot-cache headroom policy took precedence. That may be intentional, but from an operator perspective it looks like debt can remain sticky even when background maintenance wakes up and runs successfully.
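
A rough sketch of why the debt looks sticky, assuming the maintenance pass re-evaluates the same headroom policy that caused the debt to be recorded. Everything here is a hypothetical model: SessionState, hotCacheHasHeadroom, decideCompaction and maintainOnce are illustrative names, the "below-threshold" and "threshold" reason strings are placeholders, and the >= comparison against the threshold is an assumption. Only reason=hot-cache-budget-headroom and the 40253 / 40000 values come from the logs.

```typescript
// Hypothetical model of the hot-cache case; not OpenClaw's actual API.
interface SessionState {
  rawTokensOutsideTail: number;
  leafThreshold: number;
  hotCacheHasHeadroom: boolean; // assumed input to the headroom policy
  debtRecorded: boolean;
}

interface Decision {
  shouldCompact: boolean;
  reason: string;
}

function decideCompaction(s: SessionState): Decision {
  if (s.rawTokensOutsideTail < s.leafThreshold) {
    return { shouldCompact: false, reason: "below-threshold" }; // placeholder reason
  }
  if (s.hotCacheHasHeadroom) {
    // The policy wins over the threshold, as in the 10:01:44 log line.
    return { shouldCompact: false, reason: "hot-cache-budget-headroom" };
  }
  return { shouldCompact: true, reason: "threshold" }; // placeholder reason
}

function maintainOnce(s: SessionState): { next: SessionState; decision: Decision } {
  const decision = decideCompaction(s);
  if (decision.shouldCompact) {
    // Simplification: a successful compaction clears both the debt and the raw tokens.
    return { next: { ...s, rawTokensOutsideTail: 0, debtRecorded: false }, decision };
  }
  const overThreshold = s.rawTokensOutsideTail >= s.leafThreshold;
  return { next: { ...s, debtRecorded: s.debtRecorded || overThreshold }, decision };
}

// Values from the 10:01:44 log line.
let state: SessionState = {
  rawTokensOutsideTail: 40253,
  leafThreshold: 40000,
  hotCacheHasHeadroom: true,
  debtRecorded: false,
};
for (let pass = 0; pass < 3; pass++) {
  state = maintainOnce(state).next;
}
console.log(state.debtRecorded); // true
```

In this model the debt can only clear once the headroom condition itself flips, so a successful maintenance wake-up changes nothing while the policy still holds.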

Steps to reproduce

I do not yet have a synthetic minimized repro, but the observed sequences were:

Reconcile case

  1. Start a live session with LCM / context-engine maintenance enabled.
  2. Let the session compact normally at least once.
  3. Trigger a slow-path bootstrap/reconcile that imports missing session-tail messages.
  4. Observe rawTokensOutsideTail jump above the leaf trigger.
  5. Observe that maintenance reports deferred compaction debt but also says runtimeContext.allowDeferredCompactionExecution is disabled.

Hot-cache case

  1. Use a session with large history and hot cache state.
  2. Let rawTokensOutsideTail approach and then cross the configured leaf trigger.
  3. Observe debt recorded.
  4. Let background turn maintenance queue and run.
  5. Observe that maintenance still skips compaction: the compaction decision logs reason=hot-cache-budget-headroom and maintenance logs reason=deferred compaction still needed.

OpenClaw version

Not yet confirmed from the affected instance. Observed on April 16, 2026 against a Linux npm-global install. The code path on main as of April 16, 2026 still matches the observed behavior.

Operating system

Linux

Install method

npm global / containerized runtime

Model

Multiple sessions; not isolated to one model. The operator-visible symptom is in context-engine / LCM maintenance.

Provider / routing chain

OpenClaw with lossless-claw

Logs and evidence

Representative log excerpts:

2026-04-16 08:38:12 EDT
[lcm] compactLeafAsync start ...
[lcm] LCM compaction leaf pass (normal): 112071 -> 76263 ...
[lcm] maintain: deferred compaction completed ... reason=compacted

2026-04-16 08:41:33 EDT
[lcm] reconcileSessionTail: slow path ... missingTail=12 importedMessages=12
[lcm] maintain: deferred compaction debt pending ... but host runtimeContext.allowDeferredCompactionExecution is disabled
[lcm] incremental compaction decision: ... rawTokensOutsideTail=60243 ... reason=hot-cache-budget-headroom
[lcm] maintain: deferred compaction skipped ... reason=deferred compaction still needed

2026-04-16 10:01:44 EDT
[lcm] incremental compaction decision: ... rawTokensOutsideTail=40253 threshold=40000 shouldCompact=false ... reason=hot-cache-budget-headroom
[lcm] deferred compaction debt recorded ...

2026-04-16 10:01:45 EDT
[context-engine] deferred turn maintenance queued ...
[lcm] maintain: deferred compaction skipped ... reason=deferred compaction still needed
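
For anyone checking their own logs for the same pattern, a small scanner along these lines may help. It is only a sketch: scanForStrandedDebt and DebtScan are hypothetical names, the matching relies on the message fragments visible in the excerpts above, and it treats the input as a single session because the session identifiers are elided in these excerpts.

```typescript
// Hypothetical helper: flags debt that was recorded or reported pending and
// then followed by skips rather than a completed compaction.
interface DebtScan {
  debtOutstanding: boolean;
  skipsSinceDebt: number;
}

function scanForStrandedDebt(lines: string[]): DebtScan {
  let debtOutstanding = false;
  let skipsSinceDebt = 0;
  for (const line of lines) {
    if (
      line.includes("deferred compaction debt recorded") ||
      line.includes("deferred compaction debt pending")
    ) {
      debtOutstanding = true;
    } else if (line.includes("deferred compaction completed")) {
      // A completed compaction pays the debt down and resets the counter.
      debtOutstanding = false;
      skipsSinceDebt = 0;
    } else if (debtOutstanding && line.includes("deferred compaction skipped")) {
      skipsSinceDebt += 1;
    }
  }
  return { debtOutstanding, skipsSinceDebt };
}

// The maintain / debt lines from the excerpts above, in order.
const excerpt: string[] = [
  "[lcm] maintain: deferred compaction completed ... reason=compacted",
  "[lcm] maintain: deferred compaction debt pending ... but host runtimeContext.allowDeferredCompactionExecution is disabled",
  "[lcm] maintain: deferred compaction skipped ... reason=deferred compaction still needed",
  "[lcm] deferred compaction debt recorded ...",
  "[lcm] maintain: deferred compaction skipped ... reason=deferred compaction still needed",
];
console.log(scanForStrandedDebt(excerpt)); // { debtOutstanding: true, skipsSinceDebt: 2 }
```

A non-zero skipsSinceDebt with debtOutstanding still true is the stranded state described in both cases.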

Metadata

    Labels

    • P1 - High-priority user-facing bug, regression, or broken workflow.
    • clawsweeper:fix-shape-clear - ClawSweeper found a clear likely implementation shape for this issue.
    • clawsweeper:queueable-fix - ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
    • clawsweeper:source-repro - ClawSweeper found a high-confidence source-level issue reproduction.
    • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
    • issue-rating: 🦞 diamond lobster - Very strong issue quality with high-confidence source-level or clear reproduction.
    • stale - Marked as stale due to inactivity
