fix(compaction): count human-linked tool activity as real conversation to prevent false cancellations #40727

@rogerc66

Description

Bug type

Regression (worked before, now fails)

Summary

In heartbeat-heavy, tool-dense sessions, the compaction safeguard incorrectly reports "no real conversation" and silently cancels compaction, so the session keeps growing until it hits the 200K-token context ceiling.

Steps to reproduce

  1. Run a Discord thread session with frequent heartbeat polls (every 30 min)
  2. Session accumulates heartbeat HEARTBEAT_OK replies and tool calls over 12+ hours
  3. Session reaches ~150K+ tokens
  4. Observe gateway logs: [compaction-safeguard] Cancelling compaction: no real conversation messages
  5. Session continues growing past 200K token limit
  6. Discord listener stalls (364+ seconds)
  7. Agent goes silent

Expected behavior

Compaction runs successfully when context grows large. Sessions with real human messages AND tool activity should be compacted, not cancelled.

Actual behavior

Compaction is silently cancelled with log: Cancelling compaction: no real conversation messages. Session grows past 200K tokens. Discord listener freezes for 364+ seconds. Agent goes completely silent until manual restart.

OpenClaw version

2026.3.7

Operating system

macOS 15.2 (arm64, Mac Mini M4)

Install method

npm global

Logs, screenshots, and evidence

Gateway log: `[compaction-safeguard] Cancelling compaction: no real conversation messages`
Discord log: `[discord] Slow listener: 364.7 seconds for MESSAGE_CREATE`
Session context: 200000 tokens (ceiling hit)

Root cause traced to `isRealConversationMessage()` in dist/compact-B247y5Qt.js (L13967 + L93405):

function isRealConversationMessage(msg) {
  return msg.role==="user" || msg.role==="assistant" || msg.role==="toolResult";
}

This check looks only at the role and never at the content. Heartbeat polls have empty (stripped) content and HEARTBEAT_OK replies are boilerplate; both pass the role check even though neither carries any real conversational intent.
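To make the role-only behaviour concrete, here is a minimal sketch that copies the predicate quoted above and runs it over three sample messages. The `Msg` type and the sample messages are assumptions made up for illustration; they are not OpenClaw's real types or real session data.

```typescript
// Hypothetical minimal message shape; the real OpenClaw types may differ.
type Msg = { role: string; content: string };

// The role-only predicate, as quoted from the dist bundle above.
function isRealConversationMessage(msg: Msg): boolean {
  return msg.role === "user" || msg.role === "assistant" || msg.role === "toolResult";
}

// Illustrative samples: a stripped heartbeat poll, its boilerplate reply,
// and a genuine human question.
const heartbeatPoll: Msg = { role: "user", content: "" };
const heartbeatReply: Msg = { role: "assistant", content: "HEARTBEAT_OK" };
const humanQuestion: Msg = { role: "user", content: "Can you check the build status?" };

// Content is never inspected, so all three messages get the same verdict.
const verdicts = [heartbeatPoll, heartbeatReply, humanQuestion].map(isRealConversationMessage);
console.log(verdicts); // [ true, true, true ]
```

The predicate cannot tell a heartbeat exchange from a human message, which is why the proposed fix below inspects content as well as role.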

Impact and severity

Affected: Any Discord thread session with frequent heartbeats + tool activity running 12+ hours
Severity: High (agent goes completely silent, requires manual restart)
Frequency: Reproducible in long-running thread sessions
Consequence: Agent loses all context, misses user messages, requires manual /new or gateway restart

Additional information

Proposed fix (reviewed and approved by independent reviewer):

Replace naive role check with content-aware detection:

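const BOILERPLATE = ['HEARTBEAT_OK', 'NO_REPLY'];

// True when the message carries text other than whitespace or a boilerplate token.
function isMeaningfulText(msg): boolean {
  const raw = (typeof msg.content === 'string' ? msg.content
    : (msg.content ?? []).map(b => b?.text ?? '').join('')).trim();
  if (!raw) return false;
  return !BOILERPLATE.some(t => raw === t || raw.startsWith(t + '\n'));
}

// window: the full message list; idx: position of msg in it (falls back to indexOf).
function isRealConversationMessage(msg, window = [], idx?): boolean {
  if (msg.role === 'user' && isMeaningfulText(msg)) return true;
  if (msg.role === 'assistant' && isMeaningfulText(msg)) return true;
  if (msg.role === 'toolResult') {
    const i = idx ?? window.indexOf(msg);
    // indexOf returns -1 when msg is not in window; without this guard the
    // slice below would scan almost the whole list instead of the prior 20.
    if (i < 0) return false;
    // A tool result counts only if a meaningful user message precedes it
    // within the previous 20 messages.
    const prior = window.slice(Math.max(0, i - 20), i);
    if (prior.some(m => m.role === 'user' && isMeaningfulText(m))) return true;
  }
  return false;
}

// Callers: pass the list and the index
msgs.some((m, i) => isRealConversationMessage(m, msgs, i))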
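As a rough check of how the proposed predicate behaves, the sketch below restates it in self-contained, typed form and runs it over two made-up transcripts. The `Msg` and `Block` types, the `LOOKBACK` constant name, the `hasRealConversation` helper, and the sample messages are all assumptions for illustration, and the parameter is named `history` here only to avoid shadowing the global `window`.

```typescript
// Hypothetical message shapes for illustration; OpenClaw's real types may differ.
type Block = { text?: string };
type Msg = { role: string; content?: string | Block[] };

const BOILERPLATE = ["HEARTBEAT_OK", "NO_REPLY"];
const LOOKBACK = 20; // same window size as the proposed fix

function isMeaningfulText(msg: Msg): boolean {
  const raw = (typeof msg.content === "string"
    ? msg.content
    : (msg.content ?? []).map((b) => b?.text ?? "").join("")
  ).trim();
  if (!raw) return false;
  return !BOILERPLATE.some((t) => raw === t || raw.startsWith(t + "\n"));
}

function isRealConversationMessage(msg: Msg, history: Msg[] = [], idx?: number): boolean {
  if ((msg.role === "user" || msg.role === "assistant") && isMeaningfulText(msg)) return true;
  if (msg.role === "toolResult") {
    const i = idx ?? history.indexOf(msg);
    if (i < 0) return false; // not part of history: nothing to link it to
    const prior = history.slice(Math.max(0, i - LOOKBACK), i);
    return prior.some((m) => m.role === "user" && isMeaningfulText(m));
  }
  return false;
}

// Hypothetical caller-side helper mirroring the `.some(...)` call above.
const hasRealConversation = (msgs: Msg[]): boolean =>
  msgs.some((m, i) => isRealConversationMessage(m, msgs, i));

// Transcript with only heartbeat traffic and an unlinked tool result.
const heartbeatOnly: Msg[] = [
  { role: "user", content: "" },
  { role: "assistant", content: "HEARTBEAT_OK" },
  { role: "toolResult", content: [{ text: "status: ok" }] },
];

// Transcript where a human request is followed by tool activity.
const humanLinked: Msg[] = [
  { role: "user", content: "" },
  { role: "assistant", content: "HEARTBEAT_OK" },
  { role: "user", content: "Please check the build status" },
  { role: "assistant", content: "" },
  { role: "toolResult", content: [{ text: "build: passing" }] },
];

console.log(hasRealConversation(heartbeatOnly)); // false
console.log(hasRealConversation(humanLinked)); // true
```

In this sketch, the tool result in `humanLinked` counts as real conversation only because a meaningful user message sits within the 20 messages before it; the tool result in `heartbeatOnly` does not.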

Files to change: src/agents/pi-extensions/compaction-safeguard.ts + src/agents/pi-embedded-runner/compact.ts

Session incident: 2026-03-09, Discord thread session running since 2026-03-07T07:12Z, hit 200K ceiling.

Metadata

Assignees

No one assigned

Labels

bug, regression
