[Bug]: Telegram async exec followup leaks HEARTBEAT_OK/internal text after context overflow #74257

@Staruy

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

In OpenClaw 2026.4.26, a Telegram topic session that hit context overflow during an approved exec/tool flow leaked heartbeat/internal async-completion text into the user-visible chat and then produced expired approval/resume errors.

Steps to reproduce

NOT_ENOUGH_INFO

Expected behavior

Internal async completion / heartbeat-only followups should be consumed by the runtime and should not be delivered as normal user-visible Telegram messages. Context overflow auto-compaction should preserve the in-flight turn state without reclassifying internal followup prompts as normal user input.
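The suppression described above can be illustrated with a minimal sketch. Every name in it (`TurnOrigin`, `OutboundReply`, `shouldDeliver`) is hypothetical and chosen for illustration; none is taken from the OpenClaw code base, and the real delivery path may be structured quite differently. The only detail drawn from this report is the `HEARTBEAT_OK` token.

```typescript
// Hypothetical types for illustration only; not OpenClaw's actual API.
type TurnOrigin = "user" | "heartbeat" | "async-exec-followup";

interface OutboundReply {
  origin: TurnOrigin; // what triggered the turn that produced this reply
  text: string;       // final assistant text for that turn
}

// Token observed leaking into the chat in this report.
const HEARTBEAT_TOKEN = "HEARTBEAT_OK";

// Returns true when the reply should be sent to the user-visible chat.
function shouldDeliver(reply: OutboundReply): boolean {
  const trimmed = reply.text.trim();
  // Nothing to show.
  if (trimmed.length === 0) return false;
  // A bare heartbeat acknowledgement is never user-facing,
  // regardless of which path produced it.
  if (trimmed === HEARTBEAT_TOKEN) return false;
  // Replies produced by heartbeat turns stay internal.
  if (reply.origin === "heartbeat") return false;
  return true;
}

console.log(shouldDeliver({ origin: "async-exec-followup", text: "HEARTBEAT_OK" })); // false
console.log(shouldDeliver({ origin: "user", text: "Build finished." }));             // true
```

Checking the text as well as the origin is deliberate in this sketch: in the incident, the reply was reportedly not recognized as a heartbeat run, so an origin check alone would not have caught it.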

Actual behavior

A Telegram topic session entered context-overflow recovery during a tool/exec flow, then OpenClaw logged long session locks, expired Telegram approval callbacks, expired approval IDs, lane waits, and stuck-session diagnostics. The user-visible chat received internal/no-op content such as HEARTBEAT_OK and async completion/followup narration that was intended to be handled silently.

OpenClaw version

2026.4.26 (build be8c246)

Operating system

Ubuntu 24.04.4 LTS

Install method

npm global

Model

openai-codex/gpt-5.5

Provider / routing chain

OpenClaw gateway -> openai-codex -> gpt-5.5

Additional provider/model setup details

Affected surface was a Telegram topic/thread session for a topic-specific agent. Identifiers below are redacted; no chat IDs, session UUIDs, hostnames, IP addresses, raw prompts, tokens, or user names are included.

Logs, screenshots, and evidence

# Current state after the incident
gateway service: active/running
gateway RPC: ok
active tasks: 0

# Observed incident window, local time redacted to minute precision
[context-overflow-diag] sessionKey=<redacted-telegram-topic-session> provider=openai-codex/gpt-5.5 source=assistantError messages=24 compactionAttempts=0 observedTokens=unknown error=Context overflow: estimated context size exceeds safe threshold during tool loop.
context overflow detected (attempt 1/3); attempting auto-compaction

[session-write-lock] releasing lock held for 26601ms (max=15000ms): <openclaw-agent-sessions>/sessions.json.lock
Telegram callback answer failed: 400 Bad Request: query is too old and response timeout expired or query ID is invalid
exec.approval.waitDecision 37149ms
exec.approval.resolve failed: unknown or expired approval id
plugin.approval.resolve failed: unknown or expired approval id
callback handler failed: TelegramRetryableCallbackError: GatewayClientRequestError: unknown or expired approval id

[compaction] rotated active transcript after compaction (sessionKey=<redacted-telegram-topic-session>)
auto-compaction succeeded for openai-codex/gpt-5.5; retrying prompt

[diagnostic] lane wait exceeded: lane=session:<redacted-telegram-topic-session> waitedMs=115269 queueAhead=1

[context-overflow-diag] sessionKey=<redacted-telegram-topic-session> provider=openai-codex/gpt-5.5 source=assistantError messages=45 compactionAttempts=0 observedTokens=unknown error=Context overflow: estimated context size exceeds safe threshold during tool loop.
[session-write-lock] releasing lock held for 21911ms (max=15000ms): <openclaw-agent-sessions>/sessions.json.lock
[compaction] rotated active transcript after compaction (sessionKey=<redacted-telegram-topic-session>)
auto-compaction succeeded for openai-codex/gpt-5.5; retrying prompt

[diagnostic] lane wait exceeded: lane=session:<redacted-telegram-topic-session> waitedMs=210096 queueAhead=1

# User-visible agent output during the same incident stated that HEARTBEAT_OK was delivered through the ordinary answer path:
# "HEARTBEAT_OK came not from normal heartbeat delivery; async command completed relay output entered ordinary agent path; model answered HEARTBEAT_OK; because not recognized as heartbeat-run it was sent as ordinary final answer."

[context-overflow-diag] sessionKey=<redacted-telegram-topic-session> provider=openai-codex/gpt-5.5 source=assistantError messages=57 compactionAttempts=0 observedTokens=unknown error=Context overflow: estimated context size exceeds safe threshold during tool loop.
[session-write-lock] releasing lock held for 20517ms (max=15000ms): <openclaw-agent-sessions>/sessions.json.lock
[compaction] rotated active transcript after compaction (sessionKey=<redacted-telegram-topic-session>)
auto-compaction succeeded for openai-codex/gpt-5.5; retrying prompt

[agent/embedded] embedded run agent end: isError=true model=gpt-5.5 provider=openai-codex error=One of "input" or "previous_response_id"or 'prompt'or 'conversation_id' must be provided.
session file repaired: rewrote 1 assistant message(s), dropped 1 blank user message(s) (<redacted-session-file>)

[diagnostic] stuck session: sessionId=<redacted-agent> sessionKey=<redacted-telegram-topic-session> state=processing age=138s queueDepth=1
[diagnostic] stuck session: sessionId=<redacted-agent> sessionKey=<redacted-telegram-topic-session> state=processing age=168s queueDepth=1
[diagnostic] stuck session: sessionId=<redacted-agent> sessionKey=<redacted-telegram-topic-session> state=processing age=198s queueDepth=1

# A later user-visible agent message described the same failure class:
# approved exec -> command finished -> OpenClaw tries to resume agent session -> resume/followup fails -> service text is sent to Telegram

Related existing issues found before filing:

This report adds a Telegram topic/approval-expiry/context-overflow instance of the same general failure area.

Impact and severity

Affected: Telegram topic/thread sessions using a topic-specific agent with exec/tool approval flows.

Severity: High for affected sessions. The gateway stayed up, but the topic session became delayed/stuck for several minutes and exposed internal orchestration text to end users.

Frequency: Observed once in this inspected incident window, with three context-overflow/compaction cycles and multiple stuck-session diagnostics in the same session.

Consequence: User-visible channel received internal heartbeat/followup content, approval actions expired, session state required auto-repair, and the affected topic appeared broken even though the gateway process and RPC remained healthy.
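For anyone triaging a similar incident, the lock overruns in the log excerpt above can be tallied mechanically. The sketch below is an assumption-laden helper, not part of OpenClaw: the function names `lockOverruns` and `totalOverrunMs` are hypothetical, and the regular expression only matches the `[session-write-lock]` line shape shown in this report.

```typescript
// Matches lines such as:
// [session-write-lock] releasing lock held for 26601ms (max=15000ms): ...
const LOCK_RE = /releasing lock held for (\d+)ms \(max=(\d+)ms\)/;

interface LockOverrun {
  heldMs: number; // how long the lock was held
  maxMs: number;  // configured maximum reported in the log line
  overMs: number; // heldMs - maxMs
}

// Hypothetical helper: extract every lock release that exceeded its maximum.
function lockOverruns(logText: string): LockOverrun[] {
  const out: LockOverrun[] = [];
  for (const line of logText.split("\n")) {
    const m = LOCK_RE.exec(line);
    if (!m) continue;
    const heldMs = Number(m[1]);
    const maxMs = Number(m[2]);
    if (heldMs > maxMs) out.push({ heldMs, maxMs, overMs: heldMs - maxMs });
  }
  return out;
}

// Hypothetical helper: total time spent over the limit.
function totalOverrunMs(overruns: LockOverrun[]): number {
  return overruns.reduce((sum, o) => sum + o.overMs, 0);
}

// The three lock lines from this report, copied verbatim.
const sample = [
  "[session-write-lock] releasing lock held for 26601ms (max=15000ms): <openclaw-agent-sessions>/sessions.json.lock",
  "[session-write-lock] releasing lock held for 21911ms (max=15000ms): <openclaw-agent-sessions>/sessions.json.lock",
  "[session-write-lock] releasing lock held for 20517ms (max=15000ms): <openclaw-agent-sessions>/sessions.json.lock",
].join("\n");

const overruns = lockOverruns(sample);
console.log(overruns.map((o) => o.overMs)); // [ 11601, 6911, 5517 ]
console.log(totalOverrunMs(overruns));      // 24029
```

Each of the three compaction cycles held the session lock roughly 5.5 to 11.6 seconds beyond the 15-second maximum, which is consistent with the expired approval callbacks seen in the same window.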

Additional information

The gateway was still healthy after the incident (active/running, RPC ok), so this does not appear to be a systemd or process-crash issue. The failure seems localized to the Telegram session turn lifecycle around context overflow, auto-compaction, approved exec followup, and heartbeat/no-op delivery suppression.
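One way to picture the lifecycle concern is a retry step that replaces the transcript but carries the in-flight turn's metadata through untouched. This is a sketch under stated assumptions: `InFlightTurn`, `Transcript`, `compact`, and `retryAfterCompaction` are hypothetical names, and the "keep the last N messages" compaction is a stand-in, not OpenClaw's actual compaction strategy.

```typescript
// Hypothetical types for illustration only; not OpenClaw's actual API.
type TurnOrigin = "user" | "heartbeat" | "async-exec-followup";

interface InFlightTurn {
  origin: TurnOrigin; // must survive compaction unchanged
  prompt: string;
  attempt: number;
}

interface Transcript {
  messages: string[];
}

// Stand-in for compaction: keep only the most recent `keep` messages.
function compact(t: Transcript, keep: number): Transcript {
  return { messages: keep > 0 ? t.messages.slice(-keep) : [] };
}

// Retry after a context overflow. The transcript is rotated, but the turn
// keeps its origin and prompt, so an internal followup is not reclassified
// as ordinary user input on the retried attempt.
function retryAfterCompaction(
  turn: InFlightTurn,
  transcript: Transcript,
  keep: number,
): { turn: InFlightTurn; transcript: Transcript } {
  return {
    turn: { ...turn, attempt: turn.attempt + 1 },
    transcript: compact(transcript, keep),
  };
}

const followup: InFlightTurn = {
  origin: "async-exec-followup",
  prompt: "exec finished",
  attempt: 1,
};
const next = retryAfterCompaction(followup, { messages: ["a", "b", "c", "d"] }, 2);
console.log(next.turn.origin);         // async-exec-followup
console.log(next.turn.attempt);        // 2
console.log(next.transcript.messages); // [ 'c', 'd' ]
```

Keeping the origin on the turn object rather than inferring it from transcript contents means a rotated or repaired transcript cannot change how the reply is routed.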

No raw session transcript, raw prompt text, chat IDs, session IDs, hostnames, IP addresses, auth material, or user-identifying data are included in this report.

Metadata

Assignees

No one assigned

    Labels

    P1 - High-priority user-facing bug, regression, or broken workflow.
    impact:message-loss - Channel message delivery can be lost, duplicated, or misrouted.
    impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
