[Bug]: Telegram forum topic loses ACP/OpenCode routing after heavy bound turn; topic recovers only after gateway restart and then fails again under load #44492

@blghtr

Bug type

Behavior bug (incorrect output/state without crash)

Summary

I’m seeing a topic-local failure in OpenClaw’s Telegram ACP thread binding.

Environment:

  • Host/runtime: OpenClaw Gateway running locally on Linux (WSL2, kernel 5.15), Node.js v22.22.1; gateway service is systemd managed and reported as running (openclaw status).
  • OpenClaw version/channel: stable channel, app/npm latest reported as 2026.3.11 (openclaw status).
  • Transport: Telegram bot channel enabled, using forum topics in group -1003351905082 (Waggelgroep); issue reproduced in topic context (this thread is topic 1).
  • Telegram routing config: channels.telegram.threadBindings.enabled=true and channels.telegram.threadBindings.spawnAcpSessions=true in ~/.openclaw/openclaw.json.
  • Group/topic policy: Telegram group policy is allowlisted (groupPolicy=allowlist), messages from authorized sender 5558998798; routing is topic-aware (forum thread context preserved).
  • ACP/OpenCode path: ACP runtime/plugin is enabled (acpx enabled in config); sessions are persisted under ~/.openclaw/agents/opencode/sessions/.
  • Session evidence of topic isolation: persisted OpenCode session metadata includes topic-scoped keys (groupId values like -1000000005082:topic:<id>) and explicit threadId, confirming per-topic routing context instead of global group routing.
  • Operational symptom: after high-volume or long bound turns in a topic, follow-up messages in the same topic stop routing to the bound ACP/OpenCode session. Restarting the gateway restores routing temporarily (slash commands work), but the very next message breaks it again.
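
To rule out a configuration problem before digging into logs, the two thread-binding flags listed above can be checked programmatically. This is a minimal sketch, assuming the config has already been parsed from ~/.openclaw/openclaw.json; the `OpenClawConfigView` type and `threadBindingProblems` helper are hypothetical names for illustration, and only the key paths come from this report.

```typescript
// Hypothetical minimal view of ~/.openclaw/openclaw.json.
// Only the keys named in this report are modeled; the real schema may contain more.
interface OpenClawConfigView {
  channels?: {
    telegram?: {
      threadBindings?: {
        enabled?: boolean;
        spawnAcpSessions?: boolean;
      };
    };
  };
}

// Returns a list of human-readable problems; an empty list means both flags are set.
function threadBindingProblems(config: OpenClawConfigView): string[] {
  const bindings = config.channels?.telegram?.threadBindings;
  const problems: string[] = [];
  if (bindings?.enabled !== true) {
    problems.push("channels.telegram.threadBindings.enabled is not true");
  }
  if (bindings?.spawnAcpSessions !== true) {
    problems.push("channels.telegram.threadBindings.spawnAcpSessions is not true");
  }
  return problems;
}
```

In my setup both flags are true, so a check like this would return an empty list, which points away from configuration as the cause.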

Strongest evidence from logs

  1. Message reaches Telegram gateway path

    • OpenClaw logs raw Telegram updates for the bound topic, including ordinary follow-up messages like ".".
  2. Message reaches ACP/OpenCode

    • OpenCode loads the persistent session, accepts POST /session/.../message, starts session.prompt step=0, and resolves tools.
  3. OpenCode remains alive after the topic appears dead

    • OpenCode continues emitting message.part.updated, message.part.delta, and tool/subagent activity after the handoff point.
  4. Telegram side wedges

    • OpenClaw logs "typing TTL exceeded (60000ms), auto-stopping typing indicator" instead of delivering a normal, usable completion to the topic.
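
The four observations above amount to a recognizable signature: OpenClaw logs the typing timeout while OpenCode keeps emitting events afterwards. The following is a rough sketch of how that signature could be detected once both logs are merged into one timeline. The `LogEntry` shape and the `showsWedgeSignature` helper are hypothetical; the real log formats would need to be parsed into this form first. Only the matched substrings come from the logs described in this report.

```typescript
// Hypothetical normalized log entry; not the actual OpenClaw or OpenCode log format.
interface LogEntry {
  ts: number;                      // timestamp in epoch milliseconds
  source: "openclaw" | "opencode"; // which process wrote the line
  text: string;                    // raw log message
}

// Substrings taken from the log lines quoted in this report.
const TYPING_TIMEOUT = "typing TTL exceeded";
const OPENCODE_ACTIVITY = ["message.part.updated", "message.part.delta"];

// Returns true when OpenCode kept emitting events after OpenClaw
// logged the typing timeout, i.e. the session was alive while the topic looked dead.
function showsWedgeSignature(entries: LogEntry[]): boolean {
  const timeout = entries.find(
    (e) => e.source === "openclaw" && e.text.includes(TYPING_TIMEOUT),
  );
  if (!timeout) return false;
  return entries.some(
    (e) =>
      e.source === "opencode" &&
      e.ts > timeout.ts &&
      OPENCODE_ACTIVITY.some((marker) => e.text.includes(marker)),
  );
}
```

A positive result only shows that the two conditions co-occur in the timeline; it does not by itself identify where the outbound path stalls.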

Correlation with heavy turns

This seems much more likely on complex turns, especially when sub-agent/task delegation is involved. In the logs, heavier runs create a developer subagent session and generate a denser nested event stream. I’m treating this as a correlation, not definitive proof of cause.

Likely non-causes

  • This does not look like “OpenCode never launched”. There are logs proving the bound session received the message and continued processing.
  • This does not look like simple agent-permission inheritance from the orchestrator. The developer subagent is created and proceeds with its own work; what is denied there is further task delegation, not the whole execution path.
  • A separate Telegram chunking bug with --- exists, but that is a different issue; in my case, the more severe failure persists even after isolating around that. Telegram channel settings support different chunking/streaming modes, so this appears distinct from the already-known delivery fragility around formatting/preview. ([OpenClaw][1])

Working theory

This looks like a bug in the Telegram topic-bound ACP bridge/routing layer inside OpenClaw:

  • inbound topic message is accepted,
  • bound ACP session receives and processes it,
  • but outbound propagation or topic-local routing state wedges under heavier nested event streams,
  • and after that the topic may stop reaching OpenClaw at all until gateway restart.

Steps to reproduce

Minimal repro shape

  1. Bind an ACP/OpenCode session to a Telegram forum topic using /acp spawn ... --thread here|auto. This is documented behavior for Telegram topics. ([OpenClaw][1])
  2. Run a heavier task that may spawn sub-agents or produce a richer stream.
  3. The topic may stop producing usable output.
  4. Gateway restart makes the topic visible again.
  5. A subsequent heavy turn can wedge it again.

Expected behavior

In a bound Telegram forum topic:

  1. A normal follow-up message should be accepted by the gateway.
  2. The bound ACP/OpenCode session should receive and process it.
  3. The resulting event stream should continue to be propagated back into the same Telegram topic.
  4. If sub-agents are spawned, they should complete and their result should be relayed back normally; OpenClaw documents sub-agents as separate runs that announce results back to the requester chat channel. ([OpenClaw][2])

Actual behavior

There are two related failure modes.

First, on heavy turns, the topic often wedges:

  • Telegram update is received by OpenClaw.
  • The bound OpenCode session receives the message and starts processing it.
  • OpenCode continues to emit events and logs after handoff.
  • But the Telegram topic stops getting usable output and eventually only shows typing TTL exceeded.
    Evidence from logs: raw Telegram update for the topic is received; later only typing timeout is logged on the OpenClaw side, while OpenCode logs show the same session accepting the message, starting session.prompt, and continuing to emit message.part.updated / message.part.delta and tool activity.

Second, after the topic “falls over”, it appears to become topic-local dead:

  • Other chats/topics still work.
  • The failed topic no longer seems to reach OpenClaw until gateway restart.
  • After restart, the same topic can see its own ACP session again, but heavy work can make it fail again.

This topic-local character is consistent with Telegram topic binding being scoped to a specific topic peer rather than globally. ([OpenClaw][1])

OpenClaw version

2026.3.11

Operating system

Linux (WSL2, kernel 5.15)

Install method

npm global

Model

any

Provider / routing chain

Effective request path through gateways, proxies, providers, or model routers:

  1. User posts in Telegram forum topic (chat_id=-1003351905082, topic_id=<topic>).
  2. Telegram Bot API delivers update to local OpenClaw Telegram plugin/account (plastinka).
  3. OpenClaw channel router resolves group+topic context and applies thread binding (channels.telegram.threadBindings.enabled=true).
  4. OpenClaw maps topic context to ACP/OpenCode session (topic-scoped session key / groupId ...:topic:<id> with threadId in persisted session metadata).
  5. Request is forwarded inside local OpenClaw Gateway (ws://127.0.0.1:18789, gateway mode local, bind lan) to the target agent runtime.
  6. ACP plugin (acpx, enabled) executes session-bound agent workflow and dispatches model calls via configured model stack.
  7. Model routing uses agent defaults: primary openai-codex/gpt-5.3-codex, fallback zai/glm-4.7 (router/fallback selection is internal to OpenClaw model layer).
  8. Response returns through the same chain in reverse: model -> ACP runtime -> OpenClaw session/topic router -> Telegram plugin -> Telegram topic reply.

Notes:

  • No external reverse proxy is configured in this path from available local config/status evidence.
  • Gateway restart reinitializes in-memory routing/session state, which temporarily restores topic-bound delivery before degradation recurs under load.
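
For reference, the topic-scoped key shape mentioned in the routing chain (`<groupId>:topic:<id>`) can be illustrated with a small sketch. The helper names `topicSessionKey` and `parseTopicSessionKey` are hypothetical and are not OpenClaw's actual functions; the key format is inferred only from the persisted session metadata I observed.

```typescript
// Builds a topic-scoped key in the "<chatId>:topic:<threadId>" shape
// seen in the persisted session metadata. Hypothetical helper, not OpenClaw's API.
function topicSessionKey(chatId: number, threadId: number): string {
  return `${chatId}:topic:${threadId}`;
}

// Parses a key back into its parts; returns null if the key is not topic-scoped.
function parseTopicSessionKey(
  key: string,
): { chatId: number; threadId: number } | null {
  const match = /^(-?\d+):topic:(\d+)$/.exec(key);
  if (!match) return null;
  return { chatId: Number(match[1]), threadId: Number(match[2]) };
}
```

If routing state is keyed this way in memory, a stale or stuck entry for one key would affect only that topic, which matches the topic-local failure described above. That is an inference from the symptoms, not something confirmed from the OpenClaw source.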

Config file / key location

No response

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

Labels

bug (Something isn't working), bug:behavior (Incorrect behavior without a crash)
