Discord inbound worker repeatedly times out after 1800s while gateway is still running

## Summary

I'm seeing Discord replies fail with:

> Discord inbound worker timed out.

In the service logs this appears as:

```text
2026-04-26T10:04:19+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1497687704866000906, messageId=1497772780744347771)
2026-04-26T12:06:09+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1489624037758861415, messageId=1497803443287625800)
```

The gateway process itself remains active, but Discord inbound work appears to get stuck long enough to hit the 30-minute default timeout. Around the same times I also see gateway and local worker health symptoms: websocket handshake timeouts, subagent announce timeouts, session locks, qmd timeouts, and context overflow recovery.
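To line these timeouts up with the nearby errors, the two log lines above can be parsed into a time window to search in the journal. This is a rough sketch of a helper I would use for that, not anything shipped with OpenClaw: `parseInboundTimeout` and `stuckSince` are hypothetical names, and the regex only assumes the exact line format shown above. It also assumes the timeout clock starts when the worker picks up the message.

```typescript
// Hypothetical helper (not part of OpenClaw): parses the timeout log line
// format shown in this issue.
interface InboundTimeout {
  timestamp: Date;
  timeoutSeconds: number;
  channelId: string;
  messageId: string;
}

const TIMEOUT_LINE =
  /^(\S+) \[discord\] inbound worker timed out after (\d+) seconds \(channelId=(\d+), messageId=(\d+)\)$/;

function parseInboundTimeout(line: string): InboundTimeout | null {
  const match = TIMEOUT_LINE.exec(line.trim());
  if (match === null) return null;
  const [, ts = "", seconds = "", channelId = "", messageId = ""] = match;
  return {
    timestamp: new Date(ts),
    timeoutSeconds: Number(seconds),
    // Discord snowflake IDs exceed Number.MAX_SAFE_INTEGER, so keep them as strings.
    channelId,
    messageId,
  };
}

// Assumption: the worker started `timeoutSeconds` before the timeout was logged.
function stuckSince(t: InboundTimeout): Date {
  return new Date(t.timestamp.getTime() - t.timeoutSeconds * 1000);
}
```

For the first log line above, `stuckSince` gives `2026-04-26T01:34:19.000Z` (09:34:19 at +08:00), which is the start of the window where the related errors would need to appear.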

## What I expected

A Discord inbound message should either complete, fail with the underlying agent/model error, or surface enough diagnostic context to tell which internal worker/session is stuck.

If this timeout is expected behavior, it would help if the user-facing Discord reply included the agent/session/run id or a clearer reason than just `Discord inbound worker timed out.`

## What actually happens

The Discord channel gets a generic timeout reply after 1800 seconds.

Nearby logs show related pressure/errors:

```text
[ws] handshake timeout ... peer=127.0.0.1:...->127.0.0.1:18789
Subagent announce failed: Error: gateway timeout after 10000ms
[session-write-lock] releasing lock held for 66843ms / 71211ms / 97029ms
[memory] qmd embed failed ... timed out after 600000ms
[agent/embedded] [context-overflow-diag] ... Context overflow: estimated context size exceeds safe threshold during tool loop.
```

There were also Discord gateway reconnect/session churn events in the same general window:

```text
[discord] gateway error: Error: socket hang up
[discord] gateway: Gateway websocket closed: 1006
[discord] gateway: Gateway reconnect scheduled ... (invalid-session, resume=false)
```

## System/config context

- OpenClaw: `2026.4.24 (46d2415)`
- OS: LMDE 6 / Debian kernel `6.1.0-44-amd64`
- Node: `v24.15.0`
- npm: `11.12.1`
- Codex CLI: `0.120.0`
- Running as user systemd service: `openclaw-gateway.service`
- Gateway mode: local loopback, port `18789`
- Discord enabled with multiple accounts/bots
- Discord `healthMonitor.enabled=false`
- Discord `threadBindings.enabled=true`
- Discord `threadBindings.spawnSubagentSessions=true`
- Discord `threadBindings.spawnAcpSessions=true`
- Agent defaults:
  - `contextTokens=120000`
  - primary model `openai-codex/gpt-5.5`
  - `timeoutSeconds=3600`
  - `subagents.maxConcurrent=5`
  - `subagents.maxChildrenPerAgent=5`
  - `subagents.announceTimeoutMs=300000`
  - compaction reserve floor `24000`
- Discord inbound worker timeout appears to be using the default `1800000ms` / `1800s`; I did not find an explicit per-account `channels.discord.accounts.<id>.inboundWorker.runTimeoutMs` override in my config.
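For reference, this is how I am reasoning about which timeout applies. It is only a sketch of the lookup I assume happens: the key path and the `1800000ms` default come from my observations above, while the `DiscordConfig` shape and the `resolveRunTimeout` function are hypothetical and may not match the real config schema or code.

```typescript
// Hypothetical config shape. Only the key path
// channels.discord.accounts.<id>.inboundWorker.runTimeoutMs and the
// 1800000ms default are taken from this issue.
interface DiscordConfig {
  accounts?: Record<string, { inboundWorker?: { runTimeoutMs?: number } }>;
}

const DEFAULT_RUN_TIMEOUT_MS = 1_800_000;

type TimeoutSource = "default" | "account-override";

function resolveRunTimeout(
  config: DiscordConfig,
  accountId: string,
): { ms: number; source: TimeoutSource } {
  const override = config.accounts?.[accountId]?.inboundWorker?.runTimeoutMs;
  // Treat missing, non-finite, or non-positive values as "no override".
  if (typeof override === "number" && Number.isFinite(override) && override > 0) {
    return { ms: override, source: "account-override" };
  }
  return { ms: DEFAULT_RUN_TIMEOUT_MS, source: "default" };
}
```

With no `inboundWorker` block on any account, as in my config, this resolves to `1800000` with source `default`, which matches the `1800 seconds` in the logs.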

At the time of inspection the service was still running but using substantial resources:

```text
Tasks: 80
Memory: 6.5G
CPU: 3d+ accumulated
```

## Why I think this might be an OpenClaw issue

The timeout itself is documented/configured, but in practice it seems to be acting as the only visible failure mode for several possible internal stalls:

- queued Discord inbound run stuck behind session locks
- qmd embed/search/update timeouts
- subagent announce timeouts
- local gateway websocket handshake timeouts
- context overflow recovery taking a long time or looping

It would be useful if the inbound worker timeout carried the underlying run/session state into the Discord error reply and logs, or if the queue could cancel/unblock the stuck worker more cleanly before the full 1800s elapses.
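To make the idea concrete, here is a minimal sketch of a timeout wrapper that remembers what the worker was last doing and signals cancellation when the deadline passes. Everything in it is hypothetical (`PhaseTracker`, `InboundTimeoutError`, `runWithTimeout`, and the phase names); I have not checked how the actual inbound worker is structured, so treat it as an illustration of the shape rather than a proposed patch.

```typescript
// Hypothetical sketch, not OpenClaw's implementation.
type Phase = "queued" | "session-lock" | "model" | "tool" | "qmd" | "gateway-connect";

class PhaseTracker {
  private phase: Phase = "queued";
  private since = Date.now();

  // Called by the worker whenever it starts waiting on something new.
  enter(phase: Phase): void {
    this.phase = phase;
    this.since = Date.now();
  }

  snapshot(): { phase: Phase; phaseElapsedMs: number } {
    return { phase: this.phase, phaseElapsedMs: Date.now() - this.since };
  }
}

class InboundTimeoutError extends Error {
  readonly timeoutMs: number;
  readonly phase: Phase;
  readonly phaseElapsedMs: number;

  constructor(timeoutMs: number, phase: Phase, phaseElapsedMs: number) {
    super(`inbound worker timed out after ${timeoutMs}ms while in phase "${phase}"`);
    this.name = "InboundTimeoutError";
    this.timeoutMs = timeoutMs;
    this.phase = phase;
    this.phaseElapsedMs = phaseElapsedMs;
  }
}

async function runWithTimeout<T>(
  timeoutMs: number,
  work: (tracker: PhaseTracker, signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const tracker = new PhaseTracker();
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      const { phase, phaseElapsedMs } = tracker.snapshot();
      controller.abort(); // ask the stuck work to stop instead of leaving it running
      reject(new InboundTimeoutError(timeoutMs, phase, phaseElapsedMs));
    }, timeoutMs);
  });
  try {
    return await Promise.race([work(tracker, controller.signal), timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

The point of the `AbortSignal` is that the stuck run gets a chance to release its session lock and free the queue slot, rather than continuing in the background after the Discord reply has already been sent.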

## Possible improvement

When the inbound worker times out, include something like:

- account id / agent id
- session key
- run id
- queue depth
- whether the agent was waiting on model, tool, qmd, session lock, or gateway connect
- whether the timeout came from default `inboundWorker.runTimeoutMs` or an explicit account override

That would make this much easier to diagnose from Discord without digging through journal logs.
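As an illustration of what such a reply could look like, here is a small sketch that turns the fields listed above into a Discord message. The `TimeoutDiagnostics` type, the `formatTimeoutReply` function, and the field names are all made up for this example; the real internal state may be named or shaped differently.

```typescript
// Hypothetical diagnostics payload; field names are illustrative only.
interface TimeoutDiagnostics {
  accountId: string;
  agentId: string;
  sessionKey: string;
  runId: string;
  queueDepth: number;
  waitingOn: "model" | "tool" | "qmd" | "session-lock" | "gateway-connect" | "unknown";
  timeoutMs: number;
  timeoutSource: "default" | "account-override";
}

function formatTimeoutReply(d: TimeoutDiagnostics): string {
  const seconds = Math.round(d.timeoutMs / 1000);
  return [
    `Discord inbound worker timed out after ${seconds}s (${d.timeoutSource} inboundWorker.runTimeoutMs).`,
    `account=${d.accountId} agent=${d.agentId} session=${d.sessionKey} run=${d.runId}`,
    `queueDepth=${d.queueDepth} waitingOn=${d.waitingOn}`,
  ].join("\n");
}
```

Even just the first and last lines of that output would have told me whether I was looking at a lock problem, a qmd problem, or a model stall.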


Discord inbound worker repeatedly times out after 1800s while gateway is still running #71948
