[Bug]: After enabling the gateway, it keeps timing out and reconnecting repeatedly #75944

@yyds-xxxx

Description

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

I upgraded step by step: from the April 23 build to the April 27 build, and then to the April 29 build.

April 23: Responses were fast and the gateway started normally, with no timeout errors and not even minor anomalies.

April 27: The gateway still started normally, but diagnostics kept reporting timeout errors. Core functions were unaffected, but response latency increased by about 5 seconds.

April 29: The gateway starts, but after startup it reports errors repeatedly and reconnects constantly. I sent one message in the evening and went to bed; when I checked the next morning there was still no reply and the system was completely frozen. It freezes whether I send messages through the UI or through the chat client.

Steps to reproduce

I can only describe how the freeze happens. I sent the message "Hello". The gateway appeared responsive but never returned a final result: it produced only a preliminary response and no reply.

Running doctor --fix reported that all issues were fixed, but nothing actually improved and it still freezes completely.

I am not pasting the full logs, only excerpts for reference; I don't know how to resolve this on my own. The core fault is repeated reconnection attempts: even after a reconnection succeeds, the connection times out again and reconnects in a loop.

11:23:58 [diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=136s eventLoopDelayP99Ms=116769.4 eventLoopDelayMaxMs=116769.4 eventLoopUtilization=1 cpuCoreRatio=0.994 active=0 waiting=0 queued=0
[error]: [ '[ws]', 'timeout of 15000ms exceeded' ]
[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]
[error]: [ '[ws]', 'timeout of 15000ms exceeded' ]
[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]
11:23:58 [diagnostic] lane task error: lane=main durationMs=421749 error="CommandLaneTaskTimeoutError: Command lane "main" task timed out after 330000ms"
11:23:58 [diagnostic] lane task error: lane=session:agent:main:feishu:direct:ou_46455b0cca06b766aeef317a259 durationMs=421758 error="CommandLaneTaskTimeoutError: Command lane "main" task timed out after 330000ms"
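
For triage, the liveness warning above can be broken into its key=value fields. The following is a minimal Python sketch, not part of OpenClaw: the helper names `parse_liveness` and `blocked_fraction` are hypothetical, and reading `interval` as seconds and comparing it with `eventLoopDelayMaxMs` is an assumption based only on the field names in this one line.

```python
import re

# Sample line copied from the log excerpt above.
LINE = (
    "11:23:58 [diagnostic] liveness warning: "
    "reasons=event_loop_delay,event_loop_utilization,cpu interval=136s "
    "eventLoopDelayP99Ms=116769.4 eventLoopDelayMaxMs=116769.4 "
    "eventLoopUtilization=1 cpuCoreRatio=0.994 active=0 waiting=0 queued=0"
)


def parse_liveness(line: str) -> dict[str, str]:
    """Return the key=value fields of a liveness warning line as a dict."""
    _, _, payload = line.partition("liveness warning:")
    return dict(re.findall(r"(\w+)=(\S+)", payload))


def blocked_fraction(fields: dict[str, str]) -> float:
    """Ratio of the worst event-loop delay to the reporting interval.

    Assumes `interval` is expressed in seconds with an `s` suffix.
    """
    interval_ms = float(fields["interval"].rstrip("s")) * 1000.0
    return float(fields["eventLoopDelayMaxMs"]) / interval_ms


fields = parse_liveness(LINE)
print(fields["reasons"].split(","))
print(f"{blocked_fraction(fields):.2f}")
```

Under those assumptions, the worst single event-loop stall in this line covers most of the 136 s reporting interval, while active, waiting, and queued are all 0. That would point at the process itself being blocked rather than at a backlog of queued work, though the excerpt alone does not prove it.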

Expected behavior

The gateway should behave as it did in versions 4.22 and 4.23, which had no major issues: responses were fast and no errors were reported during use.

Actual behavior

I sent a message saying "Hello". The gateway gives a preliminary response but never replies afterward. Even a message this simple now gets stuck.

OpenClaw version

2026.4.29

Operating system

Windows 11

Install method

npm

Model

MiniMax-M2.7

Provider / routing chain

OpenClaw -> Local AI Gateway -> MiniMax (Monthly Subscription)

Additional provider/model setup details

No response

Logs, screenshots, and evidence

I can't attach a screenshot, so I am pasting the log output directly.

The output below repeats in a loop, and no reply is ever produced.


11:38:09 [agent/embedded] agent cleanup timed out: runId=02b7b693-511d-4a0a-88ff-ddb1e16ef746 sessionId=ec720b46-a088-45bc-bc4c-c055d683f9c5 step=pi-trajectory-flush timeoutMs=10000
[error]: [ '[ws]', 'timeout of 15000ms exceeded' ]
[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]
[error]: [ '[ws]', 'timeout of 15000ms exceeded' ]
[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]
[error]: [ '[ws]', 'timeout of 15000ms exceeded' ]
[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]
[error]: [ '[ws]', 'timeout of 15000ms exceeded' ]
[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]
[error]: [ '[ws]', 'timeout of 15000ms exceeded' ]
[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]
11:38:09 [ws] ⇄ res ✓ node.list 131869ms conn=e7f04cda…baca id=3ae8db85…4b6b
11:38:10 [agent/embedded] embedded run failover decision: runId=02b7b693-511d-4a0a-88ff-ddb1e16ef746 stage=assistant decision=surface_error reason=timeout from=minimax-portal/MiniMax-M2.7 profile=sha256:9e08bd6be9c1
11:39:21 [plugins] memory-core: managed dreaming cron could not be reconciled (cron service unavailable).
11:43:31 [diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=256s eventLoopDelayP99Ms=244544.7 eventLoopDelayMaxMs=244544.7 eventLoopUtilization=1 cpuCoreRatio=0.997 active=0 waiting=0 queued=0
11:45:03 [agent/embedded] [trace:embedded-run] startup stages: runId=f3a93c00-efeb-4214-aac7-3d0fcc8610c5 sessionId=824eae8e-718c-486b-a13b-72e92af78d85 phase=attempt-dispatch totalMs=206192 stages=workspace:0ms@0ms,runtime-plugins:3ms@3ms,hooks:0ms@3ms,model-resolution:23819ms@23822ms,auth:81464ms@105286ms,context-engine:0ms@105286ms,attempt-dispatch:100906ms@206192ms
[error]: [
  '[ws]',
  'Client network socket disconnected before secure TLS connection was established'
]
[info]: [ 'ws', 'unable to connect to the server after trying 3 times")' ]
[error]: [
  '[ws]',
  'Client network socket disconnected before secure TLS connection was established'
]
[info]: [ 'ws', 'unable to connect to the server after trying 3 times")' ]
[error]: [
  '[ws]',
  'Client network socket disconnected before secure TLS connection was established'
]
[info]: [ 'ws', 'unable to connect to the server after trying 3 times")' ]
[error]: [
  '[ws]',
  'Client network socket disconnected before secure TLS connection was established'
]
[info]: [ 'ws', 'unable to connect to the server after trying 3 times")' ]
[error]: [
  '[ws]',
  'Client network socket disconnected before secure TLS connection was established'
]
[info]: [ 'ws', 'unable to connect to the server after trying 3 times")' ]
11:45:03 [ws] ⇄ res ✓ node.list 88984ms conn=e7f04cda…baca id=e9b9289b…ef97
11:46:58 [tools] agents.main.tools.allow allowlist contains unknown entries (gateway, nodes). These entries are shipped core tools but unavailable in the current runtime/provider/model/config.
[error]: [
  AxiosError: write ECONNABORTED
      at AxiosError.from (C:\Users\Lenovo\.openclaw\extensions\openclaw-lark\node_modules\axios\dist\node\axios.cjs:962:24)
      at RedirectableRequest.handleRequestError (C:\Users\Lenovo\.openclaw\extensions\openclaw-lark\node_modules\axios\dist\node\axios.cjs:3794:29)
      at RedirectableRequest.emit (node:events:508:28)
      at eventHandlers.<computed> (C:\Users\Lenovo\.openclaw\extensions\openclaw-lark\node_modules\follow-redirects\index.js:56:24)
      at ClientRequest.emit (node:events:508:28)
      at emitErrorEvent (node:_http_client:108:11)
      at TLSSocket.socketErrorListener (node:_http_client:575:5)
      at TLSSocket.emit (node:events:508:28)
      at emitErrorNT (node:internal/streams/destroy:170:8)
      at emitErrorCloseNT (node:internal/streams/destroy:129:3)
      at Axios.request (C:\Users\Lenovo\.openclaw\extensions\openclaw-lark\node_modules\axios\dist\node\axios.cjs:5110:41)
      at process.processTicksAndRejections (node:internal/process/task_queues:104:5) {
    isAxiosError: true,
    code: 'ECONNABORTED',
    config: {
      transitional: [Object],
      adapter: [Array],
      transformRequest: [Array],
      transformResponse: [Array],
      timeout: 0,
      xsrfCookieName: 'XSRF-TOKEN',
      xsrfHeaderName: 'X-XSRF-TOKEN',
      maxContentLength: -1,
      maxBodyLength: -1,
      env: [Object],
      validateStatus: [Function: validateStatus],
      headers: [Object [AxiosHeaders]],
      method: 'post',
      url: 'https://open.feishu.cn/open-apis/bot/v1/openclaw_bot/ping',
      data: '{"needBotInfo":true}',
      params: {},
      allowAbsoluteUrls: true
    },
    request: Writable {
      _events: [Object],
      _writableState: [WritableState],
      _maxListeners: undefined,
      _options: [Object],
      _ended: true,
      _ending: true,
      _redirectCount: 0,
      _redirects: [],
      _requestBodyLength: 20,
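
The `startup stages` trace line in the log above already shows where the startup time goes. The following is a minimal Python sketch for ranking the stages, not part of OpenClaw: `parse_stages` is a hypothetical helper, and reading each entry as `name:duration@elapsed` is an assumption, although the durations in this line do add up to the reported totalMs=206192.

```python
# Stage list copied from the `startup stages` trace line in the log above.
STAGES = (
    "workspace:0ms@0ms,runtime-plugins:3ms@3ms,hooks:0ms@3ms,"
    "model-resolution:23819ms@23822ms,auth:81464ms@105286ms,"
    "context-engine:0ms@105286ms,attempt-dispatch:100906ms@206192ms"
)


def parse_stages(stages: str) -> list[tuple[str, int, int]]:
    """Split 'name:DURms@ENDms' entries into (name, duration_ms, elapsed_ms)."""
    parsed = []
    for entry in stages.split(","):
        name, _, timing = entry.rpartition(":")
        duration, _, elapsed = timing.partition("@")
        parsed.append(
            (name, int(duration.removesuffix("ms")), int(elapsed.removesuffix("ms")))
        )
    return parsed


stages = parse_stages(STAGES)
total_ms = stages[-1][2]  # elapsed time at the end of the last stage

# Print the three slowest stages with their share of the total.
for name, duration_ms, _ in sorted(stages, key=lambda s: s[1], reverse=True)[:3]:
    print(f"{name}: {duration_ms} ms ({duration_ms / total_ms:.0%} of startup)")
```

In this trace, attempt-dispatch, auth, and model-resolution together account for essentially all of the 206192 ms, while the local stages (workspace, runtime-plugins, hooks, context-engine) take 3 ms in total.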

Impact and severity

After I send a message, no reply ever arrives, so the core functionality is unusable.

Additional information

In my experience, versions 4.22 and 4.23 are more stable and respond faster than any of the later releases.

Metadata

Labels

bug (Something isn't working)
