[Bug]: gateway RSS grows sharply after Telegram turns using Active Memory full-context preflight #83752

@brokemac79

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

On a live 2026.5.18 Linux VPS, gateway RSS dropped from ~1.4-1.6 GB to ~450-570 MB after a clean restart, then climbed back to ~1.0 GB after a small number of Telegram group-topic turns that ran Active Memory full-context preflight.

Steps to reproduce

  1. Run OpenClaw 2026.5.18 as a systemd user gateway with Telegram and Active Memory enabled.
  2. Configure Active Memory for main with allowedChatTypes: ["direct", "group", "channel"], queryMode: "full", promptStyle: "contextual", and timeoutMs: 30000.
  3. Restart the gateway cleanly and wait for /readyz.
  4. Record parent gateway RSS shortly after restart and after a few minutes idle.
  5. Send a simple Telegram group-topic request, for example a weather request, and wait for the reply.
  6. Send/trigger one follow-up diagnostic/log-check turn in the same Telegram topic.
  7. Record parent gateway RSS again when tasks list reports 0 queued · 0 running and /readyz is healthy.
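The RSS readings in steps 4 and 7 can be taken from /proc/<pid>/status on Linux. The following is a minimal sketch of one way to do that, not the tooling used for the samples in this report; the helper names parse_status and sample are hypothetical. Note that Pss is not present in /proc/<pid>/status (it comes from /proc/<pid>/smaps_rollup), so it is not collected here.

```python
import re
from pathlib import Path

# Fields of interest in /proc/<pid>/status. Memory fields are reported in kB.
FIELDS = ("VmRSS", "RssAnon", "RssFile", "VmSwap", "Threads")


def parse_status(text: str) -> dict[str, int]:
    """Extract the selected numeric fields from /proc/<pid>/status text."""
    out: dict[str, int] = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)", line)
        if m and m.group(1) in FIELDS:
            out[m.group(1)] = int(m.group(2))
    return out


def sample(pid: int) -> dict[str, int]:
    """Read and parse the status file for a live process (Linux only)."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())
```

Calling sample(pid) with the gateway PID once after /readyz and again after the Telegram turns gives two dictionaries that can be compared field by field.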

Expected behavior

After short Telegram turns complete and no background work is queued/running, the gateway should return close to its clean post-restart baseline or otherwise remain well below the RSS warning threshold.

Active Memory may increase latency and transient memory while it runs, especially in full query mode, but completed turns should not leave the parent gateway retaining hundreds of MB of additional RSS after the system is idle again.

Actual behavior

RSS dropped sharply after a clean restart, then climbed again after the Telegram turns and did not immediately return to the clean baseline while the gateway was otherwise healthy and idle.

Observed sequence:

Before clean restart on 2026.5.18:
RSS: ~1.4-1.6 GB
Memory diagnostic fired: rssBytes=1651253248 heapUsedBytes=498389504 thresholdBytes=1610612736

After clean restart:
~446 MB RSS shortly after ready
~509 MB RSS after ~90s
~570 MB RSS after ~6m45s
~566 MB RSS after ~9m27s

After one Telegram weather ask plus a follow-up log-check turn:
~1,001,404 kB RSS (~978 MiB)
readyz healthy
0 queued / 0 running
gateway process threads: 12
no child processes observed
swap: 0
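
The figures above mix bytes (the diagnostic line) and kB (the process samples). As a quick cross-check, the sketch below converts them to MiB, assuming the kB values are 1024-byte units as /proc reports them. The function names are illustrative only.

```python
MIB = 1024 * 1024

rss_bytes = 1_651_253_248        # rssBytes from the memory diagnostic
threshold_bytes = 1_610_612_736  # thresholdBytes from the same diagnostic
idle_rss_kb = 1_001_404          # RSS after the two Telegram turns, in kB


def bytes_to_mib(n: int) -> float:
    return n / MIB


def kb_to_mib(n: int) -> float:
    return n / 1024


print(f"diagnostic RSS:  {bytes_to_mib(rss_bytes):.1f} MiB")                    # 1574.8 MiB
print(f"threshold:       {bytes_to_mib(threshold_bytes):.1f} MiB")              # 1536.0 MiB
print(f"over threshold:  {bytes_to_mib(rss_bytes - threshold_bytes):.1f} MiB")  # 38.8 MiB
print(f"idle after turns: {kb_to_mib(idle_rss_kb):.1f} MiB")                    # 977.9 MiB
```

By this arithmetic the threshold is exactly 1.5 GiB, and the idle post-turn RSS is already about 64% of it.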

The same weather turn was also slow for a simple request. Active Memory added ~16.9s before the main model/tool work began.
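
To put the 16.9s in context, the sketch below breaks the weather turn into phases using the timestamps from the gateway log excerpt in the evidence section. The event labels are copied from that excerpt; the helper names are hypothetical, and the phases are simply the gaps between consecutive log lines, which may not map one-to-one to internal stages.

```python
from datetime import datetime

# (timestamp, event) pairs from the Telegram weather turn timing log.
EVENTS = [
    ("20:25:15.970", "inbound Telegram message received"),
    ("20:25:21.200", "embedded agent started"),
    ("20:25:23.381", "Active Memory started"),
    ("20:25:40.285", "Active Memory finished"),
    ("20:25:44.319", "Codex task started"),
    ("20:26:26.677", "wttr.in curl finished"),
    ("20:26:43.561", "final answer generated"),
    ("20:26:47.728", "Telegram sendMessage ok"),
]


def parse(ts: str) -> datetime:
    return datetime.strptime(ts, "%H:%M:%S.%f")


def phase_durations(events):
    """Seconds elapsed between each consecutive pair of events."""
    times = [parse(ts) for ts, _ in events]
    return [
        (events[i + 1][1], (times[i + 1] - times[i]).total_seconds())
        for i in range(len(events) - 1)
    ]


phases = phase_durations(EVENTS)
total = sum(seconds for _, seconds in phases)
for label, seconds in phases:
    print(f"{seconds:7.3f}s  {seconds / total:6.1%}  until {label}")
print(f"{total:7.3f}s  total")
```

On these timestamps Active Memory accounts for roughly 18% of the ~91.8s turn; the largest single gap is the ~42s between the Codex task starting and the wttr.in curl finishing.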

OpenClaw version

OpenClaw 2026.5.18 (50a2481)

Operating system

Ubuntu 24.04.3 LTS, Linux 6.17.0-1011-oracle, aarch64

Install method

System-global npm install:

node=v22.22.0
npm=10.9.4
install root=/usr/lib/node_modules/openclaw
service=/home/ubuntu/.config/systemd/user/openclaw-gateway.service
ExecStart=/usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789

Model

openai/gpt-5.5

Provider / routing chain

OpenClaw gateway -> OpenAI Codex/ChatGPT auth, effective Codex harness for agent turns.

Additional provider/model setup details

Active Memory config from the affected gateway:

{
  "enabled": true,
  "config": {
    "agents": ["main"],
    "allowedChatTypes": ["direct", "group", "channel"],
    "enabled": true,
    "logging": true,
    "maxSummaryChars": 220,
    "persistTranscripts": false,
    "promptStyle": "contextual",
    "queryMode": "full",
    "setupGraceTimeoutMs": 30000,
    "timeoutMs": 30000
  }
}

Enabled plugin summary:

plugins list: 92 known, 11 enabled/loaded, 0 errors
enabled: active-memory, anthropic, file-transfer, google, memory-core, memory-wiki, ollama, openai, telegram, tokenjuice, codex
codex plugin source: ~/.openclaw/npm/node_modules/@openclaw/codex
codex plugin version: 2026.5.6

State/session sizes on the affected VPS:

~/.openclaw total: 7.7G
~/.openclaw/agents/main/sessions: 376M, 687 jsonl files
~/.openclaw/agents/codex/sessions: 368K, 10 jsonl files
~/.openclaw/agents/main/agent/codex-home/logs_2.sqlite: 714,862,592 bytes

Logs, screenshots, and evidence

# Pre-clean-restart sample, same 2026.5.18 process after upgrade/live use
PID     ELAPSED   RSS     VSZ      %MEM %CPU CMD
1185360 17:15     1413876 45429300 5.7  13.5 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
VmRSS:      1413876 kB
RssAnon:     718128 kB
RssFile:     695748 kB
Pss:        1044683 kB

readyz: {"ready":true,"failing":[]}
# Memory diagnostic from the same process before clean restart
2026-05-18T19:54:12.160+00:00 [diagnostics/memory] memory pressure: level=warning reason=rss_threshold rssBytes=1651253248 heapUsedBytes=498389504 thresholdBytes=1610612736
# After clean systemctl --user restart openclaw-gateway
# 15s after ready
PID     ELAPSED RSS    VSZ      %MEM %CPU CMD
1210550 00:31   446128 43737992 1.8  54.3 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
VmRSS:   446128 kB
RssAnon: 385516 kB
RssFile:  60612 kB

# ~90s after ready
PID     ELAPSED RSS    VSZ      %MEM %CPU CMD
1210550 01:31   509024 43784600 2.0  19.9 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
VmRSS:   509024 kB
RssAnon: 448412 kB
RssFile:  60612 kB

# ~6m45s after restart
PID     ELAPSED RSS    VSZ      %MEM %CPU CMD
1210550 06:45   569624 43801564 2.3  5.3 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
VmRSS:   569624 kB
RssAnon: 509012 kB
RssFile:  60612 kB
readyz: {"ready":true,"failing":[]}
task pressure: 0 queued · 0 running
# Telegram weather turn timing
20:25:15.970 inbound Telegram message received
20:25:21.200 embedded agent started (~5.2s after inbound)
20:25:23.381 Active Memory started
20:25:40.285 Active Memory finished: 16.9s, no relevant memory
20:25:44.319 Codex task started
20:26:26.677 wttr.in curl finished in ~80ms
20:26:43.561 final answer generated
20:26:47.728 Telegram sendMessage ok
Total inbound-to-Telegram-send: ~91.8s
# Active Memory timings from recent gateway logs
19:26:10 start -> 19:26:27 done elapsedMs=17911 status=ok
19:30:06 start -> 19:30:21 done elapsedMs=15327 status=no_relevant_memory
19:52:40 start -> 19:52:57 done elapsedMs=16719 status=no_relevant_memory
19:53:31 start -> 19:53:49 done elapsedMs=17464 status=ok
20:25:23 start -> 20:25:40 done elapsedMs=16905 status=no_relevant_memory
20:27:46 start -> 20:28:00 done elapsedMs=13440 status=ok
# After Telegram weather turn + follow-up log-check turn
timestamp=2026-05-18T20:33:52Z
PID     ELAPSED RSS     VSZ      %MEM %CPU CMD
1210550 30:28   1001404 44288252 4.0  4.2 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
VmRSS:      1001404 kB
RssAnon:     620480 kB
RssFile:     380924 kB
Pss:         946869 kB
VmSwap:           0 kB
Threads:         12
readyz: {"ready":true,"failing":[]}
task pressure: 0 queued · 0 running
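
As a rough breakdown of where the growth sits, the sketch below diffs the ~6m45s sample against the post-turn sample from the logs above (same PID 1210550). It only does arithmetic on the reported numbers; it does not identify which allocations or mapped files are responsible.

```python
# Values in kB, copied from the two samples of PID 1210550 above.
baseline = {"VmRSS": 569_624, "RssAnon": 509_012, "RssFile": 60_612}    # ~6m45s after restart
after = {"VmRSS": 1_001_404, "RssAnon": 620_480, "RssFile": 380_924}    # after the two Telegram turns

delta = {key: after[key] - baseline[key] for key in baseline}
file_share = delta["RssFile"] / delta["VmRSS"]

for key, value in delta.items():
    print(f"{key:8s} +{value:>7d} kB  (+{value / 1024:.0f} MiB)")
print(f"file-backed share of growth: {file_share:.1%}")
```

By these two samples, about three quarters of the ~422 MiB increase is in RssFile (file-backed resident pages) rather than RssAnon, which may help narrow where to look.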

Historical context from this VPS shows high gateway memory peaks existed before 2026.5.18, so this is not clearly a new 5.18-only regression:

2026.5.16-beta.3: 3.5G peak before the 5.18 upgrade; another run hit 6.5G
2026.5.14-beta.1: 5.0G peak
2026.5.12: up to 7.8G peak
2026.5.6: 3.6G peak
2026.5.3: up to 6.6G peak

Impact and severity

Affected: heavy live gateways using Telegram group topics plus Active Memory on persistent conversations, especially with queryMode: "full" and large existing session/log state.

Severity: Medium. The gateway remained healthy on this VPS because the host has plenty of RAM, but RSS crossed OpenClaw's own diagnostic threshold before restart and can grow back quickly after user-visible turns.

Frequency: Observed repeatedly as high peaks across multiple recent versions on this VPS. The exact minimal trigger has not yet been identified.

Consequence: Higher steady-state memory footprint, possible memory pressure on smaller hosts, and slow Telegram replies because Active Memory is a blocking pre-reply step.

Additional information

OpenClaw's Active Memory docs describe the safe/default setup as allowedChatTypes: ["direct"], queryMode: "recent", promptStyle: "balanced", and timeoutMs: 15000; this VPS intentionally uses a heavier setup for Telegram group/channel sessions with queryMode: "full" and timeoutMs: 30000.
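
For reference, the sketch below compares the six Active Memory elapsedMs values from the gateway logs above against both timeout values. It assumes, without confirmation from the source, that timeoutMs bounds the same interval that elapsedMs measures.

```python
from statistics import mean

# elapsedMs values from the "Active Memory timings" log excerpt.
elapsed_ms = [17911, 15327, 16719, 17464, 16905, 13440]

CONFIGURED_TIMEOUT_MS = 30_000    # timeoutMs on this VPS
DOCS_DEFAULT_TIMEOUT_MS = 15_000  # timeoutMs in the documented default setup


def count_over(samples, limit_ms):
    """Number of samples that exceed the given limit."""
    return sum(1 for ms in samples if ms > limit_ms)


print(f"mean elapsed: {mean(elapsed_ms):.0f} ms")
print(f"over {CONFIGURED_TIMEOUT_MS} ms: {count_over(elapsed_ms, CONFIGURED_TIMEOUT_MS)} of {len(elapsed_ms)}")
print(f"over {DOCS_DEFAULT_TIMEOUT_MS} ms: {count_over(elapsed_ms, DOCS_DEFAULT_TIMEOUT_MS)} of {len(elapsed_ms)}")
```

Under that assumption, none of the six runs came near the configured 30000 ms limit, while five of the six took longer than 15000 ms.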

This issue is not claiming the clean 2026.5.18 startup baseline is too high. After a clean restart, the parent gateway process was about 450-570 MB RSS. The concern is retained/accumulated RSS during live Telegram + Active Memory use.

Labels

  - P2 - Normal backlog priority with limited blast radius.
  - clawsweeper:needs-live-repro - ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
  - clawsweeper:needs-maintainer-review - ClawSweeper marked this issue as needing maintainer review before automation.
  - clawsweeper:no-new-fix-pr - ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  - impact:crash-loop - Crash, hang, restart loop, or process-level availability failure.
  - impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  - issue-rating: 🐚 platinum hermit - Good issue quality with a plausible reproduction path needing some confirmation.