Bug type
Behavior bug (latency and retained process memory after completed or timed-out Active Memory preflight)
Beta release blocker
No
Summary
Consolidated replacement for #83773 and original #83752, with the later live profiling evidence folded into the main issue body.
On a live Linux VPS running OpenClaw 2026.5.18 (50a2481), Telegram group-topic turns that trigger Active Memory preflight can sharply increase gateway parent RSS and leave it elevated after the turn completes, even when /readyz is healthy and OpenClaw reports 0 queued · 0 running.
The newest profiling datapoint narrows the retained RSS from a generic gateway memory symptom to a specific retained file-backed local embedding model mapping:
Important: this does not appear to be a local chat-model fallback. The live config showed chat routing as openai/gpt-5.5 primary with google/gemini-2.5-flash fallback. The local GGUF model is used by Memory Search / memory-core as the local embedding backend. Active Memory runs before the normal reply, performs memory search, and that memory-search path loads or touches the local embedding model in the gateway parent process. After the Active Memory timeout, that mapping remains resident while the gateway is otherwise idle.
Steps to reproduce
Run OpenClaw 2026.5.18 as a systemd user gateway with Telegram and Active Memory enabled.
Use a Telegram group topic/session where Active Memory is allowed for group/channel style sessions.
Configure Active Memory with queryMode: "message", timeoutMs: 5000, setupGraceTimeoutMs: 5000, and allowedChatTypes: ["direct", "group", "channel"].
Restart the gateway cleanly and wait for /readyz.
Record gateway parent RSS, RssAnon, RssFile, PSS, child RSS, cgroup memory, and OpenClaw task pressure while idle.
Send /active-memory on in the Telegram topic.
Send one short normal Telegram message in that topic.
Wait for the reply and then leave the gateway idle.
Re-check /readyz, task pressure, parent RSS/RssAnon/RssFile/PSS, child RSS, and top smaps mappings.
Compare against the clean post-restart baseline and/or repeat with Active Memory disabled in the same topic.
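The memory fields named in the steps above can be collected with a small sampler. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` and `/proc/<pid>/smaps_rollup` are readable; the helper names `parse_kb_fields` and `sample_process` are illustrative and not part of OpenClaw.

```python
import re
from pathlib import Path

# Field names as they appear in /proc/<pid>/status and /proc/<pid>/smaps_rollup.
STATUS_FIELDS = ("VmRSS", "RssAnon", "RssFile", "VmData")
ROLLUP_FIELDS = ("Pss",)

# Matches lines of the form "Name:   12345 kB".
_KB_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB$", re.MULTILINE)


def parse_kb_fields(text: str, wanted: tuple[str, ...]) -> dict[str, int]:
    """Return {field: kB} for the requested 'Name: N kB' lines found in text."""
    found = {name: int(kb) for name, kb in _KB_LINE.findall(text)}
    return {name: found[name] for name in wanted if name in found}


def sample_process(pid: int) -> dict[str, int]:
    """Take one memory sample for pid (hypothetical helper; needs read access to /proc)."""
    proc = Path("/proc") / str(pid)
    sample = parse_kb_fields((proc / "status").read_text(), STATUS_FIELDS)
    sample.update(parse_kb_fields((proc / "smaps_rollup").read_text(), ROLLUP_FIELDS))
    return sample
```

Calling `sample_process(gateway_pid)` once at the idle baseline and again after the Telegram turn gives the values to compare; child RSS, cgroup memory, and task pressure are not covered by this sketch.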
Expected behavior
Completed Telegram turns should not leave the gateway retaining hundreds of MB of extra RSS after the system is idle.
If Active Memory times out, it should release/clean up transient recall resources it owns and degrade the reply path without leaving a high retained RSS footprint. If the local memory-search embedding model is intentionally cached, that should be explicit and bounded so operators do not see an unexpected ~300 MB retained mapping after a timed-out Telegram preflight.
Actual behavior
The affected VPS repeatedly showed:
clean post-restart gateway parent RSS around 430-570 MB after settling;
Active Memory Telegram turns increasing parent RSS to around 1.0-1.08 GB;
/readyz healthy and task pressure 0 queued · 0 running while RSS stayed elevated;
a clean restart bringing the gateway back to the lower baseline;
during the newest controlled message / 5s test, the largest retained mapping after the timeout was the local Memory Search embedding GGUF file.
OpenClaw version
OpenClaw 2026.5.18 (50a2481)
Operating system
Ubuntu 24.04.3 LTS, Linux 6.17.0-1011-oracle, aarch64
Evidence: controlled profiling run with local embedding model mapping
No gateway restart, config change, hotfix, or heap snapshot was performed during this capture. A 2s sampler recorded /proc/<gateway-pid>/status, smaps_rollup, child RSS, and cgroup memory from 2026-05-18T22:16:30Z to 2026-05-18T22:22:30Z.
Test sequence:
Sent /active-memory on in the same Telegram group topic.
Sent one short normal Telegram message in that topic.
Waited for the reply and then left the gateway idle.
Relevant journal lines, redacted to behavior and timing:
22:16:45 inbound Telegram group/topic command, 17 chars
22:16:46 outbound send ok
22:16:58 inbound Telegram group/topic message, 58 chars
22:17:00 main embedded agent started
22:17:01 active-memory start timeoutMs=5000 queryChars=58 searchQueryChars=58
22:17:01 active-memory embedded run started
22:17:11 before_prompt_build handler from active-memory failed: timed out after 10000ms
22:17:12 active-memory done status=timeout elapsedMs=10236 summaryChars=0
22:17:40 Telegram sendMessage ok
Sampler summary:
samples: 175
first_ts: 2026-05-18T22:16:30Z
last_ts: 2026-05-18T22:22:30Z
rss_kb_min: 578956 at 22:16:59
rss_kb_max: 1029036 at 22:17:12
rss_kb_last: 997392 at 22:22:30
pss_kb_min: 524332 at 22:16:59
pss_kb_max: 976556 at 22:17:12
pss_kb_last: 942823 at 22:22:30
rss_anon_kb_min: 518232 at 22:16:59
rss_anon_kb_max: 648528 at 22:17:12
rss_anon_kb_last: 616420 at 22:22:30
rss_file_kb_min: 60724 at 22:16:30
rss_file_kb_max: 380972 at 22:17:09
rss_file_kb_last: 380972 at 22:22:30
vmdata_kb_min: 610664 at 22:16:59
vmdata_kb_max: 842720 at 22:17:12
vmdata_kb_last: 809992 at 22:22:30
child_rss_kb_min/max/last: 46032
cgroup_current_bytes_min: 604041216 at 22:16:59
cgroup_current_bytes_max: 879431680 at 22:17:26
cgroup_current_bytes_last: 725692416 at 22:22:30
cgroup_peak_bytes_max/last: 944881664
Other large mappings included [heap] around 59948 kB, /usr/bin/node around 59008 kB, and anonymous blocks. The 314 MB file-backed GGUF mapping was the largest single retained mapping.
Interpretation from this capture: a timed-out Active Memory preflight appears to load or touch the local node-llama-cpp Memory Search embedding model in the gateway parent process. That model mapping remains resident after the Active Memory timeout, after the Telegram reply, and while /readyz is healthy with task pressure idle. This does not by itself prove the final fix, but it narrows the retained-RSS evidence from generic gateway RSS growth to a specific retained file-backed model mapping plus a smaller anonymous-memory increase.
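The split between file-backed and anonymous growth can be checked directly from the sampler summary. This is a small arithmetic sketch using only the minimum and last values reported above; note that the reported minima were not all taken at the same timestamp, so the deltas are approximate.

```python
# (minimum during the run, last sample) in kB, copied from the sampler summary.
SUMMARY_KB = {
    "rss": (578956, 997392),
    "rss_anon": (518232, 616420),
    "rss_file": (60724, 380972),
}


def retained_kb(summary: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Growth still present at the last sample, relative to the run minimum."""
    return {name: last - low for name, (low, last) in summary.items()}


retained = retained_kb(SUMMARY_KB)
# Share of the retained RSS growth that is file-backed.
file_share = retained["rss_file"] / retained["rss"]

if __name__ == "__main__":
    for name, kb in retained.items():
        print(f"{name}: +{kb} kB")
    print(f"file-backed share of retained RSS: {file_share:.1%}")
```

With the reported values, retained RSS is 418436 kB, of which 320248 kB is file-backed and 98188 kB is anonymous, so roughly three quarters of the retained growth is file-backed.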
Evidence: original full-context observation
Before clean restart on 2026.5.18:
RSS: ~1.4-1.6 GB
Memory diagnostic fired: rssBytes=1651253248 heapUsedBytes=498389504 thresholdBytes=1610612736
After clean restart:
~446 MB RSS shortly after ready
~509 MB RSS after ~90s
~570 MB RSS after ~6m45s
~566 MB RSS after ~9m27s
After one Telegram weather ask plus a follow-up log-check turn:
~1,001,404 kB RSS (~978 MiB)
readyz healthy
0 queued / 0 running
gateway process threads: 12
no child processes observed
swap: 0
Full-context timing:
20:25:15.970 inbound Telegram message received
20:25:21.200 embedded agent started (~5.2s after inbound)
20:25:23.381 Active Memory started
20:25:40.285 Active Memory finished: 16.9s, no relevant memory
20:25:44.319 Codex task started
20:26:26.677 wttr.in curl finished in ~80ms
20:26:43.561 final answer generated
20:26:47.728 Telegram sendMessage ok
Total inbound-to-Telegram-send: ~91.8s
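The per-stage latency can be derived from the timestamps above. This is a minimal sketch that assumes all events fall on the same day; the event labels are shortened from the timing lines and the function names are illustrative.

```python
from datetime import datetime

# (event, wall-clock time) pairs copied from the full-context timing lines.
EVENTS = [
    ("inbound", "20:25:15.970"),
    ("agent started", "20:25:21.200"),
    ("Active Memory started", "20:25:23.381"),
    ("Active Memory finished", "20:25:40.285"),
    ("Codex task started", "20:25:44.319"),
    ("wttr.in curl finished", "20:26:26.677"),
    ("final answer generated", "20:26:43.561"),
    ("Telegram sendMessage ok", "20:26:47.728"),
]


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two HH:MM:SS.mmm timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


def stage_durations(events: list[tuple[str, str]]) -> list[tuple[str, float]]:
    """Duration of each consecutive stage, labelled 'from -> to'."""
    return [
        (f"{a} -> {b}", round(seconds_between(ta, tb), 3))
        for (a, ta), (b, tb) in zip(events, events[1:])
    ]


if __name__ == "__main__":
    for label, secs in stage_durations(EVENTS):
        print(f"{secs:8.3f}s  {label}")
    print(f"{seconds_between(EVENTS[0][1], EVENTS[-1][1]):8.3f}s  total")
```

The Active Memory stage comes out at 16.904s and the total at 91.758s, matching the rounded figures in the timing lines.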
Evidence: recent mode still reproduced
After switching from full/contextual/30000ms to recent/balanced/15000ms while keeping group/channel allowed:
After /active-memory off was sent in the same Telegram topic, the follow-up weather-style request in that topic did not show Active Memory hook/log lines:
2026-05-18T21:01:58.576Z inbound Telegram group/topic message, 58 chars
2026-05-18T21:02:00.205Z main embedded agent started
no active-memory start/done lines for this request
2026-05-18T21:02:33.492Z Telegram sendMessage ok
Inbound-to-send was about 34.9s with Active Memory disabled for the topic, versus about 77.7s in the prior message / 5s Active Memory test and about 91.8s before tuning.
Evidence: retained high RSS until restart
At 2026-05-18T21:05:39Z after the Active Memory timeout tests:
After restarting an idle gateway:
immediately after restart: parent RSS 697,624 kB, service peak 667.1M
~90s after restart: parent RSS 483,656 kB, systemd service memory 428.3M, readyz healthy
So the high value was not the normal clean-start baseline on this host. It was retained runtime state after the Telegram/Active Memory tests, and a clean restart brought it back to the 430-480 MB range.
Current-code notes from previous ClawSweeper review
A previous ClawSweeper review on #83773 noted these source-level facts against current main at the time:
Active Memory computes the embedded recall run timeout/watchdog as config.timeoutMs + config.setupGraceTimeoutMs, matching the observed 5000ms + 5000ms path surfacing as a 10000ms timeout.
A hook timeout does not by itself cancel the underlying plugin work: timed-out modifying hooks are logged and skipped, so cleanup must come from Active Memory and embedded-run abort handling.
The prompt-build hook is fail-open; replies continue while latency and RSS are the problem.
Comparing v2026.5.18 to current main showed no Active Memory behavior change that would obviously resolve retained RSS.
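The difference between skipping a timed-out hook and cancelling its work can be illustrated generically. The sketch below uses Python's asyncio purely as an analogy; it is not OpenClaw's Node implementation, and every name in it (`effective_timeout_ms`, `recall`, `skip_on_timeout`, `cancel_on_timeout`) is hypothetical.

```python
import asyncio


def effective_timeout_ms(timeout_ms: int, setup_grace_timeout_ms: int) -> int:
    """Watchdog budget as described in the review notes: the two values are summed."""
    return timeout_ms + setup_grace_timeout_ms


async def recall(state: dict, seconds: float) -> None:
    """Stand-in for recall work that holds a resource until it finishes or is cancelled."""
    state["resource_held"] = True
    try:
        await asyncio.sleep(seconds)
    finally:
        # Runs on normal completion and on cancellation.
        state["resource_held"] = False


async def skip_on_timeout(state: dict, budget: float) -> asyncio.Task:
    """Fail-open: stop waiting after the budget but leave the task running."""
    task = asyncio.create_task(recall(state, budget * 10))
    await asyncio.wait({task}, timeout=budget)  # does not cancel pending tasks
    return task


async def cancel_on_timeout(state: dict, budget: float) -> asyncio.Task:
    """Stop waiting after the budget and cancel the underlying work."""
    task = asyncio.create_task(recall(state, budget * 10))
    try:
        await asyncio.wait_for(task, timeout=budget)  # cancels the task on timeout
    except asyncio.TimeoutError:
        pass
    return task


async def demo() -> tuple[bool, bool]:
    skipped, cancelled = {}, {}
    skipped_task = await skip_on_timeout(skipped, 0.05)
    await cancel_on_timeout(cancelled, 0.05)
    result = (skipped["resource_held"], cancelled["resource_held"])
    skipped_task.cancel()  # tidy up the still-running task before exiting
    await asyncio.gather(skipped_task, return_exceptions=True)
    return result


if __name__ == "__main__":
    print(effective_timeout_ms(5000, 5000))
    print(asyncio.run(demo()))
```

In the skip case the stand-in resource is still held after the caller has moved on, which is the pattern the review notes describe for timed-out hooks; in the cancel case the cleanup in `finally` has already run.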
Impact and severity
Affected: live gateways using Telegram group topics plus Active Memory on persistent conversations, especially where Active Memory is allowed for group/channel sessions and Memory Search uses the local embedding backend.
Severity: Medium. The gateway remained healthy on this VPS because the host has enough RAM, but RSS crossed OpenClaw's own diagnostic threshold before restart and can grow back quickly after user-visible turns.
Frequency: Observed repeatedly as high peaks across multiple recent versions on this VPS. The most recent controlled run reproduced the RSS jump and retained local embedding model mapping with a single Telegram message after /active-memory on.
Consequence: higher steady-state memory footprint, possible memory pressure on smaller hosts, and slow Telegram replies because Active Memory is a blocking pre-reply step.
Related / not duplicate notes
Related open memory issue mentioned by ClawSweeper: #69451 (Memory: Session files loaded entirely into memory via readFileSync). This report has a narrower Telegram + Active Memory + local Memory Search embedding trigger and should not be closed as a duplicate of session-file memory growth without further proof.
Adjacent open PR found during contributor duplicate scan: #73667 (Bound active-memory recall latency and jitter QMD startup). It was draft/conflicting and ClawSweeper flagged a timeout regression and no real behavior proof, so it should not currently be treated as the canonical fix for this report.
What would help validate a fix
A good fix/proof should ideally capture before/after values for:
RSS, PSS, RssAnon, RssFile, heapUsed, external, arrayBuffers, active handles, child RSS, and task pressure before and after idle;
Active Memory enabled vs disabled in the same Telegram topic;
queryMode: message, recent, and ideally full if safe;
whether configured timeoutMs vs setupGraceTimeoutMs behavior is intentional or accidentally doubling the user-visible timeout;
whether timed-out Active Memory recall work is actually cancelled or merely skipped by the hook layer;
whether local Memory Search embedding resources are intentionally cached in the gateway parent and, if so, whether there is a configurable/bounded unload or cache policy;
whether the retained embeddinggemma-300m-qat-Q8_0.gguf mapping returns near the idle baseline after completed/timed-out Active Memory recall runs.
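To check whether the model mapping is still resident after a fix, the per-mapping RSS in `/proc/<pid>/smaps` can be totalled by path. This is a minimal sketch assuming the standard Linux smaps layout (a mapping header line followed by `Rss:` lines); the function names are illustrative.

```python
import re
from collections import defaultdict

# Mapping header: "start-end perms offset dev inode [pathname]".
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+[0-9a-f]+\s+\S+\s+\d+\s*(.*)$")
_RSS = re.compile(r"^Rss:\s+(\d+)\s+kB$")


def rss_by_mapping(smaps_text: str) -> dict[str, int]:
    """Sum resident kB per mapping name; unnamed mappings are grouped as '[anon]'."""
    totals: dict[str, int] = defaultdict(int)
    current = "[anon]"
    for line in smaps_text.splitlines():
        header = _HEADER.match(line)
        if header:
            current = header.group(1).strip() or "[anon]"
            continue
        rss = _RSS.match(line)
        if rss:
            totals[current] += int(rss.group(1))
    return dict(totals)


def top_mappings(smaps_text: str, limit: int = 5) -> list[tuple[str, int]]:
    """Largest mappings by resident size, biggest first."""
    totals = rss_by_mapping(smaps_text)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]
```

Running `top_mappings` on the gateway parent's smaps before the turn, after the timeout, and after a longer idle period would show whether the `.gguf` entry drops back out of the top mappings.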
Install method
System-global npm install:
Model and routing
Normal chat/model routing from the live gateway config:
No local chat/model fallback was found in the checked config.
Memory Search configuration from the same live gateway:
openclaw memory status --deep confirmed the local embedding model used by Memory Search:
For the codex agent, the same local embedding backend was configured, though with no indexed chunks at the time checked:
Enabled plugins and local state size
Active Memory configurations tested
Initial heavy configuration:
{ "agents": ["main"], "allowedChatTypes": ["direct", "group", "channel"], "enabled": true, "logging": true, "maxSummaryChars": 220, "persistTranscripts": false, "promptStyle": "contextual", "queryMode": "full", "setupGraceTimeoutMs": 30000, "timeoutMs": 30000 }
Reduced configuration, still reproduced:
{ "queryMode": "recent", "promptStyle": "balanced", "timeoutMs": 15000, "setupGraceTimeoutMs": 15000, "allowedChatTypes": ["direct", "group", "channel"] }
Lowest-latency controlled Telegram-topic configuration, still reproduced:
{ "queryMode": "message", "timeoutMs": 5000, "setupGraceTimeoutMs": 5000, "allowedChatTypes": ["direct", "group", "channel"], "persistTranscripts": false, "logging": true }
Evidence: message mode with 5s timeout still reproduced
After tuning to queryMode: "message", timeoutMs: 5000, setupGraceTimeoutMs: 5000, with group/channel still allowed, RSS moved from roughly 590-607 MB before this turn to a peak around 1.0-1.07 GB during/after the Active Memory timeout.
Then /active-memory off was sent in the same Telegram topic.