Summary
After upgrading from 2026.4.24 to 2026.4.29, per-turn Telegram latency regressed from ~4s to ~15s on a fresh /new session AND on subsequent turns in the same session. Same config, same workspace, same hardware. Rolling back to 2026.4.24 restores ~4s latency.
The regression is per-turn (not just cold-start), and persists with plugins.entries["active-memory"].enabled = false, so the new Active Memory features in 4.29 don't appear to be the direct cause.
Environment
- Host: Mac mini M4, macOS 26.3
- Node: 22.22.2 (gateway), 25.8.2 (CLI)
- OpenClaw versions tested: 2026.4.1 (historical baseline, fast), 2026.4.24 (fast with active-memory disabled), 2026.4.29 (slow)
- Default model: anthropic/claude-opus-4-7
- 6 agents, 18 cron jobs, 4 Telegram bots
- Loopback gateway, single-user setup
- 49 plugins enabled
- RSS baseline: 600–800 MB on 4.29
Reproduction
- On 2026.4.24 with plugins.entries["active-memory"].enabled = false: /new + "say hi" on Telegram completes in ~4s.
- Upgrade to 2026.4.29 via npm install -g openclaw@latest, carrying the same config forward (active-memory still disabled; see the config sketch after these steps).
- Restart gateway: openclaw gateway install --force.
- Wait for ready, then /new + "say hi" on Telegram.
- Observed: ~15s. Second message in same session: also ~15s.
- Roll back to 2026.4.24: latency returns to ~4s.
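For clarity, this is how I'm expressing that flag in config. The nesting below is just my reading of the dotted path plugins.entries["active-memory"].enabled = false expanded into JSON; treat the surrounding structure (and the agents field) as assumptions on my part, not the documented schema:

```jsonc
// Assumed shape only: the dotted path from the repro steps expanded into JSON.
// I have not verified the surrounding file layout against the docs.
{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": false,  // the flag carried across both versions
        "agents": []       // default allowlist, per my source read of 4.24
      }
    }
  }
}
```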
What I ruled out
- Active Memory plugin firing: Verified by reading the plugin source on 4.24. With an empty agents: [] allowlist (the default), isEnabledForAgent returns false on every turn, so the plugin loads but never invokes (see the sketch after this list). Confirmed by zero traces in the gateway logs. Setting enabled: false makes this even more conclusive.
- Cold-start cost only: Second turn in same session is also ~15s, so this is per-turn, not just cold prefill.
- Bootstrap context size: SOUL.md is 10,849 chars, well within the 12,000 cap. Per-turn context is the same as what 4.24 happily handles in 4s.
- Provider cooldown / fallback: Verified usageStats is empty in auth-profiles.json. No errorCount, no disabledUntil. Default model still anthropic/claude-opus-4-7.
- Cron jobs / sessions: All 18 cron jobs status:ok post-upgrade. Session count stable.
- Telegram deps: Reinstalled grammy, @grammyjs/runner, and @grammyjs/transformer-throttler with --legacy-peer-deps post-upgrade, per the documented procedure.
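For reference, the gating behavior described above looks roughly like this. This is a paraphrase of the 4.24 source as I read it, not the actual code; only isEnabledForAgent and the agents allowlist are real names, the rest are placeholders of mine:

```typescript
// Paraphrase of the 4.24 gating logic as I read it -- not the real source.
// Only isEnabledForAgent and the agents allowlist come from the plugin code;
// the config type and hook wrapper below are placeholders.
interface ActiveMemoryConfig {
  enabled: boolean;
  agents: string[]; // allowlist; defaults to []
}

function isEnabledForAgent(config: ActiveMemoryConfig, agentId: string): boolean {
  if (!config.enabled) return false;
  // With the default empty allowlist, no agent ever matches, so the
  // per-turn hook never runs even when the plugin itself is loaded.
  return config.agents.includes(agentId);
}

function runActiveMemoryTurn(config: ActiveMemoryConfig, agentId: string, hook: () => void): void {
  if (!isEnabledForAgent(config, agentId)) return; // loads, but never invokes
  hook();
}
```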
Suspects (not yet isolated)
Features new in 4.25–4.29 that could add per-turn cost:
- Active-run steering by default (4.29 highlight)
- Visible-reply enforcement (4.29)
- Spawned subagent routing metadata (4.29)
- People-aware memory wiki + provenance views (4.29) — this is a separate code path from Active Memory and may have its own per-turn hook
- Per-conversation Active Memory filters (4.27/4.29) — could fire even with the plugin disabled if the filter check itself is per-turn
- Bounded REM preview diagnostics (4.29)
Without per-turn timing instrumentation at INFO log level (which doesn't exist), I can't isolate which subsystem is eating the seconds.
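To make that concrete, this is roughly the shape of instrumentation I mean. Everything here is hypothetical; none of these names exist in OpenClaw as far as I know. It only illustrates the per-phase granularity that would make the regression diagnosable:

```typescript
// Hypothetical sketch of the per-turn timing I'm asking for -- not an existing API.
// One INFO line per phase per turn would be enough to see where the extra ~11s goes.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    console.info(`[turn-timing] ${label}: ${(performance.now() - start).toFixed(0)}ms`);
  }
}

// How a turn pipeline could be wrapped (all names below are placeholders):
// const context = await timed("bootstrap-injection", () => buildBootstrapContext(session));
// const hooks   = await timed("plugin-hooks",        () => runPerTurnHooks(session));
// const reply   = await timed("provider-call",       () => callProvider(model, context));
```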
Asks
- Is per-turn timing observability planned? Provider call duration, bootstrap injection time, and hook handler time at INFO level would make this kind of diagnosis tractable.
- Are any of the 4.29 features I listed enabled by default with their own per-turn LLM calls or synchronous I/O that 4.24 didn't have?
- Is there a config knob to disable the new memory wiki / REM preview / steering features for users who don't want them?
Workaround
Pinning to 2026.4.24 with plugins.entries["active-memory"].enabled = false. Latency is acceptable there.
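For anyone else hitting this, the rollback is just the commands from the repro steps, pinned instead of @latest (assuming a global npm install like mine):

```sh
# Roll back to the last fast release and reinstall/restart the gateway
npm install -g openclaw@2026.4.24
openclaw gateway install --force
```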
Happy to run additional diagnostics if it would help isolate the cause. I have full SSH and CLI access, can capture debug-level logs of a single turn, and can A/B specific config flags.