Summary
After upgrading from 2026.4.24 to 2026.4.29, per-turn Telegram latency regressed from ~4s to ~15s on a fresh /new session AND on subsequent turns in the same session. Same config, same workspace, same hardware. Rolling back to 2026.4.24 restores ~4s latency.
The regression is per-turn (not just cold-start), and persists with plugins.entries["active-memory"].enabled = false, so the new Active Memory features in 4.29 don't appear to be the direct cause.
Environment
- Host: Mac mini M4, macOS 26.3
- Node: 22.22.2 (gateway), 25.8.2 (CLI)
- OpenClaw versions tested: 2026.4.1 (historical baseline, fast), 2026.4.24 (fast with active-memory disabled), 2026.4.29 (slow)
- Default model: anthropic/claude-opus-4-7
- 6 agents, 18 cron jobs, 4 Telegram bots
- Loopback gateway, single-user setup
- 49 plugins enabled
- RSS baseline: 600–800 MB on 4.29
Reproduction
- On 2026.4.24 with plugins.entries["active-memory"].enabled = false: /new + "say hi" on Telegram completes in ~4s.
- Upgrade to 2026.4.29 via npm install -g openclaw@latest, carrying the same config forward (active-memory still disabled; see the config sketch after these steps).
- Restart gateway: openclaw gateway install --force.
- Wait for ready, then /new + "say hi" on Telegram.
- Observed: ~15s. Second message in same session: also ~15s.
- Roll back to 2026.4.24: latency returns to ~4s.
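For clarity, this is how I'm expressing that flag in config. The nesting below is just my reading of the dotted path plugins.entries["active-memory"].enabled = false expanded into JSON; treat the surrounding structure (and the agents field) as assumptions on my part, not the documented schema:

```jsonc
// Assumed shape only: the dotted path from the repro steps expanded into JSON.
// I have not verified the surrounding file layout against the docs.
{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": false,  // the flag carried across both versions
        "agents": []       // default allowlist, per my source read of 4.24
      }
    }
  }
}
```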
What I ruled out
- Active Memory plugin firing: Verified by reading the plugin source on 4.24. With an empty agents: [] allowlist (the default), isEnabledForAgent returns false on every turn, so the plugin loads but never invokes (see the sketch after this list). Confirmed by zero traces in the gateway logs. Setting enabled: false makes this even more conclusive.
- Cold-start cost only: Second turn in same session is also ~15s, so this is per-turn, not just cold prefill.
- Bootstrap context size: SOUL.md is 10,849 chars, well within the 12,000 cap. Per-turn context is the same as what 4.24 happily handles in 4s.
- Provider cooldown / fallback: Verified usageStats is empty in auth-profiles.json. No errorCount, no disabledUntil. Default model still anthropic/claude-opus-4-7.
- Cron jobs / sessions: All 18 cron jobs status:ok post-upgrade. Session count stable.
- Telegram deps: Reinstalled grammy, @grammyjs/runner, and @grammyjs/transformer-throttler with --legacy-peer-deps post-upgrade, per the documented procedure.
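For reference, the gating behavior described above looks roughly like this. This is a paraphrase of the 4.24 source as I read it, not the actual code; only isEnabledForAgent and the agents allowlist are real names, the rest are placeholders of mine:

```typescript
// Paraphrase of the 4.24 gating logic as I read it -- not the real source.
// Only isEnabledForAgent and the agents allowlist come from the plugin code;
// the config type and hook wrapper below are placeholders.
interface ActiveMemoryConfig {
  enabled: boolean;
  agents: string[]; // allowlist; defaults to []
}

function isEnabledForAgent(config: ActiveMemoryConfig, agentId: string): boolean {
  if (!config.enabled) return false;
  // With the default empty allowlist, no agent ever matches, so the
  // per-turn hook never runs even when the plugin itself is loaded.
  return config.agents.includes(agentId);
}

function runActiveMemoryTurn(config: ActiveMemoryConfig, agentId: string, hook: () => void): void {
  if (!isEnabledForAgent(config, agentId)) return; // loads, but never invokes
  hook();
}
```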
Suspects (not yet isolated)
Features new in 4.25–4.29 that could add per-turn cost:
- Active-run steering by default (4.29 highlight)
- Visible-reply enforcement (4.29)
- Spawned subagent routing metadata (4.29)
- People-aware memory wiki + provenance views (4.29) — this is a separate code path from Active Memory and may have its own per-turn hook
- Per-conversation Active Memory filters (4.27/4.29) — could fire even with the plugin disabled if the filter check itself is per-turn
- Bounded REM preview diagnostics (4.29)
Without per-turn timing instrumentation at INFO log level (which doesn't exist), I can't isolate which subsystem is eating the seconds.
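To make that concrete, this is roughly the shape of instrumentation I mean. Everything here is hypothetical; none of these names exist in OpenClaw as far as I know. It only illustrates the per-phase granularity that would make the regression diagnosable:

```typescript
// Hypothetical sketch of the per-turn timing I'm asking for -- not an existing API.
// One INFO line per phase per turn would be enough to see where the extra ~11s goes.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    console.info(`[turn-timing] ${label}: ${(performance.now() - start).toFixed(0)}ms`);
  }
}

// How a turn pipeline could be wrapped (all names below are placeholders):
// const context = await timed("bootstrap-injection", () => buildBootstrapContext(session));
// const hooks   = await timed("plugin-hooks",        () => runPerTurnHooks(session));
// const reply   = await timed("provider-call",       () => callProvider(model, context));
```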
Asks
- Is per-turn timing observability planned? Provider call duration, bootstrap injection time, and hook handler time at INFO level would make this kind of diagnosis tractable.
- Are any of the 4.29 features I listed enabled by default with their own per-turn LLM calls or synchronous I/O that 4.24 didn't have?
- Is there a config knob to disable the new memory wiki / REM preview / steering features for users who don't want them?
Workaround
Pinning to 2026.4.24 with plugins.entries["active-memory"].enabled = false. Latency is acceptable there.
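For anyone else hitting this, the rollback is just the commands from the repro steps, pinned instead of @latest (assuming a global npm install like mine):

```sh
# Roll back to the last fast release and reinstall/restart the gateway
npm install -g openclaw@2026.4.24
openclaw gateway install --force
```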
Happy to run additional diagnostics if it would help isolate the cause. I have full SSH and CLI access, can capture debug-level logs of a single turn, and can A/B specific config flags.