
memory-core: memory_search blocks event loop for 60+ seconds — Discord gateway closes, agents hang #81172

@dev23xyz-oss


Bug Type

Crash (process/app hangs)

Summary

The memory-core plugin's memory_search tool blocks the Node.js event loop for 60+ seconds when processing Discord DMs, causing:

  • Event loop delay spikes to 62,000ms+
  • Discord gateway closes connection
  • Agent becomes unresponsive to all messages
  • No self-recovery — requires restart

Reproducible: 100% on Discord DMs

Environment

  • OpenClaw: 2026.5.7
  • memory-core: 2026.5.7 (bundled)
  • OS: Ubuntu 24.04 (Linux 6.8.0-106-generic)
  • Node.js: v22.x
  • Plugin: @openclaw/memory-core (bundled)

Evidence from Production Logs

```
2026-05-12T20:19:37.124+00:00 [diagnostic] liveness warning: reasons=event_loop_delay interval=80s
eventLoopDelayP99Ms=100.2
eventLoopDelayMaxMs=62746.8
eventLoopUtilization=0.818

work=[active=agent:main:discord:default:direct:1493259950661697657(
processing/tool_call,
q=1,
age=63s
last=tool:memory_search:started
)]
```

Key indicators:

  • Tool: memory_search:started
  • Age: 63 seconds and counting
  • Event loop delay: 62,746.8ms
  • Result: Gateway websocket closed, agent frozen
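For anyone trying to reproduce the `eventLoopDelayMaxMs` figure locally, the sketch below shows one way to measure event loop delay with Node's built-in `monitorEventLoopDelay` histogram. It is not taken from OpenClaw's diagnostic code; `blockFor` and `measureMaxDelayMs` are hypothetical helper names, and the busy-wait only stands in for whatever synchronous work `memory_search` performs.

```typescript
import { monitorEventLoopDelay } from "node:perf_hooks";

// Hypothetical stand-in for synchronous work that never yields to the event loop.
function blockFor(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait on the main thread
  }
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Blocks the loop for `blockMs` and returns the largest delay observed, in milliseconds.
async function measureMaxDelayMs(blockMs: number): Promise<number> {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  await sleep(50); // let the sampling timer start ticking
  blockFor(blockMs); // nothing else can run during this call
  await sleep(50); // give the delayed timer a chance to fire and be recorded
  histogram.disable();
  return histogram.max / 1e6; // the histogram reports nanoseconds
}
```

Calling `measureMaxDelayMs(200)` should report a maximum delay close to the blocking duration, which is the same relationship visible in the log above: a tool call aged 63s alongside a maximum delay of about 62.7s.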

Steps to Reproduce

  1. Enable memory-core plugin (default enabled)
  2. Send Discord DM to agent
  3. Agent begins processing, calls memory_search
  4. Event loop blocks for 60+ seconds
  5. Discord gateway timeout → connection closes
  6. Agent hangs indefinitely

Expected Behavior

`memory_search` should:

  • Complete or time out within a reasonable time (<10s)
  • Not block the event loop
  • Allow Discord gateway to remain responsive
  • Handle embedding failures gracefully

Actual Behavior

| Metric | Value |
| --- | --- |
| Event loop delay | 62,746ms |
| `memory_search` duration | 63+ seconds |
| Discord gateway | Closes connection |
| Agent state | Frozen, requires restart |

Workaround Deployed

Agent Watchdog — Monitors every 2 minutes and auto-restarts hung agents

Related Issues

Same root cause pattern:

Common pattern: Blocking embedded operations on the single-process Node.js event loop starve I/O.

Impact

  • 17 agents affected fleet-wide
  • Discord DMs (primary user interaction) broken
  • Agents appear "frozen", no response
  • Recovery requires watchdog (2 min delay)

Suggested Fix

  1. Make `memory_search` non-blocking (worker threads / child process)
  2. Add configurable timeout (`memory_search.timeoutMs` with 5-10s default)
  3. Fail gracefully (return empty results rather than block)
  4. Async embedding (don't block event loop during vector queries)
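A minimal sketch of how items 1 to 3 could fit together, assuming the expensive part of the search can be moved into a `worker_threads` worker. A timer on the main thread cannot fire while the event loop is blocked, so the timeout only works once the search runs off the main thread. The names `searchWithTimeout`, `workerSource`, and `busyMs` are hypothetical, and the busy loop stands in for the real search, whose implementation is not shown in this issue.

```typescript
import { Worker } from "node:worker_threads";

// Hypothetical worker body: simulates a CPU-bound search, then posts results back.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  const end = Date.now() + workerData.busyMs;
  while (Date.now() < end) {}
  parentPort.postMessage([workerData.query + ":hit"]);
`;

// Runs the search in a worker and resolves with [] if it fails or exceeds timeoutMs.
function searchWithTimeout(
  query: string,
  busyMs: number,
  timeoutMs: number,
): Promise<string[]> {
  return new Promise((resolve) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { query, busyMs },
    });
    let settled = false;

    const finish = (results: string[]): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      void worker.terminate(); // stops the worker even if it is mid-loop
      resolve(results);
    };

    // The main thread stays free, so this timer fires on schedule.
    const timer = setTimeout(() => finish([]), timeoutMs);

    worker.once("message", (results: string[]) => finish(results));
    worker.once("error", () => finish([])); // fail gracefully with empty results
    worker.once("exit", () => finish([])); // worker ended without posting a message
  });
}
```

With this shape, a call such as `searchWithTimeout("query", 60_000, 5_000)` would resolve with an empty array after roughly five seconds while the gateway connection keeps being serviced. Whether empty results or an explicit error is the better failure mode is a design choice for the maintainers.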
