[Performance] memory_search cold start ~5-7s — needs module pre-warming #70460

@bthaha123-Carl

Problem Description

The memory_search tool has significant cold-start latency (~5-7 seconds) on its first invocation after the gateway starts. Subsequent calls are fast (~50ms).

Root Cause Analysis:

Each memory_search call goes through a dynamic import chain:

  • createMemorySearchTool
    • loadMemoryToolRuntime() [has module-level Promise cache]
      • tools.runtime-DFHmgPiY.js
        • memory-D1KN22vK.js
          • getMemorySearchManager()
            • MemoryIndexManager.get()
              • manager-runtime.js
                • manager-ChtEZeo0.js [embedding provider loading]
                  • memory-core-host-engine-foundation-CdUASPXi.js
                    • ONNX model / embedding model initialization

While loadMemoryToolRuntime() itself is cached (memoryToolRuntimePromise), the downstream embedding provider initialization (manager-runtime.js → manager-ChtEZeo0.js) has no caching, so the first call after every gateway restart pays the full initialization cost.
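For illustration, the module-level Promise cache pattern that loadMemoryToolRuntime() is described as using could be extended to the embedding provider roughly as follows. This is a minimal sketch, not the actual memory-core code: `EmbeddingProvider`, `initEmbeddingProvider`, `getEmbeddingProvider`, and `initCount` are hypothetical names, and the 50ms delay only simulates model loading.

```typescript
// Hypothetical stand-in for the real embedding provider interface.
type EmbeddingProvider = { embed(text: string): number[] };

// Counts how often the expensive initialization actually runs.
let initCount = 0;

// Hypothetical stand-in for ONNX / embedding model initialization.
async function initEmbeddingProvider(): Promise<EmbeddingProvider> {
  initCount += 1;
  // Simulate slow model loading.
  await new Promise<void>((resolve) => setTimeout(resolve, 50));
  return { embed: (text: string) => [text.length] };
}

// Module-level cache: the Promise is stored, not the resolved value,
// so concurrent callers share one in-flight initialization.
let providerPromise: Promise<EmbeddingProvider> | undefined;

function getEmbeddingProvider(): Promise<EmbeddingProvider> {
  if (!providerPromise) {
    providerPromise = initEmbeddingProvider().catch((err) => {
      // Clear the cache on failure so a later call can retry.
      providerPromise = undefined;
      throw err;
    });
  }
  return providerPromise;
}
```

A cache like this only lives as long as the gateway process, so it removes repeated initialization within one process but does not help the first call after a restart; that case still needs pre-warming.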

Additionally, the default memorySearch config enables hybrid search (0.7 vector + 0.3 FTS), which runs two search pipelines on every query.
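To make the cost of hybrid search concrete, here is a rough sketch of how a 0.7 / 0.3 weighted merge of two result lists might look. The `Scored` type and `mergeHybrid` function are assumptions for illustration only; the actual merge logic in memory-core may differ, and the scores are assumed to be already normalized to a comparable range.

```typescript
// Hypothetical shape of a single search hit.
interface Scored {
  id: string;
  score: number;
}

// Combine vector and full-text (FTS) hits into one ranked list.
// A document found by only one pipeline contributes 0 for the other.
function mergeHybrid(
  vectorHits: Scored[],
  ftsHits: Scored[],
  vectorWeight = 0.7,
  ftsWeight = 0.3,
): Scored[] {
  const combined = new Map<string, number>();
  for (const { id, score } of vectorHits) {
    combined.set(id, (combined.get(id) ?? 0) + vectorWeight * score);
  }
  for (const { id, score } of ftsHits) {
    combined.set(id, (combined.get(id) ?? 0) + ftsWeight * score);
  }
  return [...combined.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Both input lists have to be produced before the merge can run, which is why hybrid mode needs both pipelines initialized on the first query.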

Expected Behavior

Either:

  1. Module pre-warming: memory-core should pre-load its modules and embedding providers on gateway startup, so the first memory_search call is as fast as subsequent ones.
  2. Or: Expose a memory.warmup() or gateway.on('ready') hook that allows plugins/tasks to trigger pre-initialization.

Current Workaround

A periodic cron task that invokes memory_search every 10 minutes has limited effect: isolated sessions run as fresh Node.js processes, while the real bottleneck is module loading inside the gateway process.

Environment

  • OpenClaw version: 2026.4.x
  • Node.js environment (Windows)
  • Memory backend: builtin (SQLite + local embedding)
  • Config: memorySearch.hybrid.enabled = true (default)

Suggested Solutions (Priority Order)

  1. Pre-warm on gateway startup: Add a startup hook in memory-core that calls getMemorySearchManager() with purpose: 'status' to trigger lazy initialization before first real query.
  2. Expose startup event: Provide a gateway.on('ready') or registerStartupTask() API so external plugins can register pre-warming tasks.
  3. Hybrid search opt-out: Document and make it easier to disable hybrid search for users who only need vector search, reducing cold start work by ~40%.
  4. Module-level caching: Cache the result of MemoryIndexManager.get() at the module level so repeated calls within the same gateway process reuse the initialized manager.
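Suggestions 1 and 2 could fit together along these lines. This is only a sketch of the proposed API, which does not exist today: `registerStartupTask`, `runStartupTasks`, and the stubbed `getMemorySearchManager` below are hypothetical, and the real function's signature may differ from the `{ purpose }` option assumed here.

```typescript
type StartupTask = { name: string; run: () => Promise<void> };

const startupTasks: StartupTask[] = [];

// Proposed (hypothetical) registration API for plugins.
function registerStartupTask(name: string, run: () => Promise<void>): void {
  startupTasks.push({ name, run });
}

// Would be called by the gateway once it is ready. Tasks run concurrently;
// a failing task is reported by name and does not block the others.
async function runStartupTasks(): Promise<string[]> {
  const failed: string[] = [];
  await Promise.all(
    startupTasks.map(async (task) => {
      try {
        await task.run();
      } catch {
        failed.push(task.name);
      }
    }),
  );
  return failed;
}

// Stub standing in for the real getMemorySearchManager(); it only records
// which purposes it was called with.
const managerCalls: string[] = [];
async function getMemorySearchManager(options: {
  purpose: "status" | "search";
}): Promise<void> {
  managerCalls.push(options.purpose);
}

// Pre-warm task: trigger lazy initialization before the first real query.
registerStartupTask("memory-prewarm", async () => {
  await getMemorySearchManager({ purpose: "status" });
});
```

Isolating failures matters here because pre-warming is an optimization: if the embedding model fails to load at startup, the gateway should still come up and surface the error on the first real query instead.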

Impact

  • Current cold start: ~5700ms
  • After pre-warm / cached: ~50ms
  • This affects every new gateway session, making the first memory-dependent response significantly slower.
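The cold and warm figures above can be reproduced with a small timing wrapper such as the one below. It is a sketch under stated assumptions: `timeCall`, `fakeMemorySearch`, and `compareColdAndWarm` are hypothetical, and the 200ms delay merely simulates first-call initialization rather than reflecting the measured ~5700ms.

```typescript
// Measure the wall-clock duration of one async call in milliseconds.
async function timeCall<T>(
  fn: () => Promise<T>,
): Promise<{ result: T; ms: number }> {
  const start = Date.now();
  const result = await fn();
  return { result, ms: Date.now() - start };
}

// Hypothetical stand-in for memory_search: slow on first use, fast afterwards.
let warmedUp = false;
async function fakeMemorySearch(query: string): Promise<string[]> {
  if (!warmedUp) {
    // Simulated one-time initialization cost.
    await new Promise<void>((resolve) => setTimeout(resolve, 200));
    warmedUp = true;
  }
  return [`result for ${query}`];
}

// Time the first (cold) call and a second (warm) call back to back.
async function compareColdAndWarm(): Promise<{ coldMs: number; warmMs: number }> {
  const cold = await timeCall(() => fakeMemorySearch("first"));
  const warm = await timeCall(() => fakeMemorySearch("second"));
  return { coldMs: cold.ms, warmMs: warm.ms };
}
```

Running the same measurement before and after a pre-warming change, inside the gateway process itself, would show whether the first-call latency actually drops.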
