[Bug]: memory-openviking plugin before_agent_start auto-recall hangs indefinitely, causing total silent agent failure #673

@VideoScape

Bug Description

The before_agent_start hook in the openclaw-memory-plugin calls OpenViking's search API for auto-recall before every agent run. The plugin defines autoRecallTimeoutMs = 5_000 and imports withTimeout from process-manager.ts, but neither is used to wrap the recall operation. If the search call hangs for any reason, the entire gateway agent pipeline blocks indefinitely with no timeout, no error, and no fallback.

Steps to Reproduce

1. Install and configure memory-openviking in local mode with `autoRecall: true`.
2. Start the OpenClaw gateway with the plugin loaded.
3. Wait until OpenViking has started successfully (port 1933 up, health check passing).
4. Send a message via the Control UI or `openclaw gateway call chat.send`.
5. Observe that `chat.send` returns `status: started` but no response is ever produced.

The hang is reliably triggered when OpenViking's Node.js client connection to the Python subprocess is in a transient state — for example, shortly after the gateway starts while OpenViking is still initialising, or after a gateway restart.

Expected Behavior

If the auto-recall search call does not complete within autoRecallTimeoutMs (5 seconds), the hook should time out gracefully, log a warning, and allow the agent to start without memory context.

Actual Behavior

The before_agent_start hook never completes. The gateway logs show execution stopping at [hooks] running before_agent_start (1 handlers, sequential) with no further output. No transcript is written, no API call is made, and no response is ever sent to the user. The failure is completely silent at normal log levels — --log-level debug is required to even observe where execution stops.

Minimal Reproducible Example

The fix is a one-block change in `examples/openclaw-memory-plugin/index.ts`. In the `before_agent_start` handler, the autoRecall block currently looks like:

```typescript
if (cfg.autoRecall && queryText.length >= 5) {
  try {
    const precheck = await this.precheck();
    // ... recall logic ...
  } catch (err) {
    api.logger.warn(...);
  }
}
```

The try/catch only catches thrown errors, not hangs. The fix wraps the entire block with the already-defined `withTimeout`:
```typescript
if (cfg.autoRecall && queryText.length >= 5) {
  try {
    await withTimeout(
      (async () => {
        const precheck = await this.precheck();
        // ... existing recall logic unchanged ...
      })(),
      autoRecallTimeoutMs,
      `memory-openviking: auto-recall timed out after ${autoRecallTimeoutMs}ms`,
    );
  } catch (err) {
    api.logger.warn(`memory-openviking: auto-recall failed or timed out: ${String(err)}`);
  }
}
```


`withTimeout` and `autoRecallTimeoutMs` are already defined in the file but never wired up to the recall path.
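
For readers unfamiliar with the pattern, here is a minimal sketch of how a helper with the same call shape as `withTimeout(promise, ms, message)` could work. This is an assumption for illustration only: the actual implementation in `process-manager.ts` is not shown in this issue and may differ. The never-settling promise `hung` and the `demo` function are hypothetical stand-ins for a stuck search call and the hook's catch path.

```typescript
// Hypothetical sketch; the real withTimeout in process-manager.ts may differ.
function withTimeout<T>(
  promise: Promise<T>,
  timeoutMs: number,
  message: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), timeoutMs);
  });
  // Whichever promise settles first wins. The timer is always cleared
  // so a fast result does not leave a pending timeout behind.
  return Promise.race([promise, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// A promise that never settles stands in for a hung search call.
const hung = new Promise<string>(() => {});

async function demo(): Promise<string> {
  try {
    return await withTimeout(hung, 50, "auto-recall timed out after 50ms");
  } catch (err) {
    // The hook would log a warning here and continue without memory context.
    return `skipped: ${String(err)}`;
  }
}
```

Without the wrapper, `await hung` would never reach the `catch` block, which matches the silent hang described above. With it, the rejection arrives after the deadline and the existing `catch` handles it like any other recall failure.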

---


### Error Logs

No errors are logged at normal log levels. With `--log-level debug`, the gateway stops after:

```shell
[hooks] running before_agent_start (1 handlers, sequential)
```

No further output. The run never completes.

OpenViking Version

0.2.6 (PyPI), openclaw-memory-plugin from main branch

Python Version

3.13

Operating System

Linux

Model Backend

Other

Additional Context

Related to #527 (HTTP endpoint hangs causing AbortError in auto-capture), but distinct — this affects the before_agent_start hook rather than auto-capture, and the consequence is total agent failure rather than just missing memory saves. The two bugs share the same root pattern (no timeout on OpenViking client calls) but need separate fixes.
Impact severity: critical — any transient connection issue between the Node.js plugin and the Python OpenViking subprocess results in complete, permanent, undiagnosable agent silence until the gateway is restarted.
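
One caveat on the shared root pattern: a race-based timeout unblocks the caller but does not cancel the underlying request, which stays pending in the background. A possible complement, sketched below, is to give each client call a deadline through an `AbortSignal`. This assumes the client call can accept and honor a signal (for example, `fetch` takes one via its `signal` option); whether the plugin's OpenViking client supports this is not stated in this issue. `callWithDeadline` and `fakeSearch` are hypothetical names used only for illustration.

```typescript
// Hypothetical helper: runs a cancellable call and aborts it after timeoutMs.
async function callWithDeadline<T>(
  call: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // The call must honor the signal; otherwise this still hangs.
    return await call(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}

// Stand-in for a hung search request that only settles when aborted.
function fakeSearch(signal: AbortSignal): Promise<string[]> {
  return new Promise((_, reject) => {
    signal.addEventListener(
      "abort",
      () => reject(new Error("search aborted")),
      { once: true },
    );
  });
}
```

Because this only helps when the callee honors the signal, it would complement rather than replace the `withTimeout` wrapper proposed above.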

EDIT: also created the same issue for openclaw openclaw/openclaw/issues/48534
