memory.qmd.update.embedTimeoutMs default (120 s) is too low for local GGUF; error message doesn't surface the fix #74204

## Environment

- **OpenClaw version:** 2026.4.25
- **QMD version:** 2.1.0 (bab86d5)
- **Platform:** Ubuntu 24.04, GCP e2-standard-4 (4 vCPU, 16 GB RAM)
- **Workspace size:** 37+ Markdown files in memory root + `memory/` tree
- **GGUF model:** default (`embeddinggemma-300m-qat-Q8_0.gguf`, ~0.6 GB, auto-downloaded)
- **searchMode:** `query` (hybrid BM25 + vector)

## Observed behavior

After enabling hybrid search (`searchMode: "query"`), the gateway emits warnings like these every 2–4 minutes:

```
{"subsystem":"memory","message":"qmd embed failed (boot): Error: qmd embed timed out after 120000ms; backing off for 60s"}
{"subsystem":"memory","message":"qmd embed failed (interval): Error: qmd embed timed out after 120000ms; backing off for 120s"}
```

The embed process runs at 100–186% CPU on 4 vCPU for ~2 minutes before being killed. Peak RAM during GGUF model load: **9.6 GB**. The embed never completes — every attempt times out at exactly 120 s. Vector search is effectively disabled.
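
To confirm that every attempt fails at the same limit, the gateway log can be scanned for these warnings. The following is a minimal sketch, assuming the log is newline-delimited JSON with the `subsystem` and `message` fields shown above; the function name `parse_embed_timeouts` and the regular expression are illustrative and not part of OpenClaw or QMD.

```python
import json
import re

# The two warnings quoted above, verbatim.
LOG_LINES = [
    '{"subsystem":"memory","message":"qmd embed failed (boot): Error: qmd embed timed out after 120000ms; backing off for 60s"}',
    '{"subsystem":"memory","message":"qmd embed failed (interval): Error: qmd embed timed out after 120000ms; backing off for 120s"}',
]

# Hypothetical pattern derived only from the message text shown in this issue.
PATTERN = re.compile(
    r"qmd embed failed \((?P<phase>\w+)\): .*timed out after (?P<timeout_ms>\d+)ms; "
    r"backing off for (?P<backoff_s>\d+)s"
)


def parse_embed_timeouts(lines):
    """Return one dict per embed-timeout warning found in JSON log lines."""
    events = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(record, dict) or record.get("subsystem") != "memory":
            continue
        match = PATTERN.search(str(record.get("message", "")))
        if match:
            events.append(
                {
                    "phase": match.group("phase"),
                    "timeout_ms": int(match.group("timeout_ms")),
                    "backoff_s": int(match.group("backoff_s")),
                }
            )
    return events


events = parse_embed_timeouts(LOG_LINES)
distinct_timeouts = {event["timeout_ms"] for event in events}
```

If `distinct_timeouts` contains a single value equal to the configured limit, the embed is being cut off by the timeout rather than failing for some other reason.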

## Root cause

The GGUF embedding model takes **3–4 minutes** on a 4-core CPU to embed a 37-file workspace. The default `memory.qmd.update.embedTimeoutMs` is **120 s** — less than the actual embed duration.

The fix (`memory.qmd.update.embedTimeoutMs: 600000`) is only discoverable via `openclaw config schema` or the memory config reference page. The error message does not mention it.
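
The gap between the default and the observed duration can be made concrete with a rough estimate. This sketch assumes embed time scales linearly with file count, which this issue does not verify, and uses only the one data point reported here (37 files in 3–4 minutes). The helper names and the 2x headroom factor are hypothetical.

```python
import math

OBSERVED_FILES = 37
OBSERVED_SECONDS = (180, 240)  # 3-4 minutes, as reported in this issue
DEFAULT_TIMEOUT_MS = 120_000


def estimate_embed_seconds(file_count):
    """Scale the observed range linearly to another workspace size (assumption)."""
    low, high = OBSERVED_SECONDS
    return (low * file_count / OBSERVED_FILES, high * file_count / OBSERVED_FILES)


def suggest_timeout_ms(file_count, headroom=2.0):
    """Worst-case estimate times a safety factor, in milliseconds."""
    _, worst = estimate_embed_seconds(file_count)
    return math.ceil(worst * headroom * 1000)


low, high = estimate_embed_seconds(37)
default_is_sufficient = DEFAULT_TIMEOUT_MS >= low * 1000
suggested = suggest_timeout_ms(37)
```

For the 37-file workspace, even the optimistic estimate (180 s) exceeds the 120 s default, while a 2x margin over the pessimistic estimate (240 s) gives 480 000 ms, which fits under the 600 000 ms value used in the workaround.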

## Inconsistency with built-in engine

`agents.defaults.memorySearch.sync.embeddingBatchTimeoutSeconds` correctly differentiates:
- **600 s** for local/self-hosted providers (`local`, `ollama`, `lmstudio`)
- **120 s** for hosted providers (OpenAI, Gemini, etc.)

`memory.qmd.update.embedTimeoutMs` applies the same 120 s default regardless of whether the embedding model is a local GGUF or a hosted API. For local GGUF workloads, 120 s is consistently insufficient on commodity hardware.
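
A provider-aware default for QMD could mirror the built-in engine's split. The sketch below is illustrative only: `default_embed_timeout_ms` and the constants are hypothetical names, and the provider list is the one quoted above for `embeddingBatchTimeoutSeconds`. It does not describe how OpenClaw resolves defaults today.

```python
# Providers the built-in engine treats as local/self-hosted, per this issue.
LOCAL_PROVIDERS = {"local", "ollama", "lmstudio"}

LOCAL_TIMEOUT_MS = 600_000   # 600 s, matching the built-in local default
HOSTED_TIMEOUT_MS = 120_000  # 120 s, matching the built-in hosted default


def default_embed_timeout_ms(provider, configured_ms=None):
    """Pick an embed timeout: an explicit setting wins, else a provider-based default."""
    if configured_ms is not None:
        return configured_ms
    if provider.lower() in LOCAL_PROVIDERS:
        return LOCAL_TIMEOUT_MS
    return HOSTED_TIMEOUT_MS
```

With this shape, a local GGUF setup would get 600 s without any configuration, and an explicit `embedTimeoutMs` would still override the default.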

## Suggested fixes (pick one or combine)

1. **Increase the default** for `embedTimeoutMs` to 600 s (matching the built-in engine's local-provider default), or detect when `searchMode` is `vsearch`/`query` with the default GGUF model and apply a longer default automatically.

2. **Improve the error message** to mention the fix:
   ```
   qmd embed timed out after 120000ms — consider increasing memory.qmd.update.embedTimeoutMs (current default: 120000)
   ```

3. **Add to the QMD troubleshooting docs** a dedicated entry:
   > **`qmd embed timed out after 120000ms`?** Increase `memory.qmd.update.embedTimeoutMs`. Default is 120 000 ms. On commodity hardware with the default GGUF model and a 30–50 file workspace, embedding takes 3–5 minutes. Set to `600000` or higher.
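
Suggestion 2 can be sketched as a small formatting helper. This is an assumption about how the message might be assembled, not OpenClaw's actual code; `format_embed_timeout_error` and its parameters are hypothetical.

```python
CONFIG_KEY = "memory.qmd.update.embedTimeoutMs"


def format_embed_timeout_error(timeout_ms, is_default=True):
    """Build a timeout message that names the setting to raise."""
    label = "current default" if is_default else "current value"
    return (
        f"qmd embed timed out after {timeout_ms}ms; "
        f"consider increasing {CONFIG_KEY} ({label}: {timeout_ms})"
    )


message = format_embed_timeout_error(120000)
```

Naming the config key in the message lets an operator go from the log line to the fix without consulting the schema or the docs.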

## Workaround

```json5
{
  memory: {
    qmd: {
      update: {
        embedTimeoutMs: 600000,  // 10 minutes — sufficient for ~40 files on 4-core CPU
      },
    },
  },
}
```

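Alternatively, try a different GGUF model via `QMD_EMBED_MODEL`. Benchmark it first: the example below has more parameters (0.6B) than the default 300M model, so it is not guaranteed to be faster: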
```bash
export QMD_EMBED_MODEL="hf:Qwen/Qwen3-Embedding-0.6B-GGUF/Qwen3-Embedding-0.6B-Q8_0.gguf"
```

## Additional context

The gateway remains healthy throughout (Slack connected, Atlassian/GWS MCP tools working). The embed failures are WARN-level and non-blocking. However, with the default 120 s timeout the embed can never complete, so vector search stays disabled on commodity hosts like the one above once the workspace reaches a few dozen files, and the error message alone gives no path to resolution.
