[Feature]: Use QMD MCP server mode for warm model caching #9048

## Summary

When `memory.backend = "qmd"` is enabled, OpenClaw spawns a fresh `qmd query` process for each `memory_search` call. This causes **~19 second latency per search** because QMD must cold-load three GGUF models on every invocation:

- embeddinggemma-300M (query embedding)
- qwen3-reranker-0.6b (reranking)
- Qwen3-0.6B (query expansion/HyDE)

## Observed Behavior

```
$ time qmd query "test query" --limit 3
...
qmd query "test query" --limit 3  6.82s user 1.15s system 41% cpu 19.095 total
```

CPU utilization is only 41% (about 8s of CPU time across 19s of wall-clock time), which suggests most of the time is spent loading models from disk rather than computing.

By contrast, `qmd vsearch` (vector-only, 1 model) takes ~3s, and `qmd search` (BM25, no models) takes ~0.2s.
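To reproduce these numbers, a minimal Python sketch along the following lines could be used. The helper names `time_command` and `cpu_utilization` are hypothetical, and the sketch assumes `qmd` is on `PATH`.

```python
import subprocess
import time


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    # Output is discarded; only the elapsed time matters here.
    subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def cpu_utilization(user_s, system_s, wall_s):
    """Fraction of wall-clock time spent on CPU, the figure `time` reports as a percentage."""
    return (user_s + system_s) / wall_s
```

For example, `time_command(["qmd", "query", "test query", "--limit", "3"])` would time the full pipeline, and `cpu_utilization(6.82, 1.15, 19.095)` recomputes the utilization from the `time` output above. Passing `--limit` to `qmd search` and `qmd vsearch` is an assumption; the measurement above only shows it for `qmd query`.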

## Proposed Solution

QMD already exposes an MCP server (`qmd mcp`) that keeps models warm between queries. Instead of spawning fresh `qmd query` processes, OpenClaw could:

1. Start `qmd mcp` as a persistent sidecar (similar to how other MCP servers are managed)
2. Send search requests via MCP protocol instead of CLI spawn
3. Models stay loaded, so queries should drop from ~19s to an estimated ~2-3s

The QMD MCP server exposes these tools:
- `qmd_search` - BM25 keyword search
- `qmd_vsearch` - Vector semantic search
- `qmd_query` - Hybrid search with reranking
- `qmd_get` - Document retrieval
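
The sidecar approach could look roughly like the Python sketch below. It is not OpenClaw's implementation, just an illustration of the idea. It assumes `qmd mcp` speaks the standard MCP stdio transport (newline-delimited JSON-RPC 2.0) and that `qmd_query` accepts `query` and `limit` arguments; neither is confirmed here. The class name `QmdSidecar` and the `clientInfo` values are hypothetical.

```python
import json
import subprocess


def jsonrpc_request(request_id, method, params):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(message) + "\n"


class QmdSidecar:
    """Keeps one `qmd mcp` process alive so models stay loaded between searches."""

    def __init__(self, argv=("qmd", "mcp")):
        self.proc = subprocess.Popen(
            list(argv), stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1
        )
        self.next_id = 1
        # MCP handshake: an `initialize` request, then the `initialized` notification.
        self._request("initialize", {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "openclaw-sketch", "version": "0.0.0"},
        })
        note = {"jsonrpc": "2.0", "method": "notifications/initialized"}
        self.proc.stdin.write(json.dumps(note) + "\n")
        self.proc.stdin.flush()

    def _request(self, method, params):
        request_id = self.next_id
        self.next_id += 1
        self.proc.stdin.write(jsonrpc_request(request_id, method, params))
        self.proc.stdin.flush()
        # Skip blank lines and unrelated messages until the matching response arrives.
        for line in self.proc.stdout:
            if not line.strip():
                continue
            reply = json.loads(line)
            if reply.get("id") == request_id:
                return reply
        raise RuntimeError("server exited before replying")

    def query(self, text, limit=3):
        # Assumption: the `qmd_query` tool takes `query` and `limit` arguments.
        arguments = {"query": text, "limit": limit}
        return self._request("tools/call", {"name": "qmd_query", "arguments": arguments})

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

The process is started once and reused for every `query` call, so only the first search would pay the model-loading cost. A real integration would also need timeouts, restart-on-crash handling, and error-response checks, which are omitted here.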

## Alternative: Add vsearch mode option

A simpler alternative would be to add a config option such as `memory.qmd.searchMode: "vsearch"` that uses vector-only search (~3s) instead of the full query pipeline (~19s). This trades some quality (no reranking or query expansion) for a roughly 6x speedup.
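
As a rough illustration, the option could map directly onto the qmd subcommand that gets spawned. The Python sketch below is hypothetical: `build_qmd_argv` is an invented helper, the `"search"` and `"query"` mode values are assumptions beyond the `"vsearch"` value proposed here, and it assumes all three subcommands accept `--limit`.

```python
# Hypothetical mapping from `memory.qmd.searchMode` values to qmd subcommands.
SEARCH_MODE_COMMANDS = {
    "search": "search",    # BM25 keyword search, no models loaded
    "vsearch": "vsearch",  # vector-only search, one model
    "query": "query",      # full pipeline with reranking and query expansion
}


def build_qmd_argv(query, limit=3, search_mode="query"):
    """Build the argv for one CLI search; the default keeps today's behavior."""
    if search_mode not in SEARCH_MODE_COMMANDS:
        raise ValueError(f"unknown searchMode: {search_mode!r}")
    return ["qmd", SEARCH_MODE_COMMANDS[search_mode], query, "--limit", str(limit)]
```

Defaulting to `"query"` means existing configs would behave as before, and only users who opt in would trade quality for speed.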

## References

- QMD repo: https://github.com/tobi/qmd
- OpenClaw QMD skill notes: "consider keeping the process/model warm (e.g., a long-lived qmd/MCP server mode)"
- PR #3160 docs mention "First search may be slow", but in practice every search is slow because each query spawns a new process

## Environment

- OpenClaw 2026.2.2
- QMD (latest from tobi/qmd)
- macOS (Apple Silicon)
- Config: `memory.backend: "qmd"`, `memory.qmd.limits.timeoutMs: 20000`
