Summary
active-memory's embedded recall path can intermittently run memory_search with no visible memory embedding provider, even though the Gateway, checked immediately afterwards, reports the configured OpenAI memory embedding provider as loaded, active, and healthy.
When this happens, memory-core logs:
[memory] search: embeddings unavailable; using keyword-only results: Cannot embed query in FTS-only mode (no embedding provider)
The user-facing turn then shows an active-memory timeout, for example:
Active Memory: status=timeout elapsed=30.0s query=recent
This is related to, but not the same as, #89651 / #89652. That issue/PR addresses the startup gap where agents.defaults.memorySearch.provider = "openai" did not load the plugin owning contracts.memoryEmbeddingProviders: ["openai"]. The behavior here still appears after the OpenAI plugin is loaded and health checks are green, so it looks like a runtime provider hydration / provider registry visibility race in the embedded active-memory search path.
Why this is bad
active-memory is a blocking before_prompt_build hook. A single providerless memory_search can delay normal replies for the full active-memory watchdog window.
In the observed setup:
active-memory.timeoutMs = 30000
active-memory.circuitBreakerMaxTimeouts = 3
active-memory.circuitBreakerCooldownMs = 180000
So the Gateway can repeatedly spend ~30 seconds on hidden recall before skipping future active-memory attempts.
Also, timeout_partial is not triggered in this case because the hidden active-memory subagent has not produced any assistant summary text yet. It is still waiting on memory_search, so the timeout result has no partial answer to recover.
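To put numbers on the delay, here is a rough worked example using the values above. It assumes the circuit breaker opens after circuitBreakerMaxTimeouts consecutive timeouts and then skips recall for circuitBreakerCooldownMs. That reading is inferred from the option names and has not been checked against the active-memory source. The helper name worstCaseBlockedMs is hypothetical.

```typescript
// Hypothetical helper. The option names come from the observed config, but
// the breaker semantics (open after N consecutive timeouts) are an assumption.
interface ActiveMemoryTimeouts {
  timeoutMs: number;
  circuitBreakerMaxTimeouts: number;
  circuitBreakerCooldownMs: number;
}

// Worst-case time users spend waiting on hidden recall before the breaker opens.
function worstCaseBlockedMs(cfg: ActiveMemoryTimeouts): number {
  return cfg.timeoutMs * cfg.circuitBreakerMaxTimeouts;
}

const observedTimeouts: ActiveMemoryTimeouts = {
  timeoutMs: 30000,
  circuitBreakerMaxTimeouts: 3,
  circuitBreakerCooldownMs: 180000,
};

const blockedMs = worstCaseBlockedMs(observedTimeouts);
console.log(`blocked up to ${blockedMs / 1000}s across ${observedTimeouts.circuitBreakerMaxTimeouts} turns`);
console.log(`recall then skipped for ${observedTimeouts.circuitBreakerCooldownMs / 1000}s`);
```

Under that assumption, the observed setup can cost up to 90 seconds of reply latency spread over three turns before the breaker starts skipping recall.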
Environment / config shape
Version:
OpenClaw 2026.5.28 (e932160)
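Relevant config:
{
  "plugins": {
    "entries": {
      "active-memory": {
        "enabled": true,
        "config": {
          "enabled": true,
          "agents": ["main"],
          "allowedChatTypes": ["direct"],
          "queryMode": "recent",
          "promptStyle": "balanced",
          "timeoutMs": 30000,
          "maxSummaryChars": 800,
          "recentUserTurns": 3,
          "recentAssistantTurns": 2,
          "cacheTtlMs": 30000,
          "circuitBreakerMaxTimeouts": 3,
          "circuitBreakerCooldownMs": 180000,
          "model": "anthropic/claude-sonnet-4-6"
        }
      },
      "memory-core": { "enabled": true },
      "openai": { "enabled": true }
    }
  },
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": true,
        "sources": ["memory", "sessions"],
        "experimental": { "sessionMemory": true },
        "provider": "openai",
        "model": "text-embedding-3-large",
        "query": {
          "hybrid": {
            "enabled": true,
            "vectorWeight": 0.7,
            "textWeight": 0.3,
            "candidateMultiplier": 4,
            "mmr": { "enabled": true, "lambda": 0.7 }
          }
        }
      }
    }
  }
}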
Immediately after the failure, openclaw memory status --deep --json reported the main agent memory backend as healthy:
{
  "provider": "openai",
  "model": "text-embedding-3-large",
  "requestedProvider": "openai",
  "sources": ["memory", "sessions"],
  "fts": { "enabled": true, "available": true },
  "vector": {
    "enabled": true,
    "storeAvailable": true,
    "semanticAvailable": true,
    "available": true,
    "dims": 3072
  },
  "custom": {
    "searchMode": "hybrid",
    "providerState": { "mode": "active", "providerId": "openai" }
  },
  "embeddingProbe": { "ok": true }
}
And openclaw plugins inspect openai --json reported the OpenAI plugin as loaded/activated and advertising/registering the memory embedding contract:
{
  "plugin": {
    "id": "openai",
    "enabled": true,
    "activated": true,
    "status": "loaded",
    "memoryEmbeddingProviderIds": ["openai"],
    "contracts": {
      "memoryEmbeddingProviders": ["openai"]
    }
  }
}
Observed behavior
Timeline from a real run:
- active-memory begins an embedded recall for queryMode=recent.
- The hidden recall subagent calls memory_search.
- memory-core tries to run hybrid search, loads keyword candidates, then attempts to embed the query.
- In this embedded path, this.provider is null, so embedQueryWithRetry() throws:
  Cannot embed query in FTS-only mode (no embedding provider)
- The search path catches that and logs keyword-only fallback:
  memory search: embeddings unavailable; using keyword-only results: Cannot embed query in FTS-only mode (no embedding provider)
- The hidden recall does not produce assistant summary text before the active-memory watchdog fires.
- The user-facing turn gets only:
  Active Memory: status=timeout elapsed=30.0s query=recent
- Health checks immediately after show OpenAI memory embeddings active and probeable.
Source pointers
The failure string is thrown when the memory manager has no provider handle:
// extensions/memory-core/src/memory/manager-embedding-ops.ts
protected async embedQueryWithRetry(text: string): Promise<number[]> {
  const provider = this.provider;
  if (!provider) {
    throw new Error("Cannot embed query in FTS-only mode (no embedding provider)");
  }
  ...
}
The hybrid search path catches that and deliberately falls back to keyword-only results if FTS is available:
// extensions/memory-core/src/memory/manager.ts
try {
  queryVec = await this.embedQueryWithRetry(cleaned);
} catch (err) {
  ...
  if (activatedFallback) {
    ...
  } else if (!this.provider && this.fts.enabled && this.fts.available) {
    log.warn(`memory search: embeddings unavailable; using keyword-only results: ${message}`);
    return this.selectScoredResults(keywordResults, maxResults, minScore, 0);
  } else {
    throw err;
  }
}
That fallback is reasonable when the runtime is genuinely FTS-only. The bug is that this path is being entered while the configured provider is openai and the Gateway reports that provider as loaded/active/healthy before and after the failed active-memory turn.
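The distinction between "genuinely FTS-only" and "configured provider not visible" can be made explicit at the fallback branch. The following is a minimal sketch, not the memory-core implementation: the types, the decideFallback function, and the registeredProviderIds input are all hypothetical, and treating "none" and "local" as the values that never need a plugin-owned provider is an assumption taken from the wording of the suggested fix.

```typescript
// Hypothetical names throughout. Only the "no provider handle, FTS usable"
// condition is taken from the manager.ts excerpt; the activatedFallback
// branch is omitted.
type FallbackDecision =
  | { kind: "rethrow" }
  | { kind: "keyword-only" }
  | {
      kind: "provider-unavailable";
      requestedProvider: string;
      registeredProviderIds: string[];
    };

interface FallbackInput {
  requestedProvider: string | undefined; // e.g. agents.defaults.memorySearch.provider
  hasProviderHandle: boolean; // this.provider is not null
  ftsUsable: boolean; // this.fts.enabled && this.fts.available
  registeredProviderIds: string[]; // ids visible to this runtime right now
}

// Assumption: these configured values never require a plugin-owned provider.
const PROVIDERLESS_OK = new Set(["none", "local"]);

function decideFallback(input: FallbackInput): FallbackDecision {
  if (input.hasProviderHandle || !input.ftsUsable) {
    return { kind: "rethrow" }; // same outcome as the existing final else branch
  }
  const requested = input.requestedProvider;
  if (requested === undefined || PROVIDERLESS_OK.has(requested)) {
    return { kind: "keyword-only" }; // genuinely FTS-only runtime
  }
  // A provider was configured but this manager cannot see it: surface that
  // with diagnostics instead of silently degrading.
  return {
    kind: "provider-unavailable",
    requestedProvider: requested,
    registeredProviderIds: [...input.registeredProviderIds],
  };
}

const fallbackDecision = decideFallback({
  requestedProvider: "openai",
  hasProviderHandle: false,
  ftsUsable: true,
  registeredProviderIds: [],
});
console.log(fallbackDecision.kind);
```

With the inputs from this report (provider configured as openai, no provider handle, FTS usable), the sketch returns provider-unavailable rather than keyword-only, which is the case a caller such as active-memory could fail fast on.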
active-memory then races the recall subagent against its watchdog:
// extensions/active-memory/index.ts
const controller = new AbortController();
const timeoutId = setTimeout(() => {
  controller.abort(new Error(`active-memory timeout after ${watchdogTimeoutMs}ms`));
}, watchdogTimeoutMs);
...
const raceResult = await Promise.race([
  subagentPromise,
  timeoutPromise,
  terminalMemorySearchWatch.promise,
]);
If no assistant text has been written yet, the timeout result is plain status: "timeout", not timeout_partial:
// extensions/active-memory/index.ts
const summary = truncateSummary(normalizeActiveSummary(rawReply ?? "") ?? "", params.maxSummaryChars);
if (summary.length === 0) {
  return { status: "timeout", elapsedMs: params.elapsedMs, summary: null, searchDebug };
}
return { status: "timeout_partial", elapsedMs: params.elapsedMs, summary, searchDebug };
Expected behavior
If agents.defaults.memorySearch.provider = "openai" and the OpenAI plugin is loaded/activated with memoryEmbeddingProviderIds: ["openai"] registered, every memory_search invocation from active-memory should see the same provider registry and use vector/hybrid search.
If the configured provider is temporarily not visible, the system should fail fast with a structured provider-unavailable result and enough diagnostics to identify the runtime/provider scope, rather than silently doing a slow FTS-only fallback inside a blocking pre-prompt hook.
Actual behavior
One embedded active-memory recall can see this.provider === null and enter FTS-only/keyword-only fallback, while separate live checks show:
- provider=openai
- model=text-embedding-3-large
- providerState=active
- vector semantic search available
- embedding probe OK
- OpenAI plugin loaded and activated
- memoryEmbeddingProviderIds=["openai"]
Suspected root cause
Likely a provider hydration / runtime registry visibility race specific to embedded active-memory memory tool execution. Possibilities:
- the memory search manager instance is created before plugin-owned memory embedding providers are visible and is later reused in providerless state;
- the embedded recall runtime has a different runtime config / provider registry snapshot than the main Gateway runtime;
- active-memory's hidden subagent/tool context gets memory_search before capability provider hydration completes;
- experimental.sessionMemory=true / sources=["memory", "sessions"] increases the cost of the FTS-only fallback enough that the active-memory watchdog consistently wins the race;
- parent abort propagation does not fully cancel or drain the in-flight memory_search, so the child operation can keep running until its own tool watchdog.
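The first possibility (a manager created too early and then reused) can be illustrated with a small model. This is only a sketch of the hypothesised failure mode: ProviderRegistry, SnapshotManager, and RecheckingManager are hypothetical stand-ins, not the memory-core classes, and nothing here confirms that the real manager resolves its provider this way.

```typescript
// Hypothetical stand-ins for the provider registry and the memory manager.
interface EmbeddingProvider {
  id: string;
}

class ProviderRegistry {
  private readonly providers = new Map<string, EmbeddingProvider>();

  register(provider: EmbeddingProvider): void {
    this.providers.set(provider.id, provider);
  }

  get(id: string): EmbeddingProvider | null {
    return this.providers.get(id) ?? null;
  }
}

// Resolves the provider once, at construction, and never looks again.
class SnapshotManager {
  private readonly provider: EmbeddingProvider | null;

  constructor(registry: ProviderRegistry, requestedId: string) {
    this.provider = registry.get(requestedId);
  }

  searchMode(): "hybrid" | "fts-only" {
    return this.provider ? "hybrid" : "fts-only";
  }
}

// Re-checks the registry on each search while it still has no provider handle.
class RecheckingManager {
  private provider: EmbeddingProvider | null;
  private readonly registry: ProviderRegistry;
  private readonly requestedId: string;

  constructor(registry: ProviderRegistry, requestedId: string) {
    this.registry = registry;
    this.requestedId = requestedId;
    this.provider = registry.get(requestedId);
  }

  searchMode(): "hybrid" | "fts-only" {
    if (this.provider === null) {
      this.provider = this.registry.get(this.requestedId);
    }
    return this.provider ? "hybrid" : "fts-only";
  }
}

const providerRegistry = new ProviderRegistry();
const staleManager = new SnapshotManager(providerRegistry, "openai"); // created too early
const recheckingManager = new RecheckingManager(providerRegistry, "openai");
providerRegistry.register({ id: "openai" }); // plugin finishes loading afterwards

console.log(staleManager.searchMode()); // fts-only
console.log(recheckingManager.searchMode()); // hybrid
```

In this model the registry is healthy by the time anyone inspects it, yet the cached manager stays providerless, which matches the shape of the observed symptom: green health checks alongside an FTS-only search.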
Suggested fix shape
- Add regression coverage for active-memory invoking memory_search with:
  - agents.defaults.memorySearch.provider = "openai"
  - plugin-owned memoryEmbeddingProviders: ["openai"]
  - sources=["memory", "sessions"]
  - experimental.sessionMemory=true
  - the OpenAI memory embedding provider already loaded/registered.
- Ensure the memory manager used by dynamic/embedded tool calls rehydrates or rechecks the configured provider before accepting FTS-only fallback when the configured provider is non-local and non-none.
- Add diagnostics on the FTS-only fallback branch:
  - requested provider id
  - available registered memory embedding provider ids
  - plugin ids loaded in the active runtime
  - agent id / session key / embedded-active-memory marker
  - whether the manager was reused from cache and when its provider was resolved.
- Treat provider-null fallback differently from normal zero-hit FTS results in active-memory. A configured-provider-missing result should become a structured memory_search unavailable result quickly, so the active-memory hook can fast-fail instead of waiting for the full watchdog.
- Make active-memory timeout abort propagate down to in-flight memory_search and embedding/query work, or ensure late child work is cancelled/drained immediately after the parent timeout.
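For the last item, abort propagation could look roughly like the sketch below, which passes the watchdog's AbortSignal into the child work so that a timeout stops the work itself and not only the wait. slowSearch and recallWithWatchdog are hypothetical stand-ins. The real memory_search call path, and whether it accepts a signal today, are not known from this report.

```typescript
// Hypothetical stand-in for in-flight memory_search work. It honours an
// AbortSignal so the parent watchdog can stop it.
function slowSearch(signal: AbortSignal, workMs: number): Promise<string> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(signal.reason);
      return;
    }
    const onAbort = () => {
      clearTimeout(timer); // cancel the child work, not just the wait
      reject(signal.reason);
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve("results");
    }, workMs);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

type RecallOutcome = { status: "ok"; value: string } | { status: "timeout" };

// Runs the child work under a watchdog and shares one signal with it.
async function recallWithWatchdog(workMs: number, watchdogMs: number): Promise<RecallOutcome> {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => {
    controller.abort(new Error(`active-memory timeout after ${watchdogMs}ms`));
  }, watchdogMs);
  try {
    const value = await slowSearch(controller.signal, workMs);
    return { status: "ok", value };
  } catch (err) {
    if (controller.signal.aborted) {
      return { status: "timeout" };
    }
    throw err;
  } finally {
    clearTimeout(timeoutId);
  }
}

// Work that takes 50ms under a 10ms watchdog is cancelled and reported as a timeout.
recallWithWatchdog(50, 10).then((outcome) => console.log(outcome.status));
```

Because the child clears its own timer on abort, nothing keeps running after the parent gives up, which is the property the last bullet asks for.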
Related work
#89651 / #89652 address the startup gap described in the Summary. This issue is the follow-up runtime case: the provider can be loaded and healthy, but an embedded active-memory memory_search call still sometimes observes no provider and drops into FTS-only mode.