fix(plugins): load owning plugin for configured memory embedding provider at startup#89652

Merged
osolmaz merged 18 commits into
openclaw:mainfrom
joeykrug:fix/memory-embedding-provider-startup-loading
Jun 6, 2026
Conversation

@joeykrug

@joeykrug joeykrug commented Jun 3, 2026

Contributor

Summary

  • Gateway startup planning now loads the plugin that owns a configured memory embedding provider (agents.*.memorySearch.provider), matched against contracts.memoryEmbeddingProviders — the same capability-specific startup loading already done for speech, web search, model, image, and voice providers.
  • Previously, setting agents.defaults.memorySearch.provider = "openai" did not pull the OpenAI plugin into the startup plan, so active-memory could start with no registered memory embedding provider and silently drop to keyword/FTS-only recall.
  • Adds a Gateway-startup health warning when a configured memorySearch.provider is not registered by any loaded plugin, so the silent fallback is diagnosable from startup logs.
  • Out of scope: changing default behavior when memorySearch.provider is unset (only explicitly-configured concrete providers are collected; auto/local/none sentinels and custom models.providers.<id> openai-compatible ids are intentionally excluded).
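
The collection rule in this summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the PR: the `SketchConfig` and `SketchManifest` shapes, the layout of `agents` as a record keyed by `defaults` or an agent id, and both function names are hypothetical. It shows only the behavior described above: unset, sentinel, and custom `models.providers` ids are skipped, and the remaining ids are matched against `contracts.memoryEmbeddingProviders`.

```typescript
// Hypothetical config and manifest shapes, for illustration only.
interface SketchAgentConfig {
  memorySearch?: { provider?: string };
}
interface SketchConfig {
  // Keyed by "defaults" or an agent id (an assumed layout).
  agents?: Record<string, SketchAgentConfig | undefined>;
  models?: { providers?: Record<string, unknown> };
}
interface SketchManifest {
  id: string;
  contracts?: { memoryEmbeddingProviders?: string[] };
}

// Values that do not name a concrete provider.
const SENTINEL_PROVIDERS = new Set(["auto", "local", "none"]);

// Collect explicitly configured, concrete provider ids across all agents.
function collectConfiguredMemoryProviderIds(config: SketchConfig): Set<string> {
  const customIds = new Set(Object.keys(config.models?.providers ?? {}));
  const ids = new Set<string>();
  for (const agent of Object.values(config.agents ?? {})) {
    const provider = agent?.memorySearch?.provider?.trim();
    if (!provider) continue; // unset: default behavior is unchanged
    if (SENTINEL_PROVIDERS.has(provider) || customIds.has(provider)) continue;
    ids.add(provider);
  }
  return ids;
}

// Return ids of plugins whose manifest declares one of the configured providers.
function resolveOwningPluginIds(
  config: SketchConfig,
  manifests: SketchManifest[],
): string[] {
  const wanted = collectConfiguredMemoryProviderIds(config);
  return manifests
    .filter((m) =>
      (m.contracts?.memoryEmbeddingProviders ?? []).some((id) => wanted.has(id)),
    )
    .map((m) => m.id);
}
```

Matching on the manifest contract rather than on the plugin id keeps the lookup valid when a plugin's id differs from the provider ids it registers.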

Linked context

Closes #89651

Related #

Was this requested by a maintainer or owner? This is a contributor bug fix opened from issue #89651.

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: agents.defaults.memorySearch.provider = "openai" did not load the owning plugin at Gateway startup, so memory embeddings were never registered and recall fell back to keyword/FTS-only.
  • Real environment tested: local OpenClaw Gateway setup with agents.defaults.memorySearch.provider = "openai", active-memory enabled, and OpenAI configured as the memory embedding provider.
  • Exact steps or command run after this patch: ran live OpenClaw runtime inspection and memory recall commands after loading the OpenAI provider at startup: openclaw plugins inspect openai --json, openclaw plugins inspect active-memory --json, and a direct memory_search query.
  • Evidence after fix: copied live terminal output from the local OpenClaw runtime:
$ openclaw plugins inspect openai --json | jq '.plugin.status,.plugin.activated,.plugin.memoryEmbeddingProviderIds'
"loaded"
true
[
  "openai"
]

$ openclaw plugins inspect active-memory --json | jq '.plugin.status,.plugin.activated'
"loaded"
true

$ memory_search(query="active memory timeout check provider openai", corpus="all", maxResults=8)
provider=openai
model=text-embedding-3-large
searchMs=1625
hits=8
topHit.vectorScore=0.541...
  • Observed result after fix: the OpenAI plugin is loaded and activated, it registers memoryEmbeddingProviderIds=["openai"], active-memory is loaded and activated, and live memory recall returns vector results through provider=openai with model=text-embedding-3-large instead of keyword-only fallback.
  • What was not tested: CI does not boot Joey's live Gateway with real OpenAI credentials; it validates the startup planning and provider-registration seam directly.
  • Proof limitations or environment constraints: The live output above is from the local OpenClaw runtime where the startup loading issue was reproduced and mitigated; the PR codifies that startup planning path in upstream tests.

Tests and validation

Which commands did you run?

  • pnpm test src/plugins/channel-plugin-ids.test.ts (131 passed)
  • pnpm test src/gateway/server-startup-plugins.test.ts (8 passed)
  • pnpm test src/plugins/bundled-plugin-metadata.test.ts src/plugins/effective-plugin-ids.test.ts (40 passed)
  • pnpm test src/gateway/server-plugins.test.ts src/plugins/loader-records.test.ts (48 passed)
  • pnpm tsgo (clean), oxfmt --check on touched files (clean), pnpm build (clean, no [INEFFECTIVE_DYNAMIC_IMPORT])
  • CI follow-up: pnpm check:test-types (clean)
  • CI follow-up: pnpm deadcode:unused-files (clean; allowlist matched 36 intentional entries)
  • CI follow-up: pnpm format:check src/gateway/server-startup-plugins.test.ts scripts/deadcode-unused-files.allowlist.mjs (clean)

What regression coverage was added or updated?

  • src/plugins/channel-plugin-ids.test.ts: owning plugin included for a configured (and per-agent) memory embedding provider; sentinel/disabled providers ignored; disabled and denied owners excluded; restrictive-allowlist metadata scope keeps the owner. Fixture updated so the OpenAI plugin declares contracts.memoryEmbeddingProviders: ["openai"] (matching the real manifest).
  • src/gateway/server-startup-plugins.test.ts: warnUnregisteredConfiguredMemoryEmbeddingProviders warns for an unregistered configured provider, stays quiet when registered, skips custom models.providers ids, and skips sentinel/disabled providers.
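
The four warning cases listed for `warnUnregisteredConfiguredMemoryEmbeddingProviders` can be sketched as a pure function. The input shape, the function name `unregisteredMemoryProviderWarnings`, and the warning text are all hypothetical; the real function's signature and log wording are not shown in this PR description.

```typescript
// Hypothetical inputs: ids from config, ids registered by loaded plugins,
// and ids declared under models.providers (custom ids, skipped here).
interface ProviderDiagnosticInput {
  configured: readonly string[];
  registered: readonly string[];
  customProviderIds?: readonly string[];
}

const NON_CONCRETE_PROVIDERS = new Set(["auto", "local", "none"]);

// Return one warning message per configured provider that nothing registered.
function unregisteredMemoryProviderWarnings(
  input: ProviderDiagnosticInput,
): string[] {
  const registered = new Set(input.registered);
  const custom = new Set(input.customProviderIds ?? []);
  const seen = new Set<string>();
  const warnings: string[] = [];
  for (const id of input.configured) {
    if (NON_CONCRETE_PROVIDERS.has(id) || custom.has(id) || registered.has(id)) {
      continue; // sentinel, custom, or already registered: stay quiet
    }
    if (seen.has(id)) continue; // warn once per provider id
    seen.add(id);
    warnings.push(
      `memorySearch.provider "${id}" is configured but no loaded plugin registered it; ` +
        `recall may fall back to keyword/FTS-only`,
    );
  }
  return warnings;
}
```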

What failed before this fix, if known?

  • Before the fix, a config with only memorySearch.provider=openai produced a startup plan without openai, so no memory embedding provider was registered.

Risk checklist

Did user-visible behavior change? Yes — a configured memory-embedding provider's plugin is now loaded at startup; semantic recall works instead of silently degrading.

Did config, environment, or migration behavior change? No new config surface; this reads existing agents.*.memorySearch.provider.

Did security, auth, secrets, network, or tool execution behavior change? No.

What is the highest-risk area? Pulling an additional plugin into the Gateway startup set. It is mitigated by routing through the same allow/deny/enabled/activation policy used by the neighboring canStartConfigured*ProviderPlugin functions (fail-closed for denied/disabled), only collecting explicitly-configured concrete provider ids, and excluding sentinels and custom models.providers ids.

How is that risk mitigated? Capability-specific collection mirrors the existing voice/generation provider startup paths; both the authoritative plan path (resolveGatewayStartupPluginPlanFromRegistry) and the metadata-scope fast path (resolveGatewayStartupMetadataPluginIds) are covered so the scoped snapshot cannot under-scope the owning plugin.
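
A rough sketch of the fail-closed gating described in the two answers above, assuming a simplified policy with only a deny list and per-plugin `enabled` flags. The `OwnerPolicy` shape and both function names are hypothetical, and allowlist and activation handling are deliberately left out because their exact semantics are not spelled out here.

```typescript
// Hypothetical policy shape; the real policy has more inputs (allowlist, activation).
interface OwnerPolicy {
  deny?: readonly string[];
  entries?: Record<string, { enabled?: boolean } | undefined>;
}

// Fail closed: a denied or explicitly disabled owner never joins the startup set.
function canStartOwner(pluginId: string, policy: OwnerPolicy): boolean {
  if ((policy.deny ?? []).includes(pluginId)) return false;
  if (policy.entries?.[pluginId]?.enabled === false) return false;
  return true;
}

// Keep only the owners that policy permits to start.
function filterStartableOwners(
  ownerIds: readonly string[],
  policy: OwnerPolicy,
): string[] {
  return ownerIds.filter((id) => canStartOwner(id, policy));
}
```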

Current review state

What is the next action? Maintainer review after CI rerun.

What is still waiting on author, maintainer, CI, or external proof? CI rerun on commit d723a1f6df.

Which bot or reviewer comments were addressed? None yet.

…ider at startup

Gateway startup planning now matches agents.*.memorySearch.provider against
plugin contracts.memoryEmbeddingProviders and includes the owning plugin, the
same capability-specific startup loading already done for speech/web/model/
image/voice providers. Without this, memorySearch.provider="openai" did not
load the OpenAI plugin, so active-memory started with no registered memory
embedding provider and silently dropped to keyword/FTS-only recall.

Also warns at Gateway startup when a configured memory embedding provider is
not registered by any loaded plugin.

Closes openclaw#89651

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: M triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels Jun 3, 2026
@clawsweeper

clawsweeper Bot commented Jun 3, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 6, 2026, 8:09 AM ET / 12:09 UTC.

Summary
The branch updates Gateway startup plugin planning and diagnostics so explicit agents.*.memorySearch.provider and fallback values can load the plugin that owns matching memory embedding contracts, with tests for direct, custom, generic, disabled, and warning cases.

PR surface: Source +348, Tests +594. Total +942 across 9 files.

Reproducibility: yes, from source inspection, though I did not run a live Gateway repro in this read-only review. Current main collects sibling provider families for startup but omits memory embedding providers, matching the reported memorySearch.provider="openai" startup gap.

Review metrics: 1 noteworthy metric.

  • Startup capability added: 1 existing config capability added. agents.*.memorySearch.provider and fallback now participate in Gateway startup owner loading, which changes startup behavior for existing configs.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
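
As a small illustration of the "weaker of the two" rule, the sketch below orders the ranks as they appear in the legend further down this comment (strongest first there, weakest first in the array here). The numeric ordering and the `overallRank` name are assumptions made for this example; "off-meta tidepool" is omitted because it means the rating does not apply.

```typescript
// Ranks from weakest to strongest, inferred from the order of the rank legend.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;
type Rank = (typeof RANKS)[number];

// Overall readiness is capped by whichever of proof and patch quality is weaker.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}

// Matches this review: proof "diamond lobster", patch "platinum hermit".
console.log(overallRank("diamond lobster", "platinum hermit")); // prints "platinum hermit"
```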

Rank-up moves:

  • [P2] Maintainer should explicitly accept the earlier startup activation behavior for existing explicit memory provider configs before merge.

Risk before merge

Maintainer options:

  1. Accept explicit-provider startup activation (recommended)
    Land once maintainers accept that explicit memory provider configs should activate the owning plugin at Gateway startup instead of silently degrading to FTS-only recall.
  2. Keep runtime fail-fast separate
    Do not expand this branch into runtime provider-loss behavior; leave that separately reviewable in fix(memory): fail fast when embeddings provider is unavailable #90336.
  3. Pause for upgrade proof
    Pause if maintainers want additional upgrade proof for setups where OpenAI, Ollama, Gemini, or similar provider plugins were not previously started by memory config alone.

Next step before merge

  • [P2] No narrow automated repair remains; maintainers need to accept the startup compatibility change and keep the runtime fail-fast work separate.

Security
Cleared: The diff does not change workflows, dependencies, lockfiles, secrets handling, package scripts, or external code execution paths.

Review details

Best possible solution:

Land the startup-planning fix if maintainers accept earlier provider-plugin activation for explicit memory provider configs, while keeping the runtime fail-fast/race work in #90336.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection, though I did not run a live Gateway repro in this read-only review. Current main collects sibling provider families for startup but omits memory embedding providers, matching the reported memorySearch.provider="openai" startup gap.

Is this the best way to solve the issue?

Yes, this appears to be the right layer for the startup part: Gateway planning already owns manifest-based provider startup, and the PR mirrors sibling capability paths plus adds diagnostics. The runtime fail-fast/provider visibility problem should remain separate in the linked runtime work.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 0a08625d795f.

Label changes

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied live terminal output from a local Gateway showing OpenAI and active-memory loaded and vector memory recall using provider=openai after the patch.

Label justifications:

  • P2: This is a normal-priority provider startup bug fix with limited blast radius but real memory recall impact.
  • merge-risk: 🚨 compatibility: The PR can intentionally change upgrade/startup behavior by loading configured memory provider plugins earlier than before.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes copied live terminal output from a local Gateway showing OpenAI and active-memory loaded and vector memory recall using provider=openai after the patch.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes copied live terminal output from a local Gateway showing OpenAI and active-memory loaded and vector memory recall using provider=openai after the patch.
Evidence reviewed

PR surface:

Source +348, Tests +594. Total +942 across 9 files.

Area       Files  Added  Removed  Net
Source     7      408    60       +348
Tests      2      594    0        +594
Docs       0      0      0        0
Config     0      0      0        0
Generated  0      0      0        0
Other      0      0      0        0
Total      9      1002   60       +942

What I checked:

  • Repository policy read: Root AGENTS.md and scoped src/plugins/AGENTS.md plus src/gateway/AGENTS.md were read; their plugin metadata, startup, compatibility, and review-depth guidance applies to this PR. (AGENTS.md:1, 0a08625d795f)
  • Current main omits memory embedding providers from startup collection: On current main, collectConfiguredProviderIds includes speech, web search, generation, and voice provider ids but no memory embedding provider ids, so the linked startup gap remains necessary to fix. (src/plugins/gateway-startup-plugin-ids.ts:579, 0a08625d795f)
  • PR adds memory provider owner resolution: The PR adds collectConfiguredMemoryEmbeddingStartupProviderOwners, resolving primary/fallback memory providers and custom models.providers.<id>.api owners before adding them to startup provider ids. (src/plugins/gateway-startup-plugin-ids.ts:607, b7648ca5b362)
  • PR adds post-load diagnostics: The PR warns only after the full startup runtime load when configured memory embedding providers remain unregistered, avoiding setup-runtime pre-bind false positives. (src/gateway/server-startup-plugins.ts:193, b7648ca5b362)
  • Runtime memory config contract checked: Current memory runtime defaults auto/missing provider to openai, inherits per-agent overrides from defaults, and resolves custom provider ids through models.providers.<id>.api, matching the PR's startup-planning shape. (src/agents/memory-search.ts:190, 0a08625d795f)
  • Generic embedding bridge checked: Memory-core can adapt generic embeddingProviders into memory embedding adapters, so including generic embedding contract owners in startup planning is consistent with current runtime behavior. (extensions/memory-core/src/memory/embeddings.ts:113, 0a08625d795f)
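
The runtime memory config contract noted in the list above (per-agent overrides inherit from defaults, and an `auto` or missing provider resolves to `openai`) could look roughly like this. The `MemorySearchSettings` shape and the function name are hypothetical, and this covers only the runtime resolution described in that bullet, not startup planning.

```typescript
// Hypothetical settings shape for illustration.
interface MemorySearchSettings {
  provider?: string;
}

// A per-agent value overrides defaults; "auto" or a missing value resolves to "openai".
function resolveEffectiveMemoryProvider(
  defaults?: MemorySearchSettings,
  agent?: MemorySearchSettings,
): string {
  const raw = (agent?.provider ?? defaults?.provider)?.trim();
  return !raw || raw === "auto" ? "openai" : raw;
}
```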

Likely related people:

  • vincentkoc: Current main blame for the central startup provider collection and memory provider runtime paths points to d4b4a658094e, which recently touched provider preservation across these files. (role: recent area contributor; confidence: medium; commits: d4b4a658094e; files: src/plugins/gateway-startup-plugin-ids.ts, src/agents/memory-search.ts, extensions/memory-core/src/memory/embeddings.ts)
  • steipete: Blame on loadGatewayStartupPluginRuntime shows prior Gateway startup runtime boundary work in May 2026, which is the caller now receiving the new warning hook. (role: gateway startup area contributor; confidence: medium; commits: 250376f88577, 85beee613c64, 0ee52e940547; files: src/gateway/server-startup-plugins.ts)
  • osolmaz: The PR branch history and the related open runtime fail-fast PR show substantial recent work on this same memory provider startup/runtime boundary, even though this is not current-main provenance. (role: adjacent follow-up owner; confidence: medium; commits: 66da521fcdc7, 06d98b022184, 15aca4242c33; files: src/plugins/gateway-startup-plugin-ids.ts, src/gateway/server-startup-plugins.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. labels Jun 3, 2026
@openclaw-barnacle openclaw-barnacle Bot added scripts Repository scripts proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels Jun 3, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 3, 2026
CHANGELOG.md is release-owned; PR release-note context belongs in the PR body
and commit messages. Per ClawSweeper review, remove the changelog line this PR
added so release generation owns CHANGELOG.md. Behavior context stays in the
fix commit body and PR description.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 3, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 3, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 3, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. merge-risk: 🚨 automation 🚨 May affect CI, automerge, proof capture, label sync, or maintainer automation. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 3, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 4, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@openclaw-barnacle openclaw-barnacle Bot added size: XL and removed size: L proof: sufficient ClawSweeper judged the real behavior proof convincing. labels Jun 6, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 6, 2026
@osolmaz

osolmaz commented Jun 6, 2026

Member

Implementation-loop closeout for b7648ca.

Summary:

  • Refactored memory embedding startup ownership around config-only provider owner resolution.
  • Gateway startup now loads owners for explicit memorySearch.provider and memorySearch.fallback, including custom models.providers API owner ids and generic embedding provider contracts.
  • Startup planning skips auto/local/none, disabled memory search, disabled memory slot, and setup-runtime pre-bind warning paths.
  • Unregistered provider diagnostics now run after the full startup runtime load.
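
The owner resolution summarized above might be sketched like this. Everything here is an assumption made for illustration: the input shapes, the `resolveMemoryOwnerIds` name, and in particular the `apiOwners` lookup table, which stands in for however the real code maps a custom provider's `api` value to an owning plugin.

```typescript
// Hypothetical shapes for illustration only.
interface MemorySearchOwnerInput {
  enabled?: boolean;
  provider?: string;
  fallback?: string;
}
type CustomProviders = Record<string, { api?: string } | undefined>;

const SKIPPED_PROVIDER_VALUES = new Set(["auto", "local", "none"]);

// Resolve owner ids for the primary and fallback providers from config alone.
function resolveMemoryOwnerIds(
  memorySearch: MemorySearchOwnerInput,
  customProviders: CustomProviders,
  apiOwners: Record<string, string>, // assumed mapping: api value -> owning plugin id
): string[] {
  if (memorySearch.enabled === false) return []; // disabled memory search loads nothing
  const owners = new Set<string>();
  for (const raw of [memorySearch.provider, memorySearch.fallback]) {
    const id = raw?.trim();
    if (!id || SKIPPED_PROVIDER_VALUES.has(id)) continue;
    const custom = customProviders[id];
    if (custom) {
      // Custom models.providers.<id>: the owner comes from its api value, if known.
      const owner = custom.api ? apiOwners[custom.api] : undefined;
      if (owner) owners.add(owner);
    } else {
      owners.add(id); // concrete provider id: matched against plugin contracts later
    }
  }
  return Array.from(owners);
}
```

Using a set keeps the result free of duplicates when the provider and the fallback resolve to the same owner.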

Validation:

  • pnpm lint --threads=8: pass
  • node scripts/run-vitest.mjs src/plugins/channel-plugin-ids.test.ts src/gateway/server-startup-plugins.test.ts src/plugins/embedding-provider-runtime.test.ts src/plugins/memory-embedding-provider-runtime.test.ts: pass, 193 tests across 2 Vitest shards
  • pnpm tsgo:core: pass
  • pnpm tsgo:core:test: pass
  • git diff --check: pass
  • codex review --base main: pass, no actionable correctness issues / no P0 or P1 findings
  • Earlier topology proof after the runtime-import refactor: node --import tsx scripts/check-import-cycles.ts and node --import tsx scripts/check-madge-import-cycles.ts both passed

CI:

  • Head: b7648ca
  • Relevant CI checks for this PR are passing, including check-lint, check-prod-types, check-test-types, check-guards, runtime topology, extension bundled/package-boundary, docs, and selected startup/gateway/agentic shards.
  • Known unrelated failure: check-additional-boundaries-bcd fails lint:plugins:no-extension-test-core-imports on existing extension live tests, extensions/google/google.live.test.ts and extensions/minimax/minimax.live.test.ts. This PR does not touch those files or extension live-test imports.

Remaining maintainer decision:

@osolmaz osolmaz self-assigned this Jun 6, 2026
@osolmaz

osolmaz commented Jun 6, 2026

Member

Maintainer land-ready note for b7648ca.

I am accepting the product behavior tradeoff here: explicit memorySearch.provider and memorySearch.fallback configs should activate the owning provider plugin during Gateway startup, rather than silently allowing semantic memory to degrade when the plugin was not loaded.

Validation reviewed before merge:

  • node scripts/run-vitest.mjs src/plugins/channel-plugin-ids.test.ts src/gateway/server-startup-plugins.test.ts src/plugins/embedding-provider-runtime.test.ts src/plugins/memory-embedding-provider-runtime.test.ts: pass, 193 tests across 2 shards
  • pnpm lint --threads=8: pass
  • pnpm tsgo:core: pass
  • pnpm tsgo:core:test: pass
  • git diff --check: pass
  • codex review --base main: no actionable correctness issues / no P0 or P1
  • import-cycle and madge topology checks passed after the startup-runtime import refactor

CI reviewed before merge:

  • No required checks are configured for this branch.
  • Relevant PR checks are passing, including check-lint, check-prod-types, check-test-types, check-guards, runtime topology, bundled/package-boundary, docs, memory/provider critical quality, and selected startup/gateway/agentic shards.
  • Known unrelated baseline failures accepted for this merge: build-artifacts and check-additional-boundaries-bcd both fail on existing extension live-test boundary offenders, extensions/google/google.live.test.ts and extensions/minimax/minimax.live.test.ts. This PR does not touch those files.

Related runtime work:

@osolmaz osolmaz merged commit daab68e into openclaw:main Jun 6, 2026
167 of 170 checks passed
@osolmaz

osolmaz commented Jun 6, 2026

Member

Merged by direct manual squash merge.

Proof used for the merge is recorded in the land-ready note above. The only accepted CI exceptions were unrelated baseline failures in build-artifacts and check-additional-boundaries-bcd, both from existing extension live-test boundary offenders in extensions/google/google.live.test.ts and extensions/minimax/minimax.live.test.ts.

Labels

gateway Gateway runtime merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P2 Normal backlog priority with limited blast radius. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: XL status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Development

Successfully merging this pull request may close these issues.

Gateway startup does not load the plugin owning a configured memory embedding provider (memorySearch.provider)

2 participants