
Commit ae4806e

dutifulbob, mbelinky, clawsweeper[bot], and osolmaz authored
feat(plugins): add embedding provider contract (#84947)
Summary:
- Merged feat(plugins): add embedding provider contract after ClawSweeper review.

Automerge notes:
- PR branch already contained follow-up commit before automerge: chore(plugins): refresh embedding provider sdk baseline
- PR branch already contained follow-up commit before automerge: docs(plugins): document embedding provider contract
- PR branch already contained follow-up commit before automerge: fix(plugins): restore embedding providers after snapshot loads
- PR branch already contained follow-up commit before automerge: fix(plugins): resolve embedding providers from manifests
- PR branch already contained follow-up commit before automerge: fix(plugin-sdk): keep embedding provider registry mutators internal
- PR branch already contained follow-up commit before automerge: chore(plugin-sdk): refresh embedding provider API baseline

Validation:
- ClawSweeper review passed for head 41ebd66.
- Required merge gates passed before the squash merge.

Prepared head SHA: 41ebd66
Review: #84947 (comment)

Co-authored-by: Bob <dutifulbob@gmail.com>
Co-authored-by: Mariano Belinky <mbelinky@gmail.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: osolmaz
Co-authored-by: osolmaz <2453968+osolmaz@users.noreply.github.com>
1 parent 0a4de3d commit ae4806e

47 files changed

Lines changed: 886 additions & 45 deletions


CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Docs: https://docs.openclaw.ai
 - Discord: allow configuring a bounded `agentComponents.ttlMs` callback registry lifetime for long-running component workflows, with per-account overrides and a 24-hour cap. (#84189) Thanks @100menotu001.
 - Plugin SDK: add row-level session workflow helpers and deprecate `loadSessionStore` so plugins can read and patch sessions without depending on the legacy whole-store shape. (#84693) Thanks @efpiva.
 - Gateway/plugins: reuse a compatible Gateway startup plugin registry during dispatch so safe plugin dispatches avoid redundant registry loading. (#84324) Thanks @ai-hpc.
+- Plugins/SDK: add a general `embeddingProviders` capability contract and registration API so embeddings can become a reusable provider surface outside memory-specific adapters.
 - Dependencies: refresh provider, plugin, UI, and tooling packages, update `protobufjs` to 8.4.0 to clear the current npm advisory, and carry the Claude ACP completion patch forward to `@agentclientprotocol/claude-agent-acp` 0.36.1.
 - Agents/tools: remove the old sender-owner tool gating path so configured tools stay visible for trusted sessions while command and channel-action auth still carry real sender identity.
 - QA-Lab: add curated mock JSONL replay fixtures and first-drift reporting for runtime-parity audits. (#80323, refs #80176) Thanks @100yenadmin.

docs/plugins/adding-capabilities.md

Lines changed: 15 additions & 1 deletion
@@ -15,7 +15,8 @@ sidebarTitle: "Adding capabilities"
 load pipeline, runtime helpers), see [Plugin internals](/plugins/architecture).
 </Info>

-Use this when OpenClaw needs a new shared domain such as image generation, video generation, or some future vendor-backed feature area.
+Use this when OpenClaw needs a new shared domain such as embeddings, image
+generation, video generation, or some future vendor-backed feature area.

 The rule:

@@ -113,6 +114,19 @@ The config key is intentionally separate from vision-analysis routing:

 Keep those separate so fallback and policy remain explicit.

+## Embedding providers
+
+Use `embeddingProviders` for reusable vector embedding providers. This contract
+is intentionally broader than memory: tools, search, retrieval, importers, or
+future feature plugins can consume embeddings without depending on the memory
+engine.
+
+For memory-engine-specific adapters, keep using `memoryEmbeddingProviders`.
+Those adapters own memory indexing details such as query/document split,
+runtime metadata, and local memory engine setup. Do not make a generic
+embedding provider depend on memory-owned modules unless the provider is only
+usable by memory.
+
 ## Review checklist

 Before shipping a new capability, verify:
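
The `## Embedding providers` section added above says that tools, search, and retrieval can consume embeddings without depending on the memory engine. As a rough illustration of such a consumer, here is a minimal sketch that ranks already-embedded documents against a query vector by cosine similarity. The names `cosineSimilarity`, `rankBySimilarity`, and `EmbeddedDocument` are hypothetical and not part of the OpenClaw SDK; how the vectors are obtained from a registered provider is deliberately left out.

```typescript
// Hypothetical consumer-side types; nothing here comes from the OpenClaw SDK.
interface EmbeddedDocument {
  id: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / Math.sqrt(normA * normB);
}

// Scores every document against the query and sorts best match first.
function rankBySimilarity(
  query: number[],
  documents: EmbeddedDocument[],
): Array<{ id: string; score: number }> {
  return documents
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.vector) }))
    .sort((left, right) => right.score - left.score);
}
```

Nothing in this sketch touches memory indexing, which is the separation the new section asks generic embedding consumers to keep.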

docs/plugins/architecture.md

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ Capabilities are the public **native plugin** model inside OpenClaw. Every nativ
 | ---------------------- | ------------------------------------------------ | ------------------------------------ |
 | Text inference | `api.registerProvider(...)` | `openai`, `anthropic` |
 | CLI inference backend | `api.registerCliBackend(...)` | `openai`, `anthropic` |
+| Embeddings | `api.registerEmbeddingProvider(...)` | Provider-owned vector plugins |
 | Speech | `api.registerSpeechProvider(...)` | `elevenlabs`, `microsoft` |
 | Realtime transcription | `api.registerRealtimeTranscriptionProvider(...)` | `openai` |
 | Realtime voice | `api.registerRealtimeVoiceProvider(...)` | `openai` |

docs/plugins/manifest.md

Lines changed: 49 additions & 41 deletions
Large diffs are not rendered by default.

docs/plugins/sdk-overview.md

Lines changed: 7 additions & 0 deletions
@@ -95,6 +95,7 @@ methods:
 | `api.registerAgentHarness(...)` | Experimental low-level agent executor |
 | `api.registerCliBackend(...)` | Local CLI inference backend |
 | `api.registerChannel(...)` | Messaging channel |
+| `api.registerEmbeddingProvider(...)` | Reusable vector embedding provider |
 | `api.registerSpeechProvider(...)` | Text-to-speech / STT synthesis |
 | `api.registerRealtimeTranscriptionProvider(...)` | Streaming realtime transcription |
 | `api.registerRealtimeVoiceProvider(...)` | Duplex realtime voice sessions |
@@ -105,6 +106,12 @@ methods:
 | `api.registerWebFetchProvider(...)` | Web fetch / scrape provider |
 | `api.registerWebSearchProvider(...)` | Web search |

+Embedding providers registered with `api.registerEmbeddingProvider(...)` must
+also be listed in `contracts.embeddingProviders` in the plugin manifest. This
+is the generic embedding surface for reusable vector generation. Memory-only
+adapters still use `api.registerMemoryEmbeddingProvider(...)` and
+`contracts.memoryEmbeddingProviders`.
+
 ### Tools and commands

 Use [`defineToolPlugin`](/plugins/tool-plugins) for simple tool-only plugins
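
The paragraph added in this hunk requires every id passed to `api.registerEmbeddingProvider(...)` to also appear in `contracts.embeddingProviders`. The following is a minimal sketch of that rule as a standalone check. It assumes the contract field is an array of provider id strings; the manifest diff is not rendered on this page, so the `SketchManifest` shape and the `findUndeclaredEmbeddingProviders` helper are hypothetical and not part of the SDK.

```typescript
// Assumed manifest shape for illustration only. The real manifest schema is
// documented in docs/plugins/manifest.md and may differ.
interface SketchManifest {
  id: string;
  contracts?: {
    embeddingProviders?: string[];
    memoryEmbeddingProviders?: string[];
  };
}

// Returns the registered ids that the manifest does not declare under
// `contracts.embeddingProviders`. An empty result means the two lists agree.
function findUndeclaredEmbeddingProviders(
  manifest: SketchManifest,
  registeredIds: string[],
): string[] {
  const declared =
    (manifest.contracts && manifest.contracts.embeddingProviders) || [];
  return registeredIds.filter((id) => declared.indexOf(id) === -1);
}

// Example manifests: one declares the generic contract, one only the
// memory-specific contract.
const genericManifest: SketchManifest = {
  id: "acme-ai",
  contracts: { embeddingProviders: ["acme-ai"] },
};

const memoryOnlyManifest: SketchManifest = {
  id: "acme-ai",
  contracts: { memoryEmbeddingProviders: ["acme-ai"] },
};
```

In this sketch a memory-only declaration does not satisfy the generic contract, which mirrors the split between `contracts.embeddingProviders` and `contracts.memoryEmbeddingProviders` described above.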

docs/plugins/sdk-provider-plugins.md

Lines changed: 35 additions & 3 deletions
@@ -511,9 +511,9 @@ API key auth, and dynamic model resolution.
 <Step title="Add extra capabilities (optional)">
 ### Step 5: Add extra capabilities

-A provider plugin can register speech, realtime transcription, realtime
-voice, media understanding, image generation, video generation, web fetch,
-and web search alongside text inference. OpenClaw classifies this as a
+A provider plugin can register embeddings, speech, realtime transcription,
+realtime voice, media understanding, image generation, video generation,
+web fetch, and web search alongside text inference. OpenClaw classifies this as a
 **hybrid-capability** plugin - the recommended pattern for company plugins
 (one plugin per vendor). See
 [Internals: Capability Ownership](/plugins/architecture#capability-ownership-model).
@@ -655,6 +655,38 @@ API key auth, and dynamic model resolution.
 });
 ```
 </Tab>
+<Tab title="Embeddings">
+```typescript
+api.registerEmbeddingProvider({
+  id: "acme-ai",
+  defaultModel: "acme-embed",
+  transport: "remote",
+  authProviderId: "acme-ai",
+  create: async ({ model }) => ({
+    provider: {
+      id: "acme-ai",
+      model,
+      dimensions: 1536,
+      embed: async (input) => {
+        const text = typeof input === "string" ? input : input.text;
+        return fetchAcmeEmbedding(text);
+      },
+      embedBatch: async (inputs) =>
+        Promise.all(
+          inputs.map((input) =>
+            fetchAcmeEmbedding(typeof input === "string" ? input : input.text),
+          ),
+        ),
+    },
+  }),
+});
+```
+
+Declare the same id in `contracts.embeddingProviders`. This is the
+general embedding contract for reusable vector generation. Use
+`registerMemoryEmbeddingProvider(...)` only for memory-engine-specific
+adapters.
+</Tab>
 <Tab title="Image and video generation">
 Video capabilities use a **mode-aware** shape: `generate`,
 `imageToVideo`, and `videoToVideo`. Flat aggregate fields like
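
The `Embeddings` tab added in this hunk calls `fetchAcmeEmbedding`, which the docs leave undefined, and relies on SDK types that this page does not show. The sketch below is one way to exercise the same adapter shape offline. The types `EmbeddingInput`, `SketchEmbeddingProvider`, and `SketchEmbeddingAdapter` are local stand-ins inferred from the example, not the SDK's `EmbeddingProviderAdapter`, and the `fetchAcmeEmbedding` body is a deterministic placeholder rather than a real vendor call.

```typescript
// Local stand-in types inferred from the docs example; the real SDK types in
// `plugin-sdk/embedding-providers` may differ.
type EmbeddingInput = string | { text: string };

interface SketchEmbeddingProvider {
  id: string;
  model: string;
  dimensions: number;
  embed(input: EmbeddingInput): Promise<number[]>;
  embedBatch(inputs: EmbeddingInput[]): Promise<number[][]>;
}

interface SketchEmbeddingAdapter {
  id: string;
  defaultModel: string;
  transport: string;
  authProviderId: string;
  create(options: { model: string }): Promise<{ provider: SketchEmbeddingProvider }>;
}

// Small dimension count so the placeholder vectors are easy to inspect.
const DIMENSIONS = 8;

// Placeholder for the undefined `fetchAcmeEmbedding` helper: folds character
// codes into a fixed-size vector so the sketch runs without network access.
async function fetchAcmeEmbedding(text: string): Promise<number[]> {
  const vector: number[] = [];
  for (let i = 0; i < DIMENSIONS; i++) {
    vector.push(0);
  }
  for (let i = 0; i < text.length; i++) {
    const slot = i % DIMENSIONS;
    vector[slot] = (vector[slot] || 0) + text.charCodeAt(i) / 255;
  }
  return vector;
}

// Normalizes the two accepted input forms to plain text.
function toText(input: EmbeddingInput): string {
  return typeof input === "string" ? input : input.text;
}

const acmeAdapter: SketchEmbeddingAdapter = {
  id: "acme-ai",
  defaultModel: "acme-embed",
  transport: "remote",
  authProviderId: "acme-ai",
  create: async ({ model }) => ({
    provider: {
      id: "acme-ai",
      model,
      dimensions: DIMENSIONS,
      embed: (input: EmbeddingInput) => fetchAcmeEmbedding(toText(input)),
      embedBatch: (inputs: EmbeddingInput[]) =>
        Promise.all(inputs.map((input) => fetchAcmeEmbedding(toText(input)))),
    },
  }),
};
```

Pulling the input normalization into `toText` keeps `embed` and `embedBatch` consistent for both the string form and the `{ text }` form that the docs example handles inline.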

docs/plugins/sdk-subpaths.md

Lines changed: 1 addition & 0 deletions
@@ -176,6 +176,7 @@ focused channel/runtime subpaths, `config-contracts`, `string-coerce-runtime`,
 | `plugin-sdk/provider-web-search-config-contract` | Narrow web-search config/credential helpers for providers that do not need plugin-enable wiring |
 | `plugin-sdk/provider-web-search-contract` | Narrow web-search config/credential contract helpers such as `createWebSearchProviderContractFields`, `enablePluginInConfig`, `resolveProviderWebSearchPluginConfig`, and scoped credential setters/getters |
 | `plugin-sdk/provider-web-search` | Web-search provider registration/cache/runtime helpers |
+| `plugin-sdk/embedding-providers` | General embedding provider types and read helpers, including `EmbeddingProviderAdapter`, `getEmbeddingProvider(...)`, and `listEmbeddingProviders(...)`; plugins register providers through `api.registerEmbeddingProvider(...)` so manifest ownership is enforced |
 | `plugin-sdk/provider-tools` | `ProviderToolCompatFamily`, `buildProviderToolCompatFamilyHooks`, and DeepSeek/Gemini/OpenAI schema cleanup + diagnostics |
 | `plugin-sdk/provider-usage` | `fetchClaudeUsage` and similar |
 | `plugin-sdk/provider-stream` | `ProviderStreamFamily`, `buildProviderStreamFamilyHooks`, `composeProviderStreamWrappers`, stream wrapper types, and shared Anthropic/Bedrock/DeepSeek V4/Google/Kilocode/Moonshot/OpenAI/OpenRouter/Z.A.I/MiniMax/Copilot wrapper helpers |
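
The new `plugin-sdk/embedding-providers` row describes a subpath that exposes only types and read helpers, with registration routed through the plugin `api` so manifest ownership can be enforced. The sketch below illustrates that general pattern with an in-memory registry. It is not the SDK implementation: `createSketchRegistry`, `SketchAdapter`, and the `get`, `list`, and `apiFor` members are hypothetical names, and the real `getEmbeddingProvider(...)` and `listEmbeddingProviders(...)` signatures are not shown on this page.

```typescript
// Hypothetical adapter shape, reduced to the fields this sketch needs.
interface SketchAdapter {
  id: string;
  defaultModel: string;
}

interface SketchRegistry {
  // Read helpers: the only operations exposed to arbitrary callers.
  get(id: string): SketchAdapter | undefined;
  list(): SketchAdapter[];
  // Builds a per-plugin api bound to the ids that plugin's manifest declares.
  apiFor(declaredIds: string[]): {
    registerEmbeddingProvider(adapter: SketchAdapter): void;
  };
}

function createSketchRegistry(): SketchRegistry {
  // The backing array stays private to this closure, so callers cannot
  // mutate the registry except through a manifest-bound api.
  const adapters: SketchAdapter[] = [];

  return {
    get: (id: string) => adapters.filter((adapter) => adapter.id === id)[0],
    // Return a copy so callers cannot push into the private array.
    list: () => adapters.slice(),
    apiFor: (declaredIds: string[]) => ({
      registerEmbeddingProvider(adapter: SketchAdapter): void {
        if (declaredIds.indexOf(adapter.id) === -1) {
          throw new Error(
            'embedding provider "' + adapter.id +
              '" is not declared in contracts.embeddingProviders',
          );
        }
        if (adapters.some((existing) => existing.id === adapter.id)) {
          throw new Error('embedding provider "' + adapter.id + '" is already registered');
        }
        adapters.push(adapter);
      },
    }),
  };
}
```

Keeping the mutator behind `apiFor` is one way to read the commit note about keeping embedding provider registry mutators internal: consumers can look providers up, but only a manifest-bound api can add them.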

package.json

Lines changed: 4 additions & 0 deletions
@@ -408,6 +408,10 @@
       "types": "./dist/plugin-sdk/media-mime.d.ts",
       "default": "./dist/plugin-sdk/media-mime.js"
     },
+    "./plugin-sdk/embedding-providers": {
+      "types": "./dist/plugin-sdk/embedding-providers.d.ts",
+      "default": "./dist/plugin-sdk/embedding-providers.js"
+    },
     "./plugin-sdk/media-generation-runtime": {
       "types": "./dist/plugin-sdk/media-generation-runtime.d.ts",
       "default": "./dist/plugin-sdk/media-generation-runtime.js"

scripts/lib/plugin-sdk-entrypoints.json

Lines changed: 1 addition & 0 deletions
@@ -77,6 +77,7 @@
 "media-runtime",
 "media-store",
 "media-mime",
+"embedding-providers",
 "media-generation-runtime",
 "conversation-binding-runtime",
 "conversation-runtime",

src/auto-reply/reply/commands-diagnostics.test.ts

Lines changed: 1 addition & 0 deletions
@@ -117,6 +117,7 @@ function createBundledPluginRecord(id: string): PluginRecord {
 channelIds: [],
 cliBackendIds: [],
 providerIds: [],
+embeddingProviderIds: [],
 speechProviderIds: [],
 realtimeTranscriptionProviderIds: [],
 realtimeVoiceProviderIds: [],
