fix(memory): move local llama.cpp runtime to provider plugin #91324
Codex review: found issues before merge. Reviewed June 9, 2026, 1:53 AM ET / 05:53 UTC.
Summary
PR surface: Source +1945, Tests +238, Docs +89, Config +42, Other +774. Total +3088 across 39 files.
Reproducibility: yes. The linked issue and Crabbox smoke describe a package update where […]
Review metrics: 3 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Review details
Best possible solution: Merge the ownership change only after refreshing the public SDK API baseline, obtaining explicit dependency/security approval for the current head, and validating the provider package install/load path across supported native platforms.
Do we have a high-confidence way to reproduce the issue? Yes. The linked issue and Crabbox smoke describe a package update where […]
Is this the best way to solve the issue? Yes, the plugin-owned native dependency is the best ownership direction for this bug under the repo's plugin dependency policy. The current branch still needs API-baseline and security/release gates before it is the best mergeable form.
Overall correctness: patch is incorrect
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against e1978cf73cbd.
@clawsweeper re-review

🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:

Verification update for this draft:
Note: npm ClawSweeper re-review is queued/in progress after the PR body update.

Live issue #88705 smoke passed in Crabbox.
Provider/id: Azure Crabbox
What this proved:
This covers the reported issue state where local memory is configured but the OpenClaw root/global install no longer has […]

Dependency Guard
This PR changes dependency-related files. Maintainers should confirm these changes are intentional.
Changed files:
Maintainer follow-up:

Dependency graph change authorized
This PR includes dependency graph changes. A repository admin or member of […]
A later push changes the PR head SHA and requires a fresh security approval.

Ready for review. Tested live on Crabbox; it should solve the issue. Minor nitpicks may remain, like plugin/extension/folder naming ([…]

7978447 to 992319e

@clawsweeper re-review

🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Warning
Review the following alerts detected in dependencies. According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

@clawsweeper re-review

🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:

@clawsweeper re-review

🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:

/allow-dependencies-change

Merge note before squash merge:
Proceeding with direct pinned squash merge as requested.

Merged with direct pinned squash merge.

Opened on behalf of Onur Solmaz (osolmaz). This is AI-assisted and ready for maintainer review.
Summary
Users who configured local memory embeddings could lose `node-llama-cpp` during an OpenClaw npm update.
The root package no longer owns that native dependency, so installing it manually next to OpenClaw is not stable.
This change makes local GGUF embeddings an official `llama-cpp` provider plugin that owns `node-llama-cpp`, while `memory-core` keeps owning memory indexing and search.
Fixes #88705
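The ownership split in this summary can be sketched as a small registry: the host keeps a lookup by provider id, and the plugin registers a factory under `local`. This is only a sketch under assumptions. `EmbeddingProviderRegistry`, `registerLlamaCppPlugin`, and the stand-in `embed` function are hypothetical names for illustration; they are not taken from the OpenClaw codebase or from the `node-llama-cpp` API.

```typescript
// Hypothetical types for illustration; not the OpenClaw plugin SDK.
interface EmbeddingProvider {
  id: string;
  embed(texts: string[]): Promise<number[][]>;
}

type ProviderFactory = () => Promise<EmbeddingProvider>;

// Host side (the role memory-core plays): knows provider ids, not native deps.
class EmbeddingProviderRegistry {
  private readonly factories = new Map<string, ProviderFactory>();

  register(id: string, factory: ProviderFactory): void {
    this.factories.set(id, factory);
  }

  has(id: string): boolean {
    return this.factories.has(id);
  }

  async resolve(id: string): Promise<EmbeddingProvider> {
    const factory = this.factories.get(id);
    if (!factory) {
      throw new Error(`No embedding provider registered for "${id}"`);
    }
    return factory();
  }
}

// Plugin side (the role the llama-cpp provider plays): owns the native runtime.
// A real plugin would load its native dependency inside the factory; the
// stand-in below returns fixed zero vectors so the sketch runs anywhere.
function registerLlamaCppPlugin(registry: EmbeddingProviderRegistry): void {
  registry.register("local", async () => ({
    id: "local",
    embed: async (texts: string[]) => texts.map(() => [0, 0, 0]),
  }));
}
```

Because the factory runs only on `resolve`, a host built this way would not need the native package at all unless a `local` provider is actually requested.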
Maintainer Direction
Vincent Koc (vincentkoc) approved creating an extension/plugin for this on Discord. This PR follows that direction by adding the official `llama-cpp` provider plugin instead of putting `node-llama-cpp` back in the main OpenClaw package.
What Changed
Local embeddings still use `memorySearch.provider: "local"`.
The difference is that `local` is now provided by `@openclaw/llama-cpp-provider`, not by the main OpenClaw package.
Existing local-memory configs get a repair path through doctor, and the Gateway starts the provider plugin when local memory embeddings are configured.
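As a rough illustration of the startup rule just described, the following sketch maps a configured embedding provider to the plugin that should be started. The `GatewayConfig` shape, `PROVIDER_OWNERS`, and `startupPluginIds` are assumptions made for this example; only the `memorySearch.provider: "local"` setting and the `llama-cpp` plugin id come from this PR description.

```typescript
// Hypothetical config shape; only `memorySearch.provider` appears in the PR text.
interface GatewayConfig {
  memorySearch?: { provider?: string };
}

// Provider id -> owning plugin id. The "local" -> "llama-cpp" pair is from the PR.
const PROVIDER_OWNERS = new Map<string, string>([["local", "llama-cpp"]]);

// Returns the plugin ids that should be started for this config.
function startupPluginIds(config: GatewayConfig): string[] {
  const provider = config.memorySearch?.provider;
  if (provider === undefined) return [];
  const owner = PROVIDER_OWNERS.get(provider);
  return owner === undefined ? [] : [owner];
}
```

Calling `startupPluginIds({ memorySearch: { provider: "local" } })` yields a one-element list containing `llama-cpp`, while a config without local memory embeddings yields an empty list, so the plugin is not started unnecessarily.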
- […] `extensions/llama-cpp` with plugin metadata, `node-llama-cpp@3.18.1`, shrinkwrap, provider registration, tests, and docs.
- […] `local` embedding provider registration from `memory-core`.
- […] `node-llama-cpp` from the plugin package URL.
- […] `models.providers.local-gpu.api = "local"`, start the owning plugin.
- […] `memorySearch.provider: "local"` maps to the official `llama-cpp` plugin install.
Testing
I tested the provider boundary, startup selection, doctor repair, memory host worker import path, docs list, type surfaces, the full build, and a real package/update-shaped local embedding flow.
The first direct build and pnpm docs command hit pnpm fetch failures in the sandbox, so I reran the build with network approval and ran the docs list script directly.
- `node scripts/generate-npm-shrinkwrap.mjs --package-dir extensions/llama-cpp --check`
- `./node_modules/.bin/oxfmt --check --threads=1 .github/labeler.yml docs/concepts/memory-builtin.md docs/concepts/memory-qmd.md docs/concepts/memory-search.md docs/docs.json docs/plugins/plugin-inventory.md docs/plugins/reference.md docs/plugins/reference/memory-core.md docs/plugins/llama-cpp.md docs/plugins/reference/llama-cpp.md docs/reference/memory-config.md extensions/memory-core/openclaw.plugin.json extensions/memory-core/src/memory/embeddings.test.ts extensions/memory-core/src/memory/embeddings.ts extensions/memory-core/src/memory/index.test.ts extensions/memory-core/src/memory/provider-adapters.ts extensions/llama-cpp/index.ts extensions/llama-cpp/index.test.ts extensions/llama-cpp/openclaw.plugin.json extensions/llama-cpp/package.json extensions/llama-cpp/src/embedding-provider.ts package.json packages/memory-host-sdk/src/host/embeddings-worker.ts packages/memory-host-sdk/src/host/embeddings.test.ts packages/memory-host-sdk/src/host/embeddings.ts packages/memory-host-sdk/src/host/node-llama.ts scripts/lib/official-external-plugin-catalog.json src/commands/doctor-memory-search.test.ts src/commands/doctor-memory-search.ts src/commands/doctor/shared/missing-configured-plugin-install.test.ts src/commands/doctor/shared/missing-configured-plugin-install.ts src/gateway/server-startup-plugins.test.ts src/plugins/channel-plugin-ids.test.ts src/plugins/gateway-startup-plugin-ids.ts src/plugins/official-external-plugin-catalog.test.ts`
- `git diff --check`
- `node scripts/run-vitest.mjs extensions/llama-cpp/index.test.ts packages/memory-host-sdk/src/host/embeddings.test.ts extensions/memory-core/src/memory/embeddings.test.ts extensions/memory-core/src/memory/index.test.ts src/commands/doctor-memory-search.test.ts src/commands/doctor/shared/missing-configured-plugin-install.test.ts src/plugins/official-external-plugin-catalog.test.ts src/plugins/channel-plugin-ids.test.ts src/gateway/server-startup-plugins.test.ts`
- `node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core.tsbuildinfo`
- `node scripts/run-tsgo.mjs -p tsconfig.extensions.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/extensions.tsbuildinfo`
- `node scripts/docs-list.js`
- `node scripts/build-all.mjs`
- `node scripts/run-vitest.mjs src/plugins/contracts/extension-runtime-dependencies.contract.test.ts test/openclaw-npm-release-check.test.ts test/plugin-npm-package-manifest.test.ts`
- `codex review --base main`
Real Behavior Proof
Crabbox package/update proof:
- […] `cbx_840624445686`
- […] `run_b931f3d21df8`
- […] `v24.16.0`, npm `11.13.0`, pnpm `11.2.2`
- […] `node scripts/crabbox-wrapper.mjs run --provider azure --idle-timeout 90m --ttl 240m --timing-json --script .crabbox/scripts/llama-cpp-provider-proof.sh`
- […] `0`, total `6m33.536s`
What the proof did:
- […] `@openclaw/llama-cpp-provider` plugin with package-local runtime metadata and current shrinkwrap.
- […] `openclaw@latest` plus a separately installed global `node-llama-cpp@3.18.1`.
- […] `node-llama-cpp` in this environment, so the proof removed that root package explicitly before continuing. This keeps the post-update state equivalent to issue #88705 (Bug: npm updates drop node-llama-cpp, breaking local memory_search after every OpenClaw update): OpenClaw has `memorySearch.provider: "local"` but no root-level `node-llama-cpp`.
- […] `memory status --deep` reports the missing `@openclaw/llama-cpp-provider` before repair.
- […] `doctor --fix` against a local npm registry serving the PR provider package; doctor installed missing configured plugin `"llama-cpp"` from `@openclaw/llama-cpp-provider`.
- […] `node-llama-cpp` under OpenClaw plugin npm state, not under the root OpenClaw package.
- […] `openclaw memory status --deep`: provider `local`, model `hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf`, embeddings `ready`, vector store `ready`, semantic vectors `ready`.
- […] `openclaw memory index --force`.
- […] `openclaw memory search <marker> --json` and verified the result contained the marker through the installed llama.cpp provider.
Proof marker:
`OPENCLAW_LLAMA_CPP_PROVIDER_PROOF_OK`
Risks
The main risk is native package installation across platforms.
The code keeps the native dependency out of the main package and pins it in the provider plugin shrinkwrap, but release validation should still prove the new provider package installs and loads on supported platforms.
- […] `local` from a memory-core built-in to an official external provider plugin.
- […] `openclaw doctor --fix` or plugin install/update flow to install `@openclaw/llama-cpp-provider`.
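The repair path mentioned above (a configured `local` provider whose owning plugin is not installed) can be approximated with a lookup like the one below. This is a hedged sketch, not the doctor implementation: `PluginInstallTarget`, `OFFICIAL_PROVIDER_PLUGINS`, and `findMissingConfiguredPlugin` are hypothetical names, and only the plugin id `llama-cpp` and the package name `@openclaw/llama-cpp-provider` come from this PR description.

```typescript
// Hypothetical record describing what a repair step would need to install.
interface PluginInstallTarget {
  pluginId: string;
  packageName: string;
}

// Provider id -> official plugin that owns it (pair taken from the PR text).
const OFFICIAL_PROVIDER_PLUGINS = new Map<string, PluginInstallTarget>([
  ["local", { pluginId: "llama-cpp", packageName: "@openclaw/llama-cpp-provider" }],
]);

// Returns the plugin to install when the configured provider's owner is missing,
// or undefined when nothing needs repair.
function findMissingConfiguredPlugin(
  configuredProvider: string | undefined,
  installedPluginIds: ReadonlySet<string>,
): PluginInstallTarget | undefined {
  if (configuredProvider === undefined) return undefined;
  const target = OFFICIAL_PROVIDER_PLUGINS.get(configuredProvider);
  if (target === undefined) return undefined;
  return installedPluginIds.has(target.pluginId) ? undefined : target;
}
```

With an empty installed set and provider `local`, the function returns the `llama-cpp` target; once `llama-cpp` is in the installed set it returns `undefined`, which matches the before-repair and after-repair states the proof describes.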