Problem
Installing openclaw globally pulls in node-llama-cpp (3.15.1) as a hard dependency, which installs ~670MB of pre-compiled binaries for every supported platform and GPU backend:
| Directory | Size | Purpose |
|---|---|---|
| linux-x64-cuda-ext/ | 432MB | CUDA extended GPU binaries |
| linux-x64-cuda/ | 144MB | CUDA GPU binaries |
| linux-x64-vulkan/ | 73MB | Vulkan GPU binaries |
| linux-x64/ | 19MB | CPU-only binaries |
| linux-armv7l/ | 5.6MB | ARM 32-bit |
| linux-arm64/ | 4.9MB | ARM 64-bit |
This is especially wasteful in environments like GitHub Codespaces or CI/CD where:
- No GPU is available (CUDA/Vulkan binaries are useless)
- Only one platform is needed (ARM binaries on x64 are unused)
- Local LLM embedding (memory search) is not used — cloud APIs handle inference
For our team, this means 670MB of dead weight per codespace that we currently work around by deleting the binaries after install in our Dockerfile.
Proposal
Make node-llama-cpp an optional dependency (or move it to a plugin/extension) so users who don't need local LLM embedding can skip it entirely.
Suggested approaches (pick any):
- Move node-llama-cpp from dependencies to optionalDependencies in package.json
- Extract local embedding into a separate package (e.g., @openclaw/memory-local)
- Use a lazy-install pattern — download llama binaries on first use, not at install time
Current workaround
# After npm install -g openclaw@latest:
RUN rm -rf /usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-x64-cuda-ext \
/usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-x64-cuda \
/usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-x64-vulkan \
/usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-arm*
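
A more general form of this workaround, as a rough sketch only: instead of listing each directory to delete, keep one directory and remove the rest. `prune_llama_binaries` is a hypothetical helper name, not something openclaw or node-llama-cpp provides, and it assumes the `@node-llama-cpp/` scope directory contains nothing but per-platform binary packages. Unlike the command above, which removes only the directories it lists, this removes everything in the scope except the directory you name.

```shell
#!/bin/sh
# Hypothetical helper (not part of openclaw or node-llama-cpp):
# delete every directory under the @node-llama-cpp scope except the one to keep.
# Assumes the scope directory holds only per-platform binary packages.
prune_llama_binaries() {
    scope_dir=$1   # e.g. .../openclaw/node_modules/@node-llama-cpp
    keep=$2        # e.g. linux-x64 (the CPU-only binaries)

    # Nothing to do if the scope directory is absent.
    [ -d "$scope_dir" ] || return 0

    for dir in "$scope_dir"/*; do
        # Skip plain files and the unexpanded glob when the directory is empty.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        if [ "$name" != "$keep" ]; then
            rm -rf "$dir"
        fi
    done
}

# Example call, using the install path from the workaround above:
# prune_llama_binaries \
#   /usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp \
#   linux-x64
```

Keeping a single named directory means the script does not need updating if a future node-llama-cpp release adds another backend directory, at the cost of also deleting anything unexpected placed in that scope.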
Environment
- openclaw@2026.2.15
- node-llama-cpp@3.15.1
- GitHub Codespaces (linux-x64, no GPU)
- 32GB disk — 670MB is ~2% of total disk