Make node-llama-cpp an optional dependency to reduce install size by ~670MB

## Problem

Installing `openclaw` globally pulls in `node-llama-cpp` (3.15.1) as a hard dependency, which installs **~670MB** of pre-compiled binaries for every supported platform and GPU backend:

| Directory | Size | Purpose |
|-----------|------|---------|
| `linux-x64-cuda-ext/` | **432MB** | CUDA extended GPU binaries |
| `linux-x64-cuda/` | 144MB | CUDA GPU binaries |
| `linux-x64-vulkan/` | 73MB | Vulkan GPU binaries |
| `linux-x64/` | 19MB | CPU-only binaries |
| `linux-armv7l/` | 5.6MB | ARM 32-bit |
| `linux-arm64/` | 4.9MB | ARM 64-bit |

This is especially wasteful in environments like **GitHub Codespaces** or **CI/CD** where:
- No GPU is available (CUDA/Vulkan binaries are useless)
- Only one platform is needed (ARM binaries on x64 are unused)
- Local LLM embedding (memory search) is not used, because cloud APIs handle inference

For our team, this means **~670MB of dead weight per codespace**. We currently work around it by deleting the unused binaries after install in our Dockerfile.

## Proposal

Make `node-llama-cpp` an **optional dependency** (or move it to a plugin/extension) so users who don't need local LLM embedding can skip it entirely.

Suggested approaches (pick any):
1. Move `node-llama-cpp` from `dependencies` to `optionalDependencies` in package.json
2. Extract local embedding into a separate package (e.g., `@openclaw/memory-local`)
3. Use a lazy-install pattern: download llama binaries on first use, not at install time
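
For approach 1, the calling code would need to tolerate the package being absent, since npm does not fail the install when an entry in `optionalDependencies` is skipped. Below is a minimal sketch of how that could look, not openclaw's actual code: `loadOptional` and its `loader` parameter are hypothetical names introduced here for illustration. It assumes Node.js reports a missing package with the error code `ERR_MODULE_NOT_FOUND` (ESM) or `MODULE_NOT_FOUND` (CommonJS).

```typescript
// Hypothetical helper, not part of openclaw: try to load an optional
// dependency and return null when the package is simply not installed.
type Loader<T> = () => Promise<T>;

async function loadOptional<T>(
  name: string,
  // The loader is injectable so the fallback path can be exercised
  // without actually uninstalling the package.
  loader: Loader<T> = () => import(name) as Promise<T>,
): Promise<T | null> {
  try {
    return await loader();
  } catch (err) {
    const code = (err as { code?: string }).code;
    // Only swallow "module not found"; any other failure (for example a
    // broken native binary) is rethrown so it stays visible.
    if (code === "ERR_MODULE_NOT_FOUND" || code === "MODULE_NOT_FOUND") {
      return null;
    }
    throw err;
  }
}
```

A caller could then run `loadOptional("node-llama-cpp")` once at startup and, on `null`, disable local embedding with a message telling the user how to install the package. The same check could also serve as the trigger point for the lazy-install idea in approach 3.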

## Current workaround

```dockerfile
# After npm install -g openclaw@latest:
RUN rm -rf /usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-x64-cuda-ext \
           /usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-x64-cuda \
           /usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-x64-vulkan \
           /usr/local/share/npm-global/lib/node_modules/openclaw/node_modules/@node-llama-cpp/linux-arm*
```

## Environment

- `openclaw@2026.2.15`
- `node-llama-cpp@3.15.1`
- GitHub Codespaces (linux-x64, no GPU)
- 32GB disk (670MB is ~2% of total disk)
