🎯 Feature Goal
Currently, openclaw-portable runs OpenClaw entirely from a USB drive — Node.js, config, workspace — but still requires an external API key or a separately installed Ollama to actually talk to any AI model. This is the last missing piece for a true zero-dependency offline experience.
This issue proposes bundling a small, CPU-only local model directly inside the portable package, launching it as a sidecar process alongside the OpenClaw gateway, and auto-configuring it as the default model — so users plug in the USB and AI just works, with no internet, no API key, no pre-installed software.
🏗️ Architecture Analysis (based on current repo structure)
After reviewing the codebase, the integration fits cleanly into the existing launch flow:
start.sh / start.bat
│
├─ [2/5] 设置环境 (Node, OpenClaw binary check)
├─ [NEW] [3/5] Launch bundled llama-server sidecar ← INSERT HERE
├─ [4/5] 初始化工作目录
└─ [5/5] openclaw gateway start
The install-models.js + deepMerge config mechanism already provides the exact hook needed to inject the bundled model provider config into openclaw.json — no architectural changes required.
📦 Recommended Model: Qwen2.5-1.5B-Instruct Q4_K_M
| Attribute |
Value |
| Disk size |
~900 MB |
| RAM usage |
~1.2 GB |
| Inference engine |
llama.cpp llama-server (static binary, no install) |
| Tool calling |
✅ Native support |
| Context window |
32k tokens |
| CPU speed (4-core) |
~8–12 tok/s |
| License |
Apache 2.0 ✅ redistributable |
This is the smallest model with reliable tool-calling support, which OpenClaw's agent runtime requires. Smaller models (0.5B) lack consistent JSON function-call formatting.
📁 Proposed Directory Layout
Add the following structure inside the portable package (alongside existing node/, config/, data/):
openclaw-portable/
├── node/ # existing
├── npm-global/ # existing
├── config/ # existing
├── data/ # existing
├── llm/ # NEW
│ ├── bin/
│ │ ├── llama-server-linux-x86_64 # ~8MB static binary
│ │ ├── llama-server-macos-arm64 # ~9MB
│ │ ├── llama-server-macos-x86_64 # ~9MB
│ │ └── llama-server-win32-avx2.exe # ~10MB
│ ├── models/
│ │ └── qwen2.5-1.5b-instruct-q4_k_m.gguf # ~900MB
│ └── server.log # runtime, gitignored
├── start.sh # MODIFIED
├── start.bat # MODIFIED
├── stop.sh # MODIFIED
└── stop.bat # MODIFIED
The llm/models/ directory should be listed in .gitignore and distributed via GitHub Releases as a separate download or via the build script.
🔧 Implementation Details
1. start.sh — Add sidecar launch step (between step 2 and current step 3)
# ============================================
# [NEW] 3/6 启动内置本地模型 (llama-server)
# ============================================
echo -e "${BLUE}[3/6] 启动内置本地模型...${NC}"
LLM_DIR="$USB_PATH/llm"
LLM_PORT=18080
LLM_PID_FILE="$LLM_DIR/server.pid"
LLM_LOG="$LLM_DIR/server.log"
# Detect platform binary
OS_TYPE="$(uname -s)"
ARCH_TYPE="$(uname -m)"
case "$OS_TYPE" in
Linux*) LLM_BIN="$LLM_DIR/bin/llama-server-linux-x86_64" ;;
Darwin*)
if [ "$ARCH_TYPE" = "arm64" ]; then
LLM_BIN="$LLM_DIR/bin/llama-server-macos-arm64"
else
LLM_BIN="$LLM_DIR/bin/llama-server-macos-x86_64"
fi ;;
*) LLM_BIN="" ;;
esac
LLM_MODEL="$LLM_DIR/models/qwen2.5-1.5b-instruct-q4_k_m.gguf"
LLM_BUNDLED_READY=0
if [ -x "$LLM_BIN" ] && [ -f "$LLM_MODEL" ]; then
# Check if already running on port
if ! lsof -i :$LLM_PORT -sTCP:LISTEN -t &>/dev/null 2>&1; then
THREADS=$(( $(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo 2) - 1 ))
THREADS=$(( THREADS < 1 ? 1 : THREADS ))
chmod +x "$LLM_BIN"
nohup "$LLM_BIN" \
--model "$LLM_MODEL" \
--port $LLM_PORT \
--host 127.0.0.1 \
--ctx-size 32768 \
--threads $THREADS \
--parallel 1 \
-ngl 0 \
--log-disable \
>> "$LLM_LOG" 2>&1 &
echo $! > "$LLM_PID_FILE"
echo -e "${GREEN}✅ llama-server 已启动 (PID: $!, port: $LLM_PORT, threads: $THREADS)${NC}"
echo -e "${YELLOW} 模型加载中,首次响应约需 5-15 秒...${NC}"
LLM_BUNDLED_READY=1
else
echo -e "${GREEN}✅ 内置模型已在运行 (port: $LLM_PORT)${NC}"
LLM_BUNDLED_READY=1
fi
else
echo -e "${YELLOW}⚠️ 内置模型未找到,跳过 (仍可使用云端 API)${NC}"
fi
2. Inject model config into openclaw.json before gateway starts
Add this block after the config file is initialized but before openclaw gateway start:
# Auto-inject bundled model config if model is available and user has no primary model set
if [ $LLM_BUNDLED_READY -eq 1 ]; then
BUNDLED_MODEL_CONFIG=$(cat <<'JSONEOF'
{
"models": {
"mode": "merge",
"providers": {
"bundled-local": {
"baseUrl": "http://127.0.0.1:18080/v1",
"apiKey": "bundled-no-key",
"api": "openai-completions",
"models": [
{
"id": "qwen2.5-1.5b",
"name": "Qwen2.5 1.5B (Bundled CPU)",
"contextWindow": 32768,
"maxTokens": 4096,
"cost": { "input": 0, "output": 0 }
}
]
}
}
}
}
JSONEOF
)
# Use existing install-models.js merge mechanism
echo "$BUNDLED_MODEL_CONFIG" > "$TEMP_DIR/bundled-model-inject.json"
# Only set as default if user has NOT configured a primary model
HAS_PRIMARY=$(node -e "
try {
const cfg = JSON.parse(require('fs').readFileSync('$TEMP_DIR/openclaw.json','utf8'));
console.log(cfg?.agents?.defaults?.model?.primary ? 'yes' : 'no');
} catch(e) { console.log('no'); }
" 2>/dev/null || echo "no")
if [ "$HAS_PRIMARY" = "no" ]; then
# Merge and set as default
node -e "
const fs = require('fs');
const cfgPath = '$TEMP_DIR/openclaw.json';
const cfg = fs.existsSync(cfgPath) ? JSON.parse(fs.readFileSync(cfgPath,'utf8')) : {};
const inject = JSON.parse(fs.readFileSync('$TEMP_DIR/bundled-model-inject.json','utf8'));
// Deep merge providers
cfg.models = cfg.models || {};
cfg.models.providers = Object.assign({}, cfg.models.providers, inject.models.providers);
cfg.models.mode = 'merge';
// Set as default only if no primary configured
cfg.agents = cfg.agents || {};
cfg.agents.defaults = cfg.agents.defaults || {};
cfg.agents.defaults.model = cfg.agents.defaults.model || {};
cfg.agents.defaults.model.primary = 'bundled-local/qwen2.5-1.5b';
fs.writeFileSync(cfgPath, JSON.stringify(cfg, null, 2));
console.log('✅ 内置模型已设为默认模型');
" 2>/dev/null && echo -e "${GREEN} bundled-local/qwen2.5-1.5b 已注册为默认模型${NC}"
rm -f "$TEMP_DIR/bundled-model-inject.json"
else
echo -e "${CYAN} 检测到已配置主模型,内置模型作为备用 (fallback)${NC}"
fi
fi
3. stop.sh — Clean up llama-server on shutdown
# Kill bundled llama-server
LLM_PID_FILE="$USB_PATH/llm/server.pid"
if [ -f "$LLM_PID_FILE" ]; then
LLM_PID=$(cat "$LLM_PID_FILE")
if kill -0 "$LLM_PID" 2>/dev/null; then
kill "$LLM_PID"
echo -e "${GREEN}✅ 内置模型已停止 (PID: $LLM_PID)${NC}"
fi
rm -f "$LLM_PID_FILE"
fi
4. start.bat — Windows equivalent (PowerShell snippet)
REM === Start bundled llama-server ===
SET LLM_BIN=%USB_PATH%\llm\bin\llama-server-win32-avx2.exe
SET LLM_MODEL=%USB_PATH%\llm\models\qwen2.5-1.5b-instruct-q4_k_m.gguf
SET LLM_PORT=18080
IF EXIST "%LLM_BIN%" IF EXIST "%LLM_MODEL%" (
netstat -ano | findstr :%LLM_PORT% >nul 2>&1
IF ERRORLEVEL 1 (
FOR /F "tokens=1" %%i IN ('wmic cpu get NumberOfLogicalProcessors /value ^| find "="') DO SET /A THREADS=%%i-1
IF %THREADS% LSS 1 SET THREADS=1
START /B "" "%LLM_BIN%" --model "%LLM_MODEL%" --port %LLM_PORT% --host 127.0.0.1 --ctx-size 32768 --threads %THREADS% --parallel 1 -ngl 0 >> "%USB_PATH%\llm\server.log" 2>&1
ECHO [OK] llama-server started on port %LLM_PORT%
SET LLM_BUNDLED_READY=1
) ELSE (
ECHO [OK] Bundled model already running
SET LLM_BUNDLED_READY=1
)
) ELSE (
ECHO [WARN] Bundled model not found, skipping
SET LLM_BUNDLED_READY=0
)
📥 Model Distribution Strategy
The ~900MB GGUF file should NOT be committed to git. Recommended approach:
Option A (recommended): GitHub Releases attachment
# In build-offline-package.sh, add:
download_bundled_model() {
MODEL_URL="https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct-GGUF/resolve/main/qwen2.5-1.5b-instruct-q4_k_m.gguf"
MODEL_DIR="$SCRIPT_DIR/llm/models"
mkdir -p "$MODEL_DIR"
if [ ! -f "$MODEL_DIR/qwen2.5-1.5b-instruct-q4_k_m.gguf" ]; then
echo "Downloading bundled model (~900MB)..."
curl -L --progress-bar -o "$MODEL_DIR/qwen2.5-1.5b-instruct-q4_k_m.gguf" "$MODEL_URL"
fi
}
Option B: First-run auto-download
Add a setup-llm.sh script that users run once to download the model into llm/models/. The start.sh gracefully degrades if the model is absent.
.gitignore additions:
llm/models/*.gguf
llm/bin/llama-server*
llm/server.log
llm/server.pid
⚠️ Known Risks & Mitigations
| Risk |
Mitigation |
| Cold-start latency (5-15s model load) |
Show explicit "加载中" message; start llama-server before gateway |
| Port 18080 conflict |
Check with lsof/netstat before starting; fall back to 18081 |
| AVX2 not supported (old CPUs) |
Detect with grep avx2 /proc/cpuinfo; warn user and skip |
| USB read speed bottleneck |
Copy model to $TEMP_DIR on first run if USB speed < 50MB/s |
| Context overflow (32k) |
Enable OpenClaw compaction in injected config: "compaction": { "enabled": true } |
| User already has API keys |
Respect existing agents.defaults.model.primary; register bundled as fallback only |
🪜 Suggested Implementation Phases
💡 Why This Matters
This feature would make openclaw-portable the first USB-bootable AI assistant that:
- Requires zero internet after initial setup
- Requires zero pre-installed software (no Ollama, no Python, no Docker)
- Has zero API cost for basic usage
- Works on any x86-64/ARM64 machine, just plug and run
The existing infrastructure (install-models.js merge logic, start.sh modular step structure, data/.openclaw/ config path) makes this integration very clean — this is essentially adding one new service to an already well-designed process manager.
Interested in contributing a draft PR for Phase 1 if the maintainer approves the direction.
🎯 Feature Goal
Currently,
openclaw-portableruns OpenClaw entirely from a USB drive — Node.js, config, workspace — but still requires an external API key or a separately installed Ollama to actually talk to any AI model. This is the last missing piece for a true zero-dependency offline experience.This issue proposes bundling a small, CPU-only local model directly inside the portable package, launching it as a sidecar process alongside the OpenClaw gateway, and auto-configuring it as the default model — so users plug in the USB and AI just works, with no internet, no API key, no pre-installed software.
🏗️ Architecture Analysis (based on current repo structure)
After reviewing the codebase, the integration fits cleanly into the existing launch flow:
The
install-models.js+deepMergeconfig mechanism already provides the exact hook needed to inject the bundled model provider config intoopenclaw.json— no architectural changes required.📦 Recommended Model:
Qwen2.5-1.5B-Instruct Q4_K_Mllama-server(static binary, no install)This is the smallest model with reliable tool-calling support, which OpenClaw's agent runtime requires. Smaller models (0.5B) lack consistent JSON function-call formatting.
📁 Proposed Directory Layout
Add the following structure inside the portable package (alongside existing
node/,config/,data/):The
llm/models/directory should be listed in.gitignoreand distributed via GitHub Releases as a separate download or via the build script.🔧 Implementation Details
1.
start.sh— Add sidecar launch step (between step 2 and current step 3)2. Inject model config into
openclaw.jsonbefore gateway startsAdd this block after the config file is initialized but before
openclaw gateway start:3.
stop.sh— Clean up llama-server on shutdown4.
start.bat— Windows equivalent (PowerShell snippet)📥 Model Distribution Strategy
The ~900MB GGUF file should NOT be committed to git. Recommended approach:
Option A (recommended): GitHub Releases attachment
Option B: First-run auto-download
Add a
setup-llm.shscript that users run once to download the model intollm/models/. Thestart.shgracefully degrades if the model is absent..gitignoreadditions:lsof/netstatbefore starting; fall back to 18081grep avx2 /proc/cpuinfo; warn user and skip$TEMP_DIRon first run if USB speed < 50MB/s"compaction": { "enabled": true }agents.defaults.model.primary; register bundled as fallback only🪜 Suggested Implementation Phases
start.sh/stop.sh: Add llama-server sidecar lifecycle (no model bundled yet, just the plumbing)build-offline-package.sh: Adddownload_bundled_model()step; download llama binaries for all 4 platformsbundled-localprovider and set as default model inopenclaw.jsonstart.bat/stop.bat: Windows paritysetup-llm.sh: Standalone first-time model download helperREADME.mdandOFFLINE-GUIDE.mdwith bundled model section💡 Why This Matters
This feature would make
openclaw-portablethe first USB-bootable AI assistant that:The existing infrastructure (
install-models.jsmerge logic,start.shmodular step structure,data/.openclaw/config path) makes this integration very clean — this is essentially adding one new service to an already well-designed process manager.Interested in contributing a draft PR for Phase 1 if the maintainer approves the direction.