Description
Choosing Option 8 "Model Router (experimental)" in nemoclaw onboard crashes at step [4/8] Setting up inference provider because the host-side Model Router venv setup unconditionally picks the highest-version python3.X on PATH and runs python3.X -m ensurepip --upgrade --default-pip. If that python is broken at the stdlib level (e.g. Homebrew python@3.14 whose pyexpat extension links a libexpat symbol the system libexpat does not export), the bootstrap fails and onboarding aborts before any sandbox is built.
Three NemoClaw-side issues compound the host environment problem:
- No health probe: NemoClaw does not run a smoke test on the candidate interpreter (e.g. python3.X -c "import pyexpat, ensurepip, ssl") before adopting it.
- No fallback: even though a healthy python3.11 was present on the same PATH, NemoClaw never tried it.
- Error message hides the real cause: the user only sees "Failed to create Model Router virtual environment" with a one-line ensurepip exit-status reference; nothing points at "your host python is broken, here is the import error". A new user will reasonably blame NemoClaw.
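A minimal sketch of the missing health probe, in Python. The names probe_interpreter and ProbeResult are hypothetical and are not taken from the NemoClaw source; only the default import list comes from this report.

```python
import subprocess
import sys
from typing import NamedTuple, Sequence


class ProbeResult(NamedTuple):
    ok: bool
    detail: str  # empty on success, otherwise stderr or the launch error


def probe_interpreter(
    python: str,
    modules: Sequence[str] = ("pyexpat", "ensurepip", "ssl"),
    timeout: float = 30.0,
) -> ProbeResult:
    """Run `python -c "import ..."` and report whether the imports succeed.

    Hypothetical helper: the name and signature are assumptions for
    illustration, not NemoClaw's actual API.
    """
    code = "import " + ", ".join(modules)
    try:
        proc = subprocess.run(
            [python, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Interpreter missing, not executable, or hung.
        return ProbeResult(False, f"could not run {python}: {exc}")
    if proc.returncode != 0:
        # Keep the interpreter's own stderr so the real import error survives.
        return ProbeResult(False, proc.stderr.strip())
    return ProbeResult(True, "")


if __name__ == "__main__":
    print(probe_interpreter(sys.executable))
```

On the host in this report, such a probe would presumably pass for /opt/homebrew/bin/python3.11 and fail for /opt/homebrew/bin/python3.14, with the dlopen error preserved in detail.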
Environment
Device: MacBook (M4, Apple Silicon)
OS: macOS 26.1 (Darwin 25.1.0)
Architecture: arm64
Node.js: v23.10.0
npm: 11.3.0
Docker: 27.4.0 (Colima context)
OpenShell CLI: 0.0.39
NemoClaw: v0.0.44
OpenClaw: 2026.4.24 (cbcfdf6) (sandbox build was never reached)
Pythons on PATH:
/opt/homebrew/bin/python3.11 — healthy, pyexpat OK, ensurepip OK
/opt/homebrew/bin/python3.14 — Homebrew python@3.14 3.14.5, pyexpat broken
Steps to Reproduce
- Start on a macOS host where python3.14 exists on PATH but its stdlib is broken (as with current Homebrew python@3.14 when the system libexpat is older than the libexpat its pyexpat was built against). To force this state quickly:
  brew install python@3.14
  /opt/homebrew/bin/python3.14 -c "import pyexpat"
  # If this raises "Symbol not found: _XML_SetAllocTrackerActivationThreshold", the host reproduces the bug.
- Run nemoclaw onboard.
- At the Choose [1]: prompt, type 8 to select Model Router (experimental).
- Pick the default sandbox name and confirm with Y at the Review step.
- Onboard advances to [4/8] Setting up inference provider → Starting model router... → Initializing Model Router source... → Preparing Model Router environment: /Users/<you>/.nemoclaw/model-router-venv.
- ensurepip exits non-zero and onboard aborts.
Expected Result
- NemoClaw smoke-tests each candidate python interpreter (at minimum import ensurepip, pyexpat, ssl) before adopting it for venv creation.
- If the highest-version python fails the probe, NemoClaw falls back to the next-highest healthy python (python3.11 in this case is right there on PATH).
- If no candidate is healthy, the error surfaced to the user names the actual failing import (pyexpat dlopen error, missing symbol) and points at the broken host python — not just "Failed to create Model Router virtual environment".
- Pin a known-supported python version range in docs (e.g. 3.11–3.13) so users know what to install.
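The selection behaviour described above could look roughly like the following Python sketch. find_candidates and select_python are hypothetical names, and the probe is passed in as a callable that returns None for a healthy interpreter or an error string otherwise; none of this is taken from the NemoClaw source.

```python
import os
import re
from typing import Callable, Dict, Iterable, List, Optional, Tuple

_VERSIONED = re.compile(r"^python3\.(\d+)$")


def find_candidates(path_dirs: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (minor_version, path) for each python3.X on PATH, newest first."""
    found: Dict[int, str] = {}
    for directory in path_dirs:
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # missing or unreadable PATH entry
        for name in names:
            match = _VERSIONED.match(name)
            full = os.path.join(directory, name)
            if match and os.path.isfile(full) and os.access(full, os.X_OK):
                # The first PATH entry wins for a given minor version.
                found.setdefault(int(match.group(1)), full)
    return sorted(found.items(), reverse=True)


def select_python(
    candidates: Iterable[Tuple[int, str]],
    probe: Callable[[str], Optional[str]],
) -> str:
    """Return the newest candidate that passes the probe.

    Raises RuntimeError naming every failed interpreter and its error
    when no candidate is healthy.
    """
    failures: List[str] = []
    for _minor, path in candidates:
        error = probe(path)
        if error is None:
            return path
        failures.append(f"{path}: {error}")
    details = "\n  ".join(failures) or "no python3.X found on PATH"
    raise RuntimeError(
        "No healthy python3.X found for the Model Router venv:\n  " + details
    )
```

A caller would pass find_candidates(os.environ["PATH"].split(os.pathsep)) as the candidates. A supported-version filter such as 3.11 to 3.13 could be applied to that list before probing, if the project decides to pin a range.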
Actual Result
Onboard output (Option 8) before the crash:
Inference options:
8) Model Router (experimental)
Choose [1]: 8
✓ Using Model Router: nvidia-router / nvidia-routed
Sandbox name (...) [my-assistant]: lynntest
Review configuration
Provider: nvidia-router
Model: nvidia-routed
Apply this configuration? [Y/n]: Y
[4/8] Setting up inference provider
✓ Active gateway set to 'nemoclaw'
Starting model router...
Initializing Model Router source...
Submodule path 'nemoclaw-blueprint/router/llm-router': checked out '2bd8dfaa751efb60aa4e7e49b270490dfbc0a68a'
Cloning into '/Users/lynnh/.nemoclaw/source/nemoclaw-blueprint/router/llm-router'...
Preparing Model Router environment: /Users/lynnh/.nemoclaw/model-router-venv
Error: Command '['/Users/lynnh/.nemoclaw/model-router-venv/bin/python3.14', '-m', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 1.
✗ Failed to start model router: Failed to create Model Router virtual environment.
Onboard exits non-zero. No sandbox is created. The user cannot use Model Router at all on this host until they manually fix Homebrew python@3.14 — even though python3.11 is right there and would work.
Logs
Running the same ensurepip command by hand reveals the real root cause:
$ /opt/homebrew/bin/python3.14 -m ensurepip --upgrade --default-pip --verbose
...
File "/opt/homebrew/Cellar/python@3.14/3.14.5/Frameworks/Python.framework/Versions/3.14/lib/python3.14/xml/parsers/expat.py", line 4, in <module>
from pyexpat import *
ImportError: dlopen(/opt/homebrew/Cellar/python@3.14/3.14.5/Frameworks/Python.framework/Versions/3.14/lib/python3.14/lib-dynload/pyexpat.cpython-314-darwin.so, 0x0002):
Symbol not found: _XML_SetAllocTrackerActivationThreshold
Referenced from: pyexpat.cpython-314-darwin.so
Expected in: /usr/lib/libexpat.1.dylib
Smoke-testing the two pythons NemoClaw could have picked:
$ /opt/homebrew/bin/python3.11 -c "import pyexpat; print('OK', pyexpat.version_info)"
OK (2, 7, 1)
$ /opt/homebrew/bin/python3.14 -c "import pyexpat"
ImportError: dlopen(...pyexpat.cpython-314-darwin.so): Symbol not found: _XML_SetAllocTrackerActivationThreshold
NemoClaw selected python3.14 anyway.
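To illustrate the kind of error message requested in Expected Result, here is a small Python sketch that turns probe stderr like the log above into a user-facing explanation. explain_probe_failure is a hypothetical helper, and the wording of its message is only a suggestion.

```python
import re

_EXCEPTION_LINE = re.compile(r"[A-Za-z_.]*(?:Error|Exception):")
_SYMBOL = re.compile(r"Symbol not found: (\S+)")
_EXPECTED_IN = re.compile(r"Expected in: (\S+)")


def explain_probe_failure(python: str, stderr: str) -> str:
    """Summarize why a host interpreter failed its import smoke test.

    Hypothetical helper: it only pattern-matches the dlopen error text
    shown in this report and falls back to the last stderr line.
    """
    lines = [ln.strip() for ln in stderr.splitlines() if ln.strip()]
    # Prefer the first line that looks like "SomeError: ...".
    error_line = next(
        (ln for ln in lines if _EXCEPTION_LINE.match(ln)),
        lines[-1] if lines else "no output",
    )
    message = [
        f"Host interpreter {python} failed its health check and looks broken.",
        f"  {error_line}",
    ]
    symbol = _SYMBOL.search(stderr)
    if symbol:
        message.append(f"  Missing symbol: {symbol.group(1)}")
    expected = _EXPECTED_IN.search(stderr)
    if expected:
        message.append(f"  Expected in: {expected.group(1)}")
    return "\n".join(message)
```

Fed the python3.14 stderr from this section, the result names the interpreter path, the ImportError line, the missing _XML_SetAllocTrackerActivationThreshold symbol, and /usr/lib/libexpat.1.dylib, which is the information the current "Failed to create Model Router virtual environment" message omits.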
Related Bugs
Adjacent Model Router bugs (different stages, not duplicates of this):
- NVB#6180064: Model Router accepts nvapi- key but LiteLLM proxy rejects it (post-onboard config).
- NVB#6158321 — Model Router HTTP 503 after successful onboard (post-onboard runtime).
- NVB#6158324 — "Model Router API key:" prompt didn't document where to get the key (closed/fixed).
This one fails earlier than all three — venv setup, before any sandbox is built.
NVB#6189271