[Brev][Onboard] Model Router (Provider Routed) inference broken — TUI returns HTTP 503 after successful onboard

## Description

<pre>After completing `nemoclaw onboard` with the new "Model Router (experimental)" inference option, the OpenClaw TUI returns HTTP 503 "inference service unavailable" on every prompt. Onboard reports SUCCESS, but the generated `~/.nemoclaw/state/litellm-proxy.yaml` (and the upstream `nemoclaw-blueprint/router/pool-config.yaml` it derives from) contains three independent config errors that together prevent any request from reaching upstream NVIDIA inference. This blocks the Provider Routed code path entirely.
</pre>Environment
<pre>Device: Brev (shadeform brev-pz811qnfg) — H100 PCIe x1
OS: Ubuntu 22.04.5 LTS (kernel 6.8.0-90-generic)
Architecture: x86_64
Node.js: v22.22.2
npm: 10.9.7
Docker: 29.1.3 (build f52814d)
OpenShell CLI: 0.0.36
NemoClaw: v0.0.37
OpenClaw: 2026.4.24 (cbcfdf6)
</pre>Steps to Reproduce
<pre>1. curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
2. Type 'yes' to accept the license/notice.
3. At [3/8] Configuring inference, choose option 8 (Model Router (experimental)).
4. At "Model Router API key:" prompt, enter a valid NVIDIA API key (nvapi-...).
5. Sandbox name: route. Accept review. Skip messaging. Accept Balanced policy presets.
6. Wait for "Installation complete" — onboard reports all 8 steps SUCCESS.
7. nemoclaw route connect
8. sandbox@route$ openclaw tui
9. Send any prompt (e.g. "ping").
</pre>Expected Result
<pre>TUI returns a model response routed via the configured Model Router (NVIDIA Nemotron 3 Nano or Super, depending on the prefill router's decision).
</pre>Actual Result
<pre>TUI shows two consecutive errors and the gateway disconnects:

 HTTP 503: "inference service unavailable"
 HTTP 503: "inference service unavailable"
 gateway disconnected: closed | idle
 agent main | session main (openclaw-tui) | inference/nvidia-routed | tokens ?/131k

Hitting /health on the host model-router shows both upstream endpoints unhealthy:
 curl -s http://127.0.0.1:4000/health
 -> healthy_count: 0, unhealthy_count: 2
 -> error: "Authentication Error, LiteLLM Virtual Key expected. Received=nvap****Nrd-, expected to start with 'sk-'." (HTTP 401)
</pre>Root Cause — three independent config errors that compound
<pre>1. WRONG UPSTREAM ENDPOINT
 pool-config.yaml + litellm-proxy.yaml set api_base = https://inference-api.nvidia.com
 That host is itself a LiteLLM proxy that only accepts sk-* virtual keys; it rejects nvapi-* keys
 with HTTP 401 "LiteLLM Virtual Key expected".
 Verified: curl -H "Authorization: Bearer nvapi-..." https://inference-api.nvidia.com/v1/models -> 401
 The endpoint that actually accepts nvapi-* keys is https://integrate.api.nvidia.com/v1
 (public NVIDIA Build / NIM gateway). Verified: same key returns 200.

2. WRONG MODEL IDS (case + doubled prefix + non-existent super id)
 pool-config.yaml uses:
 openai/nvidia/nvidia/Nemotron-3-Nano-30B-A3B
 openai/nvidia/nvidia/nemotron-3-super-v3
 Issues:
 (a) "nvidia/nvidia/" prefix is doubled — should be single nvidia/.
 (b) Nano id case is wrong — actual catalog id is lowercase nemotron-3-nano-30b-a3b.
 (c) "nemotron-3-super-v3" does NOT exist in the NVIDIA catalog at all. The Super model id is
 nemotron-3-super-120b-a12b.
 Verified by enumerating https://integrate.api.nvidia.com/v1/models — only the lowercase ids resolve.

3. WRONG ENV VAR NAME (host vs gateway disagree, and the env var was never exported)
 ~/.nemoclaw/state/litellm-proxy.yaml says: api_key: os.environ/OPENAI_API_KEY
 ~/.nemoclaw/onboard-session.json records: "credentialEnv": "NVIDIA_API_KEY"
 `openshell provider get nvidia-router -g nemoclaw` shows: Credential keys: NVIDIA_API_KEY
 The host-side LiteLLM and the gateway-side provider config disagree on the env var name.
 Additionally, on the live model-router process (PID 236635), inspection of /proc/PID/environ
 shows neither OPENAI_API_KEY nor NVIDIA_API_KEY exported — onboard also fails to plumb the
 credential into the subprocess env.
</pre>Fix Verification
<pre>Manually patched ~/.nemoclaw/state/litellm-proxy.yaml:
 - model: openai/nvidia/nvidia/Nemotron-3-Nano-30B-A3B -> openai/nvidia/nemotron-3-nano-30b-a3b
 - model: openai/nvidia/nvidia/nemotron-3-super-v3 -> openai/nvidia/nemotron-3-super-120b-a12b
 - api_base: https://inference-api.nvidia.com -> https://integrate.api.nvidia.com/v1
 - api_key: os.environ/OPENAI_API_KEY -> (env var was empty)

Restarted model-router. Result:
 GET /health -> healthy: 2, unhealthy: 0 (previously healthy: 0, unhealthy: 2)
 POST /v1/chat/completions model=nemotron-3-super -> "Two plus two equals four." (correct content)
 POST /v1/chat/completions model=nvidia-routed (alias) -> reasoning_content populated (routing works)
 POST /v1/chat/completions model=nemotron-3-nano-reasoning -> reasoning_content populated (nano works)

The same fix needs to land upstream in:
 - nemoclaw-blueprint/router/pool-config.yaml (model ids + api_base)
 - The onboard code that emits litellm-proxy.yaml (env var name should match credentialEnv)
 - The onboard code that should export NVIDIA_API_KEY into the model-router subprocess env (currently it does not)
</pre>Logs
<pre>Pre-fix /health excerpt (sanitized):
 {"healthy_endpoints":[],"unhealthy_endpoints":[
 {"api_base":"https://inference-api.nvidia.com",
 "model":"openai/nvidia/nvidia/Nemotron-3-Nano-30B-A3B",
 "error":"litellm.AuthenticationError: ... Authentication Error, LiteLLM Virtual Key expected. Received=nvap****Nrd-, expected to start with 'sk-'."},
 {"api_base":"https://inference-api.nvidia.com",
 "model":"openai/nvidia/nvidia/nemotron-3-super-v3",
 "error":"... same auth error"}],
 "healthy_count":0,"unhealthy_count":2}

Sandbox openshell-router log (repeating ~30s loop until TUI gives up):
 [INFO] routing proxy inference request (streaming) endpoint=http://host.openshell.internal:4000/v1 path=/v1/chat/completions
 [LOW] NET:FAIL inference.local:443
</pre>
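
The three config errors listed under Root Cause could in principle be caught before onboard reports SUCCESS. Below is a minimal Python sketch of such a pre-flight check. It is not NemoClaw's actual code: `lint_entry`, `REJECTING_HOST`, and the flat entry layout (`model`, `api_base`, `api_key`) are hypothetical names chosen for illustration, and the sample values are copied from this report.

```python
from urllib.parse import urlparse

# Values taken from this report; treat them as illustrative, not authoritative.
REJECTING_HOST = "inference-api.nvidia.com"  # reported to accept only sk-* virtual keys
ENV_PREFIX = "os.environ/"                   # LiteLLM syntax for reading a key from the environment


def lint_entry(entry, credential_env):
    """Return a list of human-readable problems for one model entry.

    `entry` is assumed to look like {"model": ..., "api_base": ..., "api_key": ...}.
    `credential_env` is the env var name recorded by onboard (credentialEnv).
    """
    problems = []

    # Error 1: upstream host that rejects nvapi-* keys.
    host = urlparse(entry.get("api_base", "")).hostname
    if host == REJECTING_HOST:
        problems.append(f"api_base host {host} rejects nvapi-* keys")

    # Error 2: doubled prefix and wrong case in the model id.
    model = entry.get("model", "")
    if "nvidia/nvidia/" in model:
        problems.append("model id has a doubled nvidia/ prefix")
    if model != model.lower():
        problems.append("model id contains uppercase characters")

    # Error 3: api_key reads a different env var than the one onboard recorded.
    api_key = entry.get("api_key", "")
    if api_key.startswith(ENV_PREFIX):
        var = api_key[len(ENV_PREFIX):]
        if var != credential_env:
            problems.append(
                f"api_key reads {var}, but credentials are stored as {credential_env}"
            )

    return problems


if __name__ == "__main__":
    generated = {
        "model": "openai/nvidia/nvidia/Nemotron-3-Nano-30B-A3B",
        "api_base": "https://inference-api.nvidia.com",
        "api_key": "os.environ/OPENAI_API_KEY",
    }
    for problem in lint_entry(generated, "NVIDIA_API_KEY"):
        print(problem)
```

On the generated entry from this report the sketch flags four problems; on the manually patched values it flags none. It cannot detect a well-formed but non-existent id such as `nemotron-3-super-v3`; that would require comparing against the ids returned by `https://integrate.api.nvidia.com/v1/models`.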

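The missing credential in the model-router process (Root Cause 3) was found by inspecting /proc/PID/environ by hand. A small Python sketch of the same check follows, assuming Linux and read permission on the target PID. The function names are hypothetical; only the NUL-separated `KEY=VALUE` layout of `/proc/<pid>/environ` is standard.

```python
def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE records of /proc/<pid>/environ."""
    env = {}
    for record in raw.split(b"\0"):
        # Skip the empty trailing record and anything without a separator.
        if not record or b"=" not in record:
            continue
        key, _, value = record.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def missing_credentials(raw: bytes, names=("NVIDIA_API_KEY", "OPENAI_API_KEY")) -> list:
    """Return the names from `names` that are absent or empty in the environment."""
    env = parse_environ(raw)
    return [name for name in names if not env.get(name)]


def read_process_environ(pid: int) -> bytes:
    """Read a live process's environment (Linux only; needs permission for that PID)."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return handle.read()
```

For the process described in this report, `missing_credentials(read_process_environ(236635))` would be expected to return both names. Note that `/proc/<pid>/environ` reflects the environment the process was started with, so variables set later inside the process do not appear there.
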
## Bug Details

| Field | Value |
|-------|-------|
| Priority | Unprioritized |
| Action | Dev - Open - To fix |
| Disposition | Open issue |
| Module | Machine Learning - NemoClaw |
| Keyword | NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Inference, NemoClaw_Onboard, NemoClaw-SWQA-RelBlckr-Recommended |

---
[NVB#6158321]

[Brev][Onboard] Model Router (Provider Routed) inference broken — TUI returns HTTP 503 after successful onboard #3255
