[Ubuntu 24.04][Onboard][GitHub Issue #4564] non-interactive custom-provider onboard does not start model-router; sandbox cannot reach inference.local

## Description

NemoClaw v0.0.57 non-interactive onboard with `NEMOCLAW_PROVIDER=custom` completes with the "Hermes is ready" banner, but the host-side model-router is never started. The sandbox cannot resolve `inference.local`, and the bot's first inference call returns HTTP 401 because Hermes falls back to the `no-key-required` placeholder (`hermes_cli/model_switch.py:906`) instead of using a real API key.

This appears to be a regression of NVB 6244574 / GitHub Issue #4564: the host-alias binding fix in commit `44b23d047` does not cover the non-interactive custom-provider path on Linux with the Docker driver.

## Environment

```text
Device:        Internal NVIDIA test host (lynnhu@10.6.76.35 via testmind-dev.nvidia.com)
OS:            Ubuntu 24.04.4 LTS (Noble Numbat)
Architecture:  x86_64
Node.js:       v22.22.3
npm:           10.9.8
Docker:        29.5.2 (build 79eb04c)
OpenShell CLI: 0.0.44
NemoClaw:      v0.0.57 (nemohermes shim)
OpenClaw:      N/A — running Hermes Agent v0.14.0 (build 2026.5.16, Python 3.13.5)
```

## Steps to Reproduce

1. On a fresh Ubuntu 24.04 host with Docker running, export the non-interactive variables and run the installer via `curl | bash`:

   ```bash
   export NEMOCLAW_INSTALL_TAG=v0.0.57 \
          NEMOCLAW_AGENT=hermes \
          NEMOCLAW_SANDBOX_NAME=momo \
          NEMOCLAW_PROVIDER=custom \
          NEMOCLAW_ENDPOINT_URL=https://inference-api.nvidia.com/ \
          NEMOCLAW_MODEL=aws/anthropic/bedrock-claude-opus-4-6 \
          NEMOCLAW_PROVIDER_KEY=<sk-...> \
          COMPATIBLE_API_KEY=<sk-...> \
          NEMOCLAW_HERMES_AUTH_METHOD=api_key \
          NEMOCLAW_NON_INTERACTIVE=1 \
          NEMOCLAW_YES=1 \
          NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1
   curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
   ```

2. Wait for the "Hermes is ready" banner.

3. Check the host for the model-router:

   ```bash
   ls ~/.nemoclaw/model-router-venv                # does NOT exist
   ps -ef | grep model-router | grep -v grep       # no process
   ```

4. `docker exec` into the sandbox container and test DNS:

   ```bash
   curl -sS https://inference.local/v1/models
   # curl: (6) Could not resolve host: inference.local
   getent hosts inference.local
   # (empty)
   ```

5. After configuring a Slack channel (or via any other Hermes inference path), @-mention the bot:
   - Bot replies: `:warning: Non-retryable error (HTTP 401): HTTP 401: LiteLLM Virtual Key expected. Received=no-k****ired, expected to start with 'sk-'.`

## Expected Result

After "Hermes is ready":

- `~/.nemoclaw/model-router-venv` exists on host
- A model-router process is running and listening
- Sandbox can resolve `inference.local` (via `/etc/hosts` injection or Docker `--add-host`)
- Bot's first @-mention triggers a successful inference call
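
The two host-side conditions above can be checked with a short script. This is a minimal sketch, not part of NemoClaw: the function names `router_venv_present` and `router_process_running` are hypothetical, and matching on the string `model-router` in the process list is an assumption carried over from the reproduction steps.

```bash
# Succeeds if the model-router virtualenv directory exists under the
# NemoClaw home directory passed as $1 (for example "$HOME/.nemoclaw").
router_venv_present() {
  [ -d "$1/model-router-venv" ]
}

# Succeeds if any process command line contains "model-router".
# The [m] bracket keeps the pattern from matching the shell running this check.
router_process_running() {
  pgrep -f '[m]odel-router' > /dev/null
}
```

After a healthy onboard, `router_venv_present "$HOME/.nemoclaw"` and `router_process_running` should each succeed; on the affected host each one fails.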

## Actual Result

Onboard reports success but:

- `~/.nemoclaw/model-router-venv` does not exist
- No model-router process on host
- Sandbox `/etc/hosts` has no `inference.local` entry; DNS resolution fails
- `/sandbox/.hermes/config.yaml` has `model.base_url=https://inference.local/v1` but no `model.api_key`
- `/sandbox/.hermes/.env` does NOT contain `COMPATIBLE_API_KEY` (`src/lib/state/sandbox.ts:534` actively strips credential lines from sandbox `.env`; the host router was supposed to inject auth)
- The first @-mention to the bot returns HTTP 401 because the `no-key-required` placeholder is sent in place of a real API key
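
The sandbox-side symptoms listed above (no `inference.local` entry in `/etc/hosts`, no credential line in `.env`) can be verified from inside the container with something like the following. The helper names are hypothetical, and the sketch only inspects the files named in this report; it does not check any other resolver.

```bash
# Succeeds if the hosts file ($1) maps some address to the hostname ($2).
has_hosts_entry() {
  awk -v name="$2" '
    { sub(/#.*/, "") }   # drop comments, including whole commented-out lines
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Succeeds if the env file ($1) has an assignment line for the variable ($2).
env_has_key() {
  grep -q "^$2=" "$1"
}
```

In the failing state described here, `has_hosts_entry /etc/hosts inference.local` and `env_has_key /sandbox/.hermes/.env COMPATIBLE_API_KEY` both return non-zero.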

**Code references** in v0.0.57 (HEAD `325ed77bb`, v0.0.57+4):

- `src/lib/onboard.ts:5298` — compatible-endpoint branch does not call `startModelRouter`
- `src/lib/onboard/model-router.ts:494` — `startModelRouter` is only invoked via `reconcileModelRouter` (gated by `isRoutedInferenceProvider`)
- `src/lib/verify-deployment.ts:223` — verifier explicitly checks `inference.local` resolution; not enforced during non-interactive onboard
- commit `44b23d047` "fix(onboard): reuse containerized gateway and repair routed provider reachability (#4520, #4564)" — closing fix for NVB 6244574; does not cover non-interactive custom-provider path on Ubuntu Docker-driver

**Workaround verified:** manually edit `/sandbox/.hermes/config.yaml` — set `model.base_url` to the real upstream URL (`https://inference-api.nvidia.com/v1`) and add `model.api_key` directly; also add `OPENAI_API_KEY` and `COMPATIBLE_API_KEY` to `/sandbox/.hermes/.env`. Then `pkill -f 'hermes gateway'` and re-run `hermes gateway run`. After that, @-mentions trigger successful inference (~3s latency, real LLM reply).
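
The `.env` part of the workaround can be made repeatable with a small helper that replaces an existing assignment or appends a new one. This is a sketch under assumptions: `set_env_var` is a hypothetical name, the key values are yours to supply, and the `config.yaml` edits are still done by hand.

```bash
# Set KEY=VALUE in an env file ($1 = file, $2 = key, $3 = value).
# Any existing assignment for the key is replaced; other lines are kept.
set_env_var() {
  _tmp=$(mktemp) || return 1
  if [ -f "$1" ]; then
    # grep -v exits non-zero when nothing is left to print; that is fine here.
    grep -v "^$2=" "$1" > "$_tmp" || true
  fi
  printf '%s=%s\n' "$2" "$3" >> "$_tmp"
  mv "$_tmp" "$1"
}
```

For example, `set_env_var /sandbox/.hermes/.env COMPATIBLE_API_KEY "<sk-...>"`, followed by the gateway restart described above. Because the file is rewritten via `mktemp` and `mv`, it ends up with the temporary file's permissions (owner read/write only).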

## Logs

```text
[sandbox-side DNS]
sandbox$ getent hosts inference.local
sandbox$ curl -sS https://inference.local/v1/models
curl: (6) Could not resolve host: inference.local

[sandbox agent.log on @-mention before workaround]
2026-06-03 11:52:25,785 ERROR [20260603_115224_6754050c] root: Non-retryable client error: Error code: 401 - {'error': {'message': "LiteLLM Virtual Key expected. Received=no-k****ired, expected to start with 'sk-'.", 'type': 'auth_error', 'param': 'None', 'code': '401'}}

[host process check]
$ ls ~/.nemoclaw/
onboard-session.json  rebuild-backups  sandboxes.json  source  usage-notice.json
$ ps -ef | grep -iE "(model.?router|inference-proxy)" | grep -v grep
(empty)
$ ss -tlnp | grep -E "(127.0.0.1:4000|model-router)"
(empty)

[sandbox agent.log after workaround — for contrast]
2026-06-03 12:07:15,237 INFO gateway.run: inbound message: platform=slack user=Lynn Hu chat=C0B741YC2T0 msg='hi from clean start'
2026-06-03 12:07:23,180 INFO run_agent: API call #1: model=aws/anthropic/bedrock-claude-opus-4-6 provider=custom in=16706 out=25 total=16731 latency=3.0s
2026-06-03 12:07:23,590 INFO gateway.run: response ready: platform=slack chat=C0B741YC2T0 time=8.4s api_calls=1 response=68 chars

[related closed bug — context, not duplicate]
NVB 6244574 [Brev][Inference][GitHub Issue #4564] Model Router inference.local returns "inference service unavailable"... — Closed 2026-06-03 02:05 (Bug - Fixed, QA - Closed - Verified). Same area, reopens on a different platform / non-interactive path.
```

---
[NVB#6263823](https://nvbugspro.nvidia.com/bug/6263823)

