[Ubuntu 24.04][Onboard][GitHub Issue #4564] non-interactive custom-provider onboard does not start model-router; sandbox cannot reach inference.local #4711

@hulynn

Description

A non-interactive NemoClaw v0.0.57 onboard with NEMOCLAW_PROVIDER=custom completes with "Hermes is ready", but the host-side model-router is never started. The sandbox cannot resolve inference.local, and the bot's first inference call returns HTTP 401 because Hermes falls back to the no-key-required placeholder (hermes_cli/model_switch.py:906) instead of using a real API key.

This appears to be a regression of NVB 6244574 / GitHub Issue #4564 — the host-alias binding fix in commit 44b23d047 does not cover the non-interactive custom-provider path on Linux Docker-driver.

Environment

Device:        Internal NVIDIA test host (lynnhu@10.6.76.35 via testmind-dev.nvidia.com)
OS:            Ubuntu 24.04.4 LTS (Noble Numbat)
Architecture:  x86_64
Node.js:       v22.22.3
npm:           10.9.8
Docker:        29.5.2 (build 79eb04c)
OpenShell CLI: 0.0.44
NemoClaw:      v0.0.57 (nemohermes shim)
OpenClaw:      N/A — running Hermes Agent v0.14.0 (build 2026.5.16, Python 3.13.5)

Steps to Reproduce

  1. On a fresh Ubuntu 24.04 host with Docker running, export the non-interactive variables and pipe the installer script to bash:

    export NEMOCLAW_INSTALL_TAG=v0.0.57 \
           NEMOCLAW_AGENT=hermes \
           NEMOCLAW_SANDBOX_NAME=momo \
           NEMOCLAW_PROVIDER=custom \
           NEMOCLAW_ENDPOINT_URL=https://inference-api.nvidia.com/ \
           NEMOCLAW_MODEL=aws/anthropic/bedrock-claude-opus-4-6 \
           NEMOCLAW_PROVIDER_KEY=<sk-...> \
           COMPATIBLE_API_KEY=<sk-...> \
           NEMOCLAW_HERMES_AUTH_METHOD=api_key \
           NEMOCLAW_NON_INTERACTIVE=1 \
           NEMOCLAW_YES=1 \
           NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1
    curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
  2. Wait for "Hermes is ready" banner.

  3. Check host for model-router:

    ls ~/.nemoclaw/model-router-venv   # does NOT exist
    ps -ef | grep model-router         # no process
  4. docker exec into the sandbox container, test DNS:

    curl -sS https://inference.local/v1/models
    # curl: (6) Could not resolve host: inference.local
    getent hosts inference.local
    # (empty)
  5. After configuring a Slack channel (or via any Hermes inference path), @-mention the bot:

    • Bot replies: :warning: Non-retryable error (HTTP 401): HTTP 401: LiteLLM Virtual Key expected. Received=no-k****ired, expected to start with 'sk-'.

Expected Result

After "Hermes is ready":

  • ~/.nemoclaw/model-router-venv exists on host
  • A model-router process is running and listening
  • Sandbox can resolve inference.local (via /etc/hosts injection or Docker --add-host)
  • Bot's first @-mention triggers a successful inference call
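
The first and third expectations above can be checked mechanically after onboard. The following is a minimal sketch, not part of NemoClaw: the helper names (`router_venv_exists`, `hosts_file_has_entry`, `report`) are hypothetical, while the venv path and hostname are taken from this report.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; paths and hostnames come from this report.

# Host side: succeeds only if the model-router virtualenv directory exists.
router_venv_exists() {
  [ -d "${1:-$HOME/.nemoclaw/model-router-venv}" ]
}

# Sandbox side: succeeds only if the hosts file ($1) maps an address to hostname ($2).
hosts_file_has_entry() {
  awk -v name="$2" '
    { sub(/#.*/, "") }    # drop comments before inspecting fields
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Prints one PASS/FAIL line per check instead of aborting on the first failure.
report() {
  if "$@"; then echo "PASS: $*"; else echo "FAIL: $*"; fi
}
```

On the host, `report router_venv_exists` covers the venv expectation; inside the sandbox, `report hosts_file_has_entry /etc/hosts inference.local` covers the /etc/hosts injection case. If resolution is provided some other way, `getent hosts inference.local` (as in step 4) remains the more general check.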

Actual Result

Onboard reports success but:

  • ~/.nemoclaw/model-router-venv does not exist
  • No model-router process on host
  • Sandbox /etc/hosts has no inference.local entry; DNS resolution fails
  • /sandbox/.hermes/config.yaml has model.base_url=https://inference.local/v1 but no model.api_key
  • /sandbox/.hermes/.env does NOT contain COMPATIBLE_API_KEY (src/lib/state/sandbox.ts:534 actively strips credential lines from sandbox .env; the host router was supposed to inject auth)
  • First @-mention to bot returns HTTP 401 no-key-required placeholder
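
To confirm that a 401 comes from the placeholder key rather than from a wrong real key, the sandbox agent.log can be searched for the masked value. This is a small sketch: the function name is hypothetical, the log path is passed in because this report does not give its location, and the match strings are copied from the error shown under Logs.

```shell
#!/usr/bin/env bash
# Sketch only: log_has_placeholder_401 is a hypothetical helper name.

# Succeeds if the log file ($1) contains the LiteLLM 401 raised for the
# masked "no-k****ired" placeholder key.
log_has_placeholder_401() {
  grep -F "LiteLLM Virtual Key expected" "$1" 2>/dev/null | grep -qF "Received=no-k"
}
```

A hit suggests Hermes sent the placeholder upstream, which matches the missing model.api_key and the stripped .env credentials listed above.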

Code references in v0.0.57 (HEAD 325ed77bb, v0.0.57+4):

Workaround verified: manually edit /sandbox/.hermes/config.yaml — set model.base_url to the real upstream URL (https://inference-api.nvidia.com/v1) and add model.api_key directly; also add OPENAI_API_KEY and COMPATIBLE_API_KEY to /sandbox/.hermes/.env. Then pkill -f 'hermes gateway' and re-run hermes gateway run. After that, @-mentions trigger successful inference (~3s latency, real LLM reply).
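
The .env part of this workaround can be scripted. The following is a rough sketch under stated assumptions: `set_env_key` is a hypothetical helper, and using the same key value for both variables is an assumption that this report does not confirm. The config.yaml edit is left manual because the file's full layout is not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: set_env_key is a hypothetical helper, not a NemoClaw or Hermes command.

# Sets KEY=VALUE in an env file: replaces an existing KEY= line or appends one.
# $1 = env file, $2 = key, $3 = value
set_env_key() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  if [ -f "$file" ]; then
    # Keep every line except a previous definition of the same key.
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through redirection so an existing file keeps its owner and mode.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Intended use inside the sandbox (assumes both variables take the same key):
#   set_env_key /sandbox/.hermes/.env COMPATIBLE_API_KEY "$COMPATIBLE_API_KEY"
#   set_env_key /sandbox/.hermes/.env OPENAI_API_KEY "$COMPATIBLE_API_KEY"
#   pkill -f 'hermes gateway'
#   hermes gateway run
```

Replacing rather than blindly appending keeps the file free of duplicate definitions if the workaround is applied more than once.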

Logs

[sandbox-side DNS]
sandbox$ getent hosts inference.local
sandbox$ curl -sS https://inference.local/v1/models
curl: (6) Could not resolve host: inference.local

[sandbox agent.log on @-mention before workaround]
2026-06-03 11:52:25,785 ERROR [20260603_115224_6754050c] root: Non-retryable client error: Error code: 401 - {'error': {'message': "LiteLLM Virtual Key expected. Received=no-k****ired, expected to start with 'sk-'.", 'type': 'auth_error', 'param': 'None', 'code': '401'}}

[host process check]
$ ls ~/.nemoclaw/
onboard-session.json  rebuild-backups  sandboxes.json  source  usage-notice.json
$ ps -ef | grep -iE "(model.?router|inference-proxy)" | grep -v grep
(empty)
$ ss -tlnp | grep -E "(127.0.0.1:4000|model-router)"
(empty)

[sandbox agent.log after workaround — for contrast]
2026-06-03 12:07:15,237 INFO gateway.run: inbound message: platform=slack user=Lynn Hu chat=C0B741YC2T0 msg='hi from clean start'
2026-06-03 12:07:23,180 INFO run_agent: API call #1: model=aws/anthropic/bedrock-claude-opus-4-6 provider=custom in=16706 out=25 total=16731 latency=3.0s
2026-06-03 12:07:23,590 INFO gateway.run: response ready: platform=slack chat=C0B741YC2T0 time=8.4s api_calls=1 response=68 chars

[related closed bug — context, not duplicate]
NVB 6244574 [Brev][Inference][GitHub Issue #4564] Model Router inference.local returns "inference service unavailable"... — Closed 2026-06-03 02:05 (Bug - Fixed, QA - Closed - Verified). Same area, reopens on a different platform / non-interactive path.

NVB#6263823

Metadata

Labels

  • NV QA (Bugs found by the NVIDIA QA Team)
  • integration: hermes (Hermes integration behavior)
  • platform: ubuntu (Affects Ubuntu Linux environments)
  • provider: openai (OpenAI API or OpenAI-compatible provider behavior)
