fix(onboard): use 127.0.0.1 instead of localhost for local inference detection by prekshivyas · Pull Request #1978 · NVIDIA/NemoClaw

prekshivyas · 2026-04-16T17:49:13Z

Summary

Supersedes #1716.

Using localhost can resolve to IPv6 (::1) on some systems, causing local inference health checks, probe commands, and model queries to fail silently. This replaces all host-side localhost references with explicit 127.0.0.1 for deterministic IPv4 behavior.

Based on the original work in #1716 by @chandler-barlow — the JS→TS migration (#1673) made the original PR unrebaseable, so the change has been re-applied on the current codebase with attribution.

Changes

src/lib/local-inference.ts: validation URLs, health endpoints, error messages, model queries, warmup/probe commands
src/lib/nim.ts: NIM health check curls
src/lib/onboard.ts: Ollama and vLLM detection curls
src/lib/local-inference.test.ts: updated expectations
src/lib/nim.test.ts: updated mock matchers
test/cli.test.ts: updated status output assertion
test/onboard-selection.test.ts: updated mock matchers
test/e2e/test-gpu-e2e.sh: host-side Ollama curls

Test plan

50 unit tests passing (local-inference + nim)
Zero http://localhost:11434 or http://localhost:8000 references remain in src/ test/ scripts/
Dashboard URLs intentionally unchanged (different scope)

Signed-off-by: Prekshi Vyas prekshiv@nvidia.com

Summary by CodeRabbit

Bug Fixes
- Improved reliability of local inference and onboarding checks by standardizing local service addresses to the loopback IP.
Tests
- Updated unit, integration, and end-to-end test expectations and scripts to match the adjusted loopback addressing and health-check behavior.

@chandler-barlow

…detection Using `localhost` can resolve to IPv6 (::1) on some systems, causing health checks and probe commands to fail. Replace all host-side localhost references with explicit 127.0.0.1 for deterministic IPv4 behavior. Changes: - getLocalProviderValidationBaseUrl: localhost → 127.0.0.1 - getLocalProviderHealthEndpoint: localhost → 127.0.0.1 - validateLocalProvider error messages: localhost → 127.0.0.1 - getOllamaModelOptions curl: localhost → 127.0.0.1 - getOllamaWarmupCommand: localhost → 127.0.0.1 - getOllamaProbeCommand: localhost → 127.0.0.1 - onboard.ts detection curls: localhost → 127.0.0.1 - All test expectations updated to match Based on the original work in #1716 by @chandler-barlow, re-applied on the current TypeScript codebase after the JS→TS migration made the original PR unrebaseable. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Chandler Barlow <cbarlow@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-16T17:52:44Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 35e141b9-2cc8-4b69-974a-5dea044cf350

📥 Commits

Reviewing files that changed from the base of the PR and between 8a1a039 and 33b51bf.

📒 Files selected for processing (1)

test/e2e/test-gpu-e2e.sh

🚧 Files skipped from review as they are similar to previous changes (1)

test/e2e/test-gpu-e2e.sh

📝 Walkthrough

Walkthrough

Replaced http://localhost with http://127.0.0.1 across local inference code, onboarding logic, NIM probes, tests, and an e2e script for Ollama and vLLM-related health, model, and generate endpoints. No API signatures or control flow were changed.

Changes

Cohort / File(s)	Summary
Local inference & onboarding `src/lib/local-inference.ts`, `src/lib/onboard.ts`	Switched local service URLs from `http://localhost` to `http://127.0.0.1` for Ollama (`/api/tags`, `/api/generate`) and vLLM (`/v1/models`) checks and probes.
NIM probes `src/lib/nim.ts`, `src/lib/nim.test.ts`	Updated NIM health-check curl targets and test expectations to use `http://127.0.0.1:<port>/v1/models`.
Tests — local & onboarding selection `src/lib/local-inference.test.ts`, `test/onboard-selection.test.ts`, `test/cli.test.ts`	Adjusted mocked endpoints, assertions, and descriptions to expect `http://127.0.0.1:<port>` for Ollama and OpenAI-compatible local model endpoints.
End-to-end script `test/e2e/test-gpu-e2e.sh`	Replaced `localhost` with `127.0.0.1` in curl checks, model discovery, and inference requests; updated a user-facing log message accordingly.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I hopped to 127.0.0.1’s gate,
Ollama woke — the tags looked great,
vLLM chimed a tiny tune,
Tests now ping the loopback rune,
A snug change, and I celebrate. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: replacing localhost with 127.0.0.1 for local inference detection, which is the primary objective across all modified files.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/localhost-to-127001

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Missed in the initial commit: - src/lib/nim.ts: health check curls used localhost - src/lib/nim.test.ts: test mocks matched localhost - test/cli.test.ts: status output assertion expected localhost - test/e2e/test-gpu-e2e.sh: host-side Ollama curls used localhost Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Chandler Barlow <cbarlow@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

test/e2e/test-gpu-e2e.sh (1)

553-553: Update stale summary text to match the new address choice.

The final summary still says “starts Ollama on localhost”, which now conflicts with the script’s 127.0.0.1 usage and updated phase logs.

Suggested text-only tweak

-echo "    - Onboard: starts Ollama on localhost, starts auth proxy, pulls model, creates sandbox"
+echo "    - Onboard: starts Ollama on 127.0.0.1, starts auth proxy, pulls model, creates sandbox"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-gpu-e2e.sh` at line 553, Update the summary string emitted by
the script to reflect the actual address used; replace the phrase "starts Ollama
on localhost" in the echoed message (the echo call that prints "    - Onboard:
starts Ollama on localhost, starts auth proxy, pulls model, creates sandbox") so
it references "127.0.0.1" (or a neutral term like "loopback 127.0.0.1") to match
the script’s use of 127.0.0.1 and the updated phase logs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/e2e/test-gpu-e2e.sh`:
- Line 553: Update the summary string emitted by the script to reflect the
actual address used; replace the phrase "starts Ollama on localhost" in the
echoed message (the echo call that prints "    - Onboard: starts Ollama on
localhost, starts auth proxy, pulls model, creates sandbox") so it references
"127.0.0.1" (or a neutral term like "loopback 127.0.0.1") to match the script’s
use of 127.0.0.1 and the updated phase logs.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4c8fb16d-8c56-4dab-ae18-e06421ffbae4

📥 Commits

Reviewing files that changed from the base of the PR and between 384db10 and 8a1a039.

📒 Files selected for processing (4)

src/lib/nim.test.ts
src/lib/nim.ts
test/cli.test.ts
test/e2e/test-gpu-e2e.sh

✅ Files skipped from review due to trivial changes (2)

src/lib/nim.ts
test/cli.test.ts

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

brandonpelfrey · 2026-04-16T18:28:31Z

Automated PR review summary

Reviewed PR #1978: fix(onboard): use 127.0.0.1 instead of localhost for local inference detection

Recommendation

Recommendation: PASS
Highest observed severity: low
Block merge: no
Why: If this PR were wrong, IPv4-only local Ollama/vLLM setups could still be misdetected or users would see misleading localhost guidance. The focused probes I ran showed the changed helper and onboarding inference paths now target 127.0.0.1 as claimed, and I did not find a remaining localhost reference in the exact inference call sites modified by the PR.
Reviewer summary: Reviewed PR fix(onboard): use 127.0.0.1 instead of localhost for local inference detection #1978 with targeted adversarial probes against the installed build plus sandbox access verification. I found the claimed localhost→127.0.0.1 inference-path replacements present and behaving consistently in the changed helpers.

Installation and setup findings

Installation from the local checkout succeeded. I used the repo root install.sh with NEMOCLAW_REPO_ROOT pointing at /workspace/nemoclaw so the installer built and linked from source. The first onboarding attempt brought up the OpenShell gateway but failed at NVIDIA Endpoints validation; retrying/resuming with the build credential exported as NVIDIA_API_KEY succeeded, created sandbox 'my-assistant', and configured provider nvidia-prod with model nvidia/nemotron-3-super-120b-a12b. Verification succeeded: nemoclaw list/status showed the sandbox, SSH into the existing NemoClaw-managed sandbox returned '2+2=4', and an in-sandbox openclaw agent one-shot probe returned '4'.

What was validated

The PR revision was checked out in an isolated review environment.
The local checkout was installed using the repository installer flow as closely as the environment allowed.
Adversarial, PR-specific probes were then run against the installed environment and relevant repository context.
Diff summary:

 src/lib/local-inference.test.ts | 26 +++++++++++++-------------
 src/lib/local-inference.ts      | 22 +++++++++++-----------
 src/lib/onboard.ts              |  6 +++---
 test/onboard-selection.test.ts  | 40 ++++++++++++++++++++--------------------
 4 files changed, 47 insertions(+), 47 deletions(-)

Failing tests and unresolved impact

No failing adversarial tests were captured.

Passing tests and why they mattered

Passing test 1: Local provider validation now probes 127.0.0.1 for Ollama

What was tested: validateLocalProvider('ollama-local') now issues its host-side health probe to 127.0.0.1 and surfaces 127.0.0.1 in failure guidance.
Why it mattered: If false, IPv4-only local Ollama setups can still be misdetected or users get misleading localhost guidance.
Observed result: validateLocalProvider invoked curl -sf http://127.0.0.1:11434/api/tags and returned the expected container-reachability failure message referencing 'responding on 127.0.0.1'.
Command: node /tmp/pr1978-probe1.js
Recommended follow-up coverage: Keep/add regression tests asserting both the host probe command and the container-reachability failure message use 127.0.0.1 for ollama-local.

Passing test 2: Ollama model query, warmup, and probe helpers consistently use 127.0.0.1

What was tested: Secondary local-inference helper paths changed by this PR no longer emit localhost URLs.
Why it mattered: If false, onboarding might detect correctly but later warmup/probe/model-list actions could still fail on IPv6-preferring localhost systems.
Observed result: getOllamaModelOptions hit curl -sf http://127.0.0.1:11434/api/tags; getOllamaWarmupCommand and getOllamaProbeCommand emitted http://127.0.0.1:11434/api/generate.
Command: node /tmp/pr1978-probe2.js
Recommended follow-up coverage: Add/keep unit coverage for warmup and probe helper output strings, since helper-command regressions are easy to reintroduce during refactors.

Passing test 3: vLLM validation URLs and curl-based health probe use 127.0.0.1

What was tested: vLLM validation and probe detail generation now target 127.0.0.1:8000 rather than localhost.
Why it mattered: If false, vLLM onboarding/health checks still break on hosts where localhost resolves to ::1 first.
Observed result: Validation URL, health endpoint, injected curl probe args, and failure detail all referenced http://127.0.0.1:8000/v1 or /v1/models.
Command: node /tmp/pr1978-probe3.js
Recommended follow-up coverage: Keep as unit/regression coverage with assertions on both getter output and probe-detail text for vllm-local.

Passing test 4: Changed inference paths in onboard/local-inference contain the expected 127.0.0.1 replacements

What was tested: The PR actually replaced localhost with 127.0.0.1 in the inference-specific call sites it claims, without leaving one of the changed paths behind.
Why it mattered: If false, the PR description overstates the fix and one remaining localhost path could still fail in IPv6-first environments.
Observed result: The modified detection/query call sites in onboard.ts and local-inference.ts use 127.0.0.1; remaining localhost mentions in the excerpt were comments, dashboard scope, or user-facing labels outside the exact changed inference probes.
Command: grep -n "127.0.0.1\|localhost" src/lib/onboard.ts src/lib/local-inference.ts | sed -n '1,80p'
Recommended follow-up coverage: No extra grep-style automated test needed; targeted unit/regression tests around each helper/call site are the better guardrail.

Bottom line

Based on the install evidence and adversarial probes, this PR looks reasonable to approve.

Refresh user-facing docs against the 34 commits merged between v0.0.17 and v0.0.18. Highlights: - Replace the Ollama 0.0.0.0 binding guidance with the new authenticated reverse proxy on 127.0.0.1:11435 (#1922). - Document the compatible-endpoint provider defaulting to /v1/chat/completions and the NEMOCLAW_PREFERRED_API=openai-responses opt-in (#1984). - Add the new nemoclaw upgrade-sandboxes command with --check, --auto, and --yes flags (#1943). - Note the cross-sandbox messaging overlap warning and 409 detection in nemoclaw <name> status (#1953). - Document the messaging-token rotation auto-rebuild flow (#1967). - Cover new troubleshooting entries for the Ollama auth proxy, IPv6 localhost resolution, orphan SSH port-forward cleanup on re-onboard, and rotated messaging credentials (#1978, #1950). - Note tar failure exit code for nemoclaw debug --output (#1770) and the orphaned openshell process cleanup in nemoclaw uninstall (#1940). Also: - Extend docs/.docs-skip to exclude the experimental sandbox-mgmt shields and config commands (#1976). - Fix a sphinx-autobuild infinite rebuild loop in docs/conf.py by writing docs/project.json only when its contents change. - Bump the docs version switcher preferred entry to 0.0.18. - Regenerate nemoclaw-user-* agent skills from docs/. Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> Made-with: Cursor

## Summary Refresh user-facing documentation against the 34 commits merged between v0.0.17 and v0.0.18, bump the docs version switcher to v0.0.18, and fix a `sphinx-autobuild` infinite-rebuild loop triggered by `docs/conf.py`. ## Changes - **Ollama authenticated reverse proxy** (#1922): Replace the `0.0.0.0:11434` guidance in `docs/inference/use-local-inference.md` with the new token-gated proxy on `127.0.0.1:11435`, including persisted token, health-check exemption, and sandbox provider wiring. Replace the matching troubleshooting entry in `docs/reference/troubleshooting.md`. - **Compatible-endpoint default API path** (#1984): Document that the compatible-endpoint provider now defaults to `/v1/chat/completions` and update `NEMOCLAW_PREFERRED_API` to describe `openai-responses` as the opt-in instead of `openai-completions`. Updates in `use-local-inference.md`, `switch-inference-providers.md`, and `troubleshooting.md`. - **`nemoclaw upgrade-sandboxes` command** (#1943): Add a new reference entry in `docs/reference/commands.md` covering `--check`, `--auto`, and `--yes` flags. - **Messaging token rotation auto-rebuild** (#1967, #1953): Note the automatic rebuild behavior and cross-sandbox overlap warning in `docs/deployment/set-up-telegram-bridge.md`, `commands.md`, and `troubleshooting.md`. - **Other troubleshooting additions**: - `localhost` → `127.0.0.1` IPv6 note (#1978) - Orphan SSH port-forward cleanup on re-onboard (#1950) - Orphan `openshell` process cleanup in `nemoclaw uninstall` (#1940) - Non-zero exit on tar failure in `nemoclaw debug --output` (#1770) - **Skip list**: Extend `docs/.docs-skip` to exclude the experimental sandbox-mgmt shields and config commands feature (#1976), which was explicitly merged as not-yet-documented. - **Build stability**: `docs/conf.py` now writes `docs/project.json` only when contents change, so `make docs-live` / `sphinx-autobuild` no longer detects its own generated file as a source change and enters an infinite rebuild loop. - **Version switcher**: Bump `docs/versions1.json` and `docs/project.json` preferred entry to v0.0.18 so this refresh renders under the new version. - **Agent skills**: Regenerate `nemoclaw-user-*` skills from `docs/` with `scripts/docs-to-skills.py`. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [x] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification - [x] `npx prek run --all-files` passes (ran via pre-commit hook on staged files) - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [x] `make docs` builds without warnings (doc changes only) - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) ## AI Disclosure - [x] AI-assisted — tool: Cursor --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> Made with [Cursor](https://cursor.com)  ## Summary by CodeRabbit ## Release Notes * **New Features** * Added `nemoclaw upgrade-sandboxes` command to rebuild sandboxes when base-image digests change. * Introduced authenticated reverse proxy for local Ollama inference with token-based access control. * Automatic sandbox backup, recreation, and restore when messaging credentials are updated. * Cross-sandbox messaging token overlap detection with status warnings. * **Improvements** * Compatible-endpoint provider now defaults to `/v1/chat/completions` API path. * Enhanced troubleshooting documentation with new diagnostics sections. * **Documentation** * Updated onboarding and configuration guides. * Expanded version documentation to 0.0.18.  Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

NVIDIA#1978) The argv migration kept the old localhost URLs for Ollama and vLLM curl probes, but main switched to 127.0.0.1 in commit dbc805c. The onboard-selection tests mock 127.0.0.1, so the mismatch caused two test failures.

prekshivyas mentioned this pull request Apr 16, 2026

fix(onboard): use 127.0.0.1 instead of localhost for local inference … #1716

Closed

3 tasks

coderabbitai Bot reviewed Apr 16, 2026

View reviewed changes

prekshivyas and others added 2 commits April 16, 2026 11:02

Merge branch 'main' into fix/localhost-to-127001

083de88

fix: update gpu-e2e summary text to say 127.0.0.1 instead of localhost

33b51bf

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

prekshivyas requested review from brandonpelfrey and cv April 16, 2026 18:05

brandonpelfrey approved these changes Apr 16, 2026

View reviewed changes

brandonpelfrey merged commit dbc805c into main Apr 16, 2026
14 checks passed

miyoungc mentioned this pull request Apr 17, 2026

docs: catch up documentation for v0.0.18 changes #2033

Merged

13 tasks

wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(onboard): use 127.0.0.1 instead of localhost for local inference detection#1978

fix(onboard): use 127.0.0.1 instead of localhost for local inference detection#1978
brandonpelfrey merged 4 commits into
mainfrom
fix/localhost-to-127001

prekshivyas commented Apr 16, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 16, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

brandonpelfrey commented Apr 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

prekshivyas commented Apr 16, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

brandonpelfrey commented Apr 16, 2026

Automated PR review summary

Recommendation

Installation and setup findings

What was validated

Failing tests and unresolved impact

Passing tests and why they mattered

Passing test 1: Local provider validation now probes 127.0.0.1 for Ollama

Passing test 2: Ollama model query, warmup, and probe helpers consistently use 127.0.0.1

Passing test 3: vLLM validation URLs and curl-based health probe use 127.0.0.1

Passing test 4: Changed inference paths in onboard/local-inference contain the expected 127.0.0.1 replacements

Bottom line

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

prekshivyas commented Apr 16, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 16, 2026 •

edited

Loading