Skip to content

fix(onboard): use 127.0.0.1 instead of localhost for local inference detection#1978

Merged
brandonpelfrey merged 4 commits into
mainfrom
fix/localhost-to-127001
Apr 16, 2026
Merged

fix(onboard): use 127.0.0.1 instead of localhost for local inference detection#1978
brandonpelfrey merged 4 commits into
mainfrom
fix/localhost-to-127001

Conversation

@prekshivyas

@prekshivyas prekshivyas commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Summary

Supersedes #1716.

Using localhost can resolve to IPv6 (::1) on some systems, causing local inference health checks, probe commands, and model queries to fail silently. This replaces all host-side localhost references with explicit 127.0.0.1 for deterministic IPv4 behavior.

Based on the original work in #1716 by @chandler-barlow — the JS→TS migration (#1673) made the original PR unrebaseable, so the change has been re-applied on the current codebase with attribution.

Changes

  • src/lib/local-inference.ts: validation URLs, health endpoints, error messages, model queries, warmup/probe commands
  • src/lib/nim.ts: NIM health check curls
  • src/lib/onboard.ts: Ollama and vLLM detection curls
  • src/lib/local-inference.test.ts: updated expectations
  • src/lib/nim.test.ts: updated mock matchers
  • test/cli.test.ts: updated status output assertion
  • test/onboard-selection.test.ts: updated mock matchers
  • test/e2e/test-gpu-e2e.sh: host-side Ollama curls

Test plan

  • 50 unit tests passing (local-inference + nim)
  • Zero http://localhost:11434 or http://localhost:8000 references remain in src/ test/ scripts/
  • Dashboard URLs intentionally unchanged (different scope)

Signed-off-by: Prekshi Vyas prekshiv@nvidia.com

Summary by CodeRabbit

  • Bug Fixes

    • Improved reliability of local inference and onboarding checks by standardizing local service addresses to the loopback IP.
  • Tests

    • Updated unit, integration, and end-to-end test expectations and scripts to match the adjusted loopback addressing and health-check behavior.

…detection

Using `localhost` can resolve to IPv6 (::1) on some systems, causing
health checks and probe commands to fail. Replace all host-side
localhost references with explicit 127.0.0.1 for deterministic IPv4
behavior.

Changes:
- getLocalProviderValidationBaseUrl: localhost → 127.0.0.1
- getLocalProviderHealthEndpoint: localhost → 127.0.0.1
- validateLocalProvider error messages: localhost → 127.0.0.1
- getOllamaModelOptions curl: localhost → 127.0.0.1
- getOllamaWarmupCommand: localhost → 127.0.0.1
- getOllamaProbeCommand: localhost → 127.0.0.1
- onboard.ts detection curls: localhost → 127.0.0.1
- All test expectations updated to match

Based on the original work in #1716 by @chandler-barlow, re-applied on
the current TypeScript codebase after the JS→TS migration made the
original PR unrebaseable.

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Chandler Barlow <cbarlow@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 35e141b9-2cc8-4b69-974a-5dea044cf350

📥 Commits

Reviewing files that changed from the base of the PR and between 8a1a039 and 33b51bf.

📒 Files selected for processing (1)
  • test/e2e/test-gpu-e2e.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/test-gpu-e2e.sh

📝 Walkthrough

Walkthrough

Replaced http://localhost with http://127.0.0.1 across local inference code, onboarding logic, NIM probes, tests, and an e2e script for Ollama and vLLM-related health, model, and generate endpoints. No API signatures or control flow were changed.

Changes

Cohort / File(s) Summary
Local inference & onboarding
src/lib/local-inference.ts, src/lib/onboard.ts
Switched local service URLs from http://localhost to http://127.0.0.1 for Ollama (/api/tags, /api/generate) and vLLM (/v1/models) checks and probes.
NIM probes
src/lib/nim.ts, src/lib/nim.test.ts
Updated NIM health-check curl targets and test expectations to use http://127.0.0.1:<port>/v1/models.
Tests — local & onboarding selection
src/lib/local-inference.test.ts, test/onboard-selection.test.ts, test/cli.test.ts
Adjusted mocked endpoints, assertions, and descriptions to expect http://127.0.0.1:<port> for Ollama and OpenAI-compatible local model endpoints.
End-to-end script
test/e2e/test-gpu-e2e.sh
Replaced localhost with 127.0.0.1 in curl checks, model discovery, and inference requests; updated a user-facing log message accordingly.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I hopped to 127.0.0.1’s gate,
Ollama woke — the tags looked great,
vLLM chimed a tiny tune,
Tests now ping the loopback rune,
A snug change, and I celebrate. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: replacing localhost with 127.0.0.1 for local inference detection, which is the primary objective across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/localhost-to-127001

Comment @coderabbitai help to get the list of available commands and usage tips.

Missed in the initial commit:
- src/lib/nim.ts: health check curls used localhost
- src/lib/nim.test.ts: test mocks matched localhost
- test/cli.test.ts: status output assertion expected localhost
- test/e2e/test-gpu-e2e.sh: host-side Ollama curls used localhost

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Chandler Barlow <cbarlow@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/e2e/test-gpu-e2e.sh (1)

553-553: Update stale summary text to match the new address choice.

The final summary still says “starts Ollama on localhost”, which now conflicts with the script’s 127.0.0.1 usage and updated phase logs.

Suggested text-only tweak
-echo "    - Onboard: starts Ollama on localhost, starts auth proxy, pulls model, creates sandbox"
+echo "    - Onboard: starts Ollama on 127.0.0.1, starts auth proxy, pulls model, creates sandbox"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-gpu-e2e.sh` at line 553, Update the summary string emitted by
the script to reflect the actual address used; replace the phrase "starts Ollama
on localhost" in the echoed message (the echo call that prints "    - Onboard:
starts Ollama on localhost, starts auth proxy, pulls model, creates sandbox") so
it references "127.0.0.1" (or a neutral term like "loopback 127.0.0.1") to match
the script’s use of 127.0.0.1 and the updated phase logs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/e2e/test-gpu-e2e.sh`:
- Line 553: Update the summary string emitted by the script to reflect the
actual address used; replace the phrase "starts Ollama on localhost" in the
echoed message (the echo call that prints "    - Onboard: starts Ollama on
localhost, starts auth proxy, pulls model, creates sandbox") so it references
"127.0.0.1" (or a neutral term like "loopback 127.0.0.1") to match the script’s
use of 127.0.0.1 and the updated phase logs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 4c8fb16d-8c56-4dab-ae18-e06421ffbae4

📥 Commits

Reviewing files that changed from the base of the PR and between 384db10 and 8a1a039.

📒 Files selected for processing (4)
  • src/lib/nim.test.ts
  • src/lib/nim.ts
  • test/cli.test.ts
  • test/e2e/test-gpu-e2e.sh
✅ Files skipped from review due to trivial changes (2)
  • src/lib/nim.ts
  • test/cli.test.ts

@brandonpelfrey

Copy link
Copy Markdown
Collaborator

Automated PR review summary

Reviewed PR #1978: fix(onboard): use 127.0.0.1 instead of localhost for local inference detection

Recommendation

  • Recommendation: PASS
  • Highest observed severity: low
  • Block merge: no
  • Why: If this PR were wrong, IPv4-only local Ollama/vLLM setups could still be misdetected or users would see misleading localhost guidance. The focused probes I ran showed the changed helper and onboarding inference paths now target 127.0.0.1 as claimed, and I did not find a remaining localhost reference in the exact inference call sites modified by the PR.
  • Reviewer summary: Reviewed PR fix(onboard): use 127.0.0.1 instead of localhost for local inference detection #1978 with targeted adversarial probes against the installed build plus sandbox access verification. I found the claimed localhost→127.0.0.1 inference-path replacements present and behaving consistently in the changed helpers.

Installation and setup findings

  • Installation from the local checkout succeeded. I used the repo root install.sh with NEMOCLAW_REPO_ROOT pointing at /workspace/nemoclaw so the installer built and linked from source. The first onboarding attempt brought up the OpenShell gateway but failed at NVIDIA Endpoints validation; retrying/resuming with the build credential exported as NVIDIA_API_KEY succeeded, created sandbox 'my-assistant', and configured provider nvidia-prod with model nvidia/nemotron-3-super-120b-a12b. Verification succeeded: nemoclaw list/status showed the sandbox, SSH into the existing NemoClaw-managed sandbox returned '2+2=4', and an in-sandbox openclaw agent one-shot probe returned '4'.

What was validated

  • The PR revision was checked out in an isolated review environment.
  • The local checkout was installed using the repository installer flow as closely as the environment allowed.
  • Adversarial, PR-specific probes were then run against the installed environment and relevant repository context.
  • Diff summary:
 src/lib/local-inference.test.ts | 26 +++++++++++++-------------
 src/lib/local-inference.ts      | 22 +++++++++++-----------
 src/lib/onboard.ts              |  6 +++---
 test/onboard-selection.test.ts  | 40 ++++++++++++++++++++--------------------
 4 files changed, 47 insertions(+), 47 deletions(-)

Failing tests and unresolved impact

  • No failing adversarial tests were captured.

Passing tests and why they mattered

Passing test 1: Local provider validation now probes 127.0.0.1 for Ollama

  • What was tested: validateLocalProvider('ollama-local') now issues its host-side health probe to 127.0.0.1 and surfaces 127.0.0.1 in failure guidance.
  • Why it mattered: If false, IPv4-only local Ollama setups can still be misdetected or users get misleading localhost guidance.
  • Observed result: validateLocalProvider invoked curl -sf http://127.0.0.1:11434/api/tags and returned the expected container-reachability failure message referencing 'responding on 127.0.0.1'.
  • Command: node /tmp/pr1978-probe1.js
  • Recommended follow-up coverage: Keep/add regression tests asserting both the host probe command and the container-reachability failure message use 127.0.0.1 for ollama-local.

Passing test 2: Ollama model query, warmup, and probe helpers consistently use 127.0.0.1

  • What was tested: Secondary local-inference helper paths changed by this PR no longer emit localhost URLs.
  • Why it mattered: If false, onboarding might detect correctly but later warmup/probe/model-list actions could still fail on IPv6-preferring localhost systems.
  • Observed result: getOllamaModelOptions hit curl -sf http://127.0.0.1:11434/api/tags; getOllamaWarmupCommand and getOllamaProbeCommand emitted http://127.0.0.1:11434/api/generate.
  • Command: node /tmp/pr1978-probe2.js
  • Recommended follow-up coverage: Add/keep unit coverage for warmup and probe helper output strings, since helper-command regressions are easy to reintroduce during refactors.

Passing test 3: vLLM validation URLs and curl-based health probe use 127.0.0.1

  • What was tested: vLLM validation and probe detail generation now target 127.0.0.1:8000 rather than localhost.
  • Why it mattered: If false, vLLM onboarding/health checks still break on hosts where localhost resolves to ::1 first.
  • Observed result: Validation URL, health endpoint, injected curl probe args, and failure detail all referenced http://127.0.0.1:8000/v1 or /v1/models.
  • Command: node /tmp/pr1978-probe3.js
  • Recommended follow-up coverage: Keep as unit/regression coverage with assertions on both getter output and probe-detail text for vllm-local.

Passing test 4: Changed inference paths in onboard/local-inference contain the expected 127.0.0.1 replacements

  • What was tested: The PR actually replaced localhost with 127.0.0.1 in the inference-specific call sites it claims, without leaving one of the changed paths behind.
  • Why it mattered: If false, the PR description overstates the fix and one remaining localhost path could still fail in IPv6-first environments.
  • Observed result: The modified detection/query call sites in onboard.ts and local-inference.ts use 127.0.0.1; remaining localhost mentions in the excerpt were comments, dashboard scope, or user-facing labels outside the exact changed inference probes.
  • Command: grep -n "127.0.0.1\|localhost" src/lib/onboard.ts src/lib/local-inference.ts | sed -n '1,80p'
  • Recommended follow-up coverage: No extra grep-style automated test needed; targeted unit/regression tests around each helper/call site are the better guardrail.

Bottom line

  • Based on the install evidence and adversarial probes, this PR looks reasonable to approve.

@brandonpelfrey brandonpelfrey merged commit dbc805c into main Apr 16, 2026
14 checks passed
miyoungc added a commit that referenced this pull request Apr 17, 2026
Refresh user-facing docs against the 34 commits merged between v0.0.17
and v0.0.18. Highlights:

- Replace the Ollama 0.0.0.0 binding guidance with the new authenticated
  reverse proxy on 127.0.0.1:11435 (#1922).
- Document the compatible-endpoint provider defaulting to
  /v1/chat/completions and the NEMOCLAW_PREFERRED_API=openai-responses
  opt-in (#1984).
- Add the new nemoclaw upgrade-sandboxes command with --check, --auto,
  and --yes flags (#1943).
- Note the cross-sandbox messaging overlap warning and 409 detection in
  nemoclaw <name> status (#1953).
- Document the messaging-token rotation auto-rebuild flow (#1967).
- Cover new troubleshooting entries for the Ollama auth proxy, IPv6
  localhost resolution, orphan SSH port-forward cleanup on re-onboard,
  and rotated messaging credentials (#1978, #1950).
- Note tar failure exit code for nemoclaw debug --output (#1770) and the
  orphaned openshell process cleanup in nemoclaw uninstall (#1940).

Also:

- Extend docs/.docs-skip to exclude the experimental sandbox-mgmt
  shields and config commands (#1976).
- Fix a sphinx-autobuild infinite rebuild loop in docs/conf.py by
  writing docs/project.json only when its contents change.
- Bump the docs version switcher preferred entry to 0.0.18.
- Regenerate nemoclaw-user-* agent skills from docs/.

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Made-with: Cursor
miyoungc added a commit that referenced this pull request Apr 17, 2026
## Summary

Refresh user-facing documentation against the 34 commits merged between
v0.0.17 and v0.0.18, bump the docs version switcher to v0.0.18, and fix
a
`sphinx-autobuild` infinite-rebuild loop triggered by `docs/conf.py`.

## Changes

- **Ollama authenticated reverse proxy** (#1922): Replace the
`0.0.0.0:11434` guidance in `docs/inference/use-local-inference.md` with
the new token-gated proxy on `127.0.0.1:11435`, including persisted
token,
health-check exemption, and sandbox provider wiring. Replace the
matching
  troubleshooting entry in `docs/reference/troubleshooting.md`.
- **Compatible-endpoint default API path** (#1984): Document that the
compatible-endpoint provider now defaults to `/v1/chat/completions` and
  update `NEMOCLAW_PREFERRED_API` to describe `openai-responses` as the
  opt-in instead of `openai-completions`. Updates in
  `use-local-inference.md`, `switch-inference-providers.md`, and
  `troubleshooting.md`.
- **`nemoclaw upgrade-sandboxes` command** (#1943): Add a new reference
entry in `docs/reference/commands.md` covering `--check`, `--auto`, and
  `--yes` flags.
- **Messaging token rotation auto-rebuild** (#1967, #1953): Note the
  automatic rebuild behavior and cross-sandbox overlap warning in
  `docs/deployment/set-up-telegram-bridge.md`, `commands.md`, and
  `troubleshooting.md`.
- **Other troubleshooting additions**:
  - `localhost` → `127.0.0.1` IPv6 note (#1978)
  - Orphan SSH port-forward cleanup on re-onboard (#1950)
  - Orphan `openshell` process cleanup in `nemoclaw uninstall` (#1940)
  - Non-zero exit on tar failure in `nemoclaw debug --output` (#1770)
- **Skip list**: Extend `docs/.docs-skip` to exclude the experimental
  sandbox-mgmt shields and config commands feature (#1976), which was
  explicitly merged as not-yet-documented.
- **Build stability**: `docs/conf.py` now writes `docs/project.json`
only
when contents change, so `make docs-live` / `sphinx-autobuild` no longer
detects its own generated file as a source change and enters an infinite
  rebuild loop.
- **Version switcher**: Bump `docs/versions1.json` and
`docs/project.json`
preferred entry to v0.0.18 so this refresh renders under the new
version.
- **Agent skills**: Regenerate `nemoclaw-user-*` skills from `docs/`
with
  `scripts/docs-to-skills.py`.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes (ran via pre-commit hook on
staged files)
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure

- [x] AI-assisted — tool: Cursor

---

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added `nemoclaw upgrade-sandboxes` command to rebuild sandboxes when
base-image digests change.
* Introduced authenticated reverse proxy for local Ollama inference with
token-based access control.
* Automatic sandbox backup, recreation, and restore when messaging
credentials are updated.
* Cross-sandbox messaging token overlap detection with status warnings.

* **Improvements**
* Compatible-endpoint provider now defaults to `/v1/chat/completions`
API path.
* Enhanced troubleshooting documentation with new diagnostics sections.

* **Documentation**
  * Updated onboarding and configuration guides.
  * Expanded version documentation to 0.0.18.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
jyaunches pushed a commit to jyaunches/NemoClaw that referenced this pull request Apr 18, 2026
NVIDIA#1978)

The argv migration kept the old localhost URLs for Ollama and vLLM curl
probes, but main switched to 127.0.0.1 in commit dbc805c. The
onboard-selection tests mock 127.0.0.1, so the mismatch caused two
test failures.
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants