Skip to content

fix(onboard): use 127.0.0.1 instead of localhost for local inference …#1716

Closed
chandler-barlow wants to merge 7 commits into
mainfrom
fix/h6-vllm-localhost-bind
Closed

fix(onboard): use 127.0.0.1 instead of localhost for local inference …#1716
chandler-barlow wants to merge 7 commits into
mainfrom
fix/h6-vllm-localhost-bind

Conversation

@chandler-barlow

@chandler-barlow chandler-barlow commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

Rationale

*Addressing review and resolved conflicts to close this pr

Using localhost can sometimes resolve to IPv6 addresses or behave inconsistently across different environments and network configurations.

Changes

Switched to the explicit IPv4 loopback address 127.0.0.1 for local inference service detection to improve reliability.

Verification Results

  • Automated Tests: Passed all 52 core tests via npm test.
  • Manual Audit: Verified connection reliability with 127.0.0.1.
  • Security Review: Verified no sensitive data leaks and correct permission enforcement.

Leading Standards

This PR follows the project's 'First Principles' approach, prioritizing deterministic behavior and zero-trust security defaults.

Summary by CodeRabbit

  • Bug Fixes

    • Improved local AI provider (Ollama, vLLM, NVIDIA) detection reliability by standardizing loopback addressing to 127.0.0.1 and updating related health-check messaging.
  • Refactor

    • Consolidated and centralized local provider health-check and probe logic for more consistent detection, onboarding, and user-facing diagnostics.

Signed-off-by: Chandler Barlow cbarlow@nvidia.com

@coderabbitai

coderabbitai Bot commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 811c509a-63cd-4106-a5ff-8e9e6b0946cd

📥 Commits

Reviewing files that changed from the base of the PR and between d3ace3c and e6d5be1.

📒 Files selected for processing (3)
  • src/lib/local-inference.test.ts
  • src/lib/local-inference.ts
  • test/onboard-selection.test.ts
✅ Files skipped from review due to trivial changes (3)
  • src/lib/local-inference.ts
  • src/lib/local-inference.test.ts
  • test/onboard-selection.test.ts

📝 Walkthrough

Walkthrough

Replaced localhost with loopback 127.0.0.1 for Ollama and vLLM local endpoints and test expectations. Also refactored onboarding to use getLocalProviderHealthCheck instead of inline curl commands for local provider detection.

Changes

Cohort / File(s) Summary
Local provider networking
src/lib/local-inference.ts, src/lib/local-inference.test.ts
Swapped http://localhosthttp://127.0.0.1 for vLLM and Ollama base URLs, health endpoints, validation messages, and generated curl/URL strings used in probes and diagnostics.
Onboarding setup
src/lib/onboard.ts
Imported getLocalProviderHealthCheck and replaced direct curl calls with runCapture(getLocalProviderHealthCheck(...), { ignoreError: true }) for Ollama and vLLM health/model probing.
Onboarding tests
test/onboard-selection.test.ts
Updated test expectations and stubs to match 127.0.0.1 URLs (Ollama and vLLM) and renamed one test description to reference 127.0.0.1.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I sniffed the network, small and spry,

localhost hopped — I waved goodbye.
127 curled in a cozy loop,
Health checks hum, a tidy troupe.
Hooray, the burrow’s checks are neat! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: switching from localhost to 127.0.0.1 for local inference services across multiple files and test cases.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/h6-vllm-localhost-bind

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/onboard-selection.test.ts (1)

680-683: ⚠️ Potential issue | 🟡 Minor

Update stale localhost mock literals in this test case.

Line 680 and Line 682 still match localhost, but the local inference commands now target 127.0.0.1. This can miss runCapture matches and make this test path less reliable.

✅ Suggested patch
-  if (command.includes("localhost:11434/api/tags")) return JSON.stringify({ models: [{ name: "nemotron-3-nano:30b" }] });
+  if (command.includes("127.0.0.1:11434/api/tags")) return JSON.stringify({ models: [{ name: "nemotron-3-nano:30b" }] });
   if (command.includes("ollama list")) return "nemotron-3-nano:30b  abc  24 GB  now";
-  if (command.includes("localhost:8000/v1/models")) return "";
+  if (command.includes("127.0.0.1:8000/v1/models")) return "";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard-selection.test.ts` around lines 680 - 683, Update the two stale
localhost mocks so the test matches the current local-inference addresses:
change the checks that call command.includes("localhost:11434/api/tags") and
command.includes("localhost:8000/v1/models") to recognize 127.0.0.1 (either by
replacing the literal or adding an OR check to also match
"127.0.0.1:11434/api/tags" and "127.0.0.1:8000/v1/models"); this ensures
runCapture-based matching (and the if branches that return model/tag mocks)
still fire for local inference.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@test/onboard-selection.test.ts`:
- Around line 680-683: Update the two stale localhost mocks so the test matches
the current local-inference addresses: change the checks that call
command.includes("localhost:11434/api/tags") and
command.includes("localhost:8000/v1/models") to recognize 127.0.0.1 (either by
replacing the literal or adding an OR check to also match
"127.0.0.1:11434/api/tags" and "127.0.0.1:8000/v1/models"); this ensures
runCapture-based matching (and the if branches that return model/tag mocks)
still fire for local inference.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 62695172-fe3d-4aee-82a6-58cd7fdf93a0

📥 Commits

Reviewing files that changed from the base of the PR and between 9d3f9cc and 2460a02.

📒 Files selected for processing (4)
  • bin/lib/onboard.js
  • src/lib/local-inference.test.ts
  • src/lib/local-inference.ts
  • test/onboard-selection.test.ts
💤 Files with no reviewable changes (1)
  • bin/lib/onboard.js

@chandler-barlow chandler-barlow requested a review from cv April 10, 2026 00:18
@chandler-barlow chandler-barlow self-assigned this Apr 10, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/lib/onboard.ts (1)

2644-2649: Add a defensive null-guard before runCapture on helper output.

getLocalProviderHealthCheck(...) can return null; guarding it avoids fragile runtime behavior if provider keys evolve later.

♻️ Proposed hardening patch
-  const ollamaRunning = !!runCapture(getLocalProviderHealthCheck("ollama-local"), {
-    ignoreError: true,
-  });
-  const vllmRunning = !!runCapture(getLocalProviderHealthCheck("vllm-local"), {
-    ignoreError: true,
-  });
+  const ollamaHealthCheck = getLocalProviderHealthCheck("ollama-local");
+  const vllmHealthCheck = getLocalProviderHealthCheck("vllm-local");
+  const ollamaRunning = !!(
+    ollamaHealthCheck &&
+    runCapture(ollamaHealthCheck, {
+      ignoreError: true,
+    })
+  );
+  const vllmRunning = !!(
+    vllmHealthCheck &&
+    runCapture(vllmHealthCheck, {
+      ignoreError: true,
+    })
+  );
@@
-        const vllmModelsRaw = runCapture(getLocalProviderHealthCheck("vllm-local"), {
-          ignoreError: true,
-        });
+        const vllmHealthCheck = getLocalProviderHealthCheck("vllm-local");
+        if (!vllmHealthCheck) {
+          console.error("  Local vLLM health check command is unavailable.");
+          process.exit(1);
+        }
+        const vllmModelsRaw = runCapture(vllmHealthCheck, {
+          ignoreError: true,
+        });

Also applies to: 3173-3175

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 2644 - 2649, The code calls
runCapture(getLocalProviderHealthCheck("ollama-local"), ...) and
runCapture(getLocalProviderHealthCheck("vllm-local"), ...) without guarding that
getLocalProviderHealthCheck(...) may return null; add a defensive null-check so
you only call runCapture when getLocalProviderHealthCheck(...) returns a
non-null value (e.g., compute const ollamaCheck =
getLocalProviderHealthCheck("ollama-local") and only call
runCapture(ollamaCheck, ...) if ollamaCheck is not null, similarly for
vllmCheck), and apply the same null-guard pattern to the other occurrence around
lines 3173-3175.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 2644-2649: The code calls
runCapture(getLocalProviderHealthCheck("ollama-local"), ...) and
runCapture(getLocalProviderHealthCheck("vllm-local"), ...) without guarding that
getLocalProviderHealthCheck(...) may return null; add a defensive null-check so
you only call runCapture when getLocalProviderHealthCheck(...) returns a
non-null value (e.g., compute const ollamaCheck =
getLocalProviderHealthCheck("ollama-local") and only call
runCapture(ollamaCheck, ...) if ollamaCheck is not null, similarly for
vllmCheck), and apply the same null-guard pattern to the other occurrence around
lines 3173-3175.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7cea2778-203f-4010-a89b-58dfdb58bfb0

📥 Commits

Reviewing files that changed from the base of the PR and between 6f18f8c and d2b1640.

📒 Files selected for processing (2)
  • src/lib/onboard.ts
  • src/lib/preflight.test.ts

@wscurran wscurran added bug Something fails against expected or documented behavior fix labels Apr 10, 2026
@wscurran

Copy link
Copy Markdown
Contributor

✨ Thanks for submitting this PR, which proposes a fix for a bug with local inference service detection and may improve the overall reliability of the service .

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/local-inference.ts (1)

162-208: ⚠️ Potential issue | 🟠 Major

Probe target and failure messages are inconsistent.

validateLocalProvider() (Line 162) still executes a health check built from getLocalProviderHealthEndpoint(), which currently returns localhost (Line 68, Line 70), but the new messages on Line 173, Line 179, Line 201, and Line 207 claim checks against 127.0.0.1. This can produce false negatives and misleading remediation text.

🔧 Proposed fix
 export function getLocalProviderHealthEndpoint(provider: string): string | null {
   switch (provider) {
     case "vllm-local":
-      return "http://localhost:8000/v1/models";
+      return "http://127.0.0.1:8000/v1/models";
     case "ollama-local":
-      return "http://localhost:11434/api/tags";
+      return "http://127.0.0.1:11434/api/tags";
     default:
       return null;
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/local-inference.ts` around lines 162 - 208, The health-check messages
in validateLocalProvider() are inconsistent with the actual probe target
returned by getLocalProviderHealthEndpoint(); update validateLocalProvider() to
derive the host/URL from getLocalProviderHealthEndpoint(provider) (or from
getLocalProviderHealthCheck(provider)'s command) and interpolate that value into
the failure messages for vllm-local and ollama-local instead of hardcoding
"127.0.0.1" (or alternatively change getLocalProviderHealthEndpoint() to return
127.0.0.1 if you intend to probe that address); adjust the
container-reachability messages similarly so they reference the same computed
host/URL used by getLocalProviderContainerReachabilityCheck(provider) and
runCapture.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/lib/local-inference.ts`:
- Around line 162-208: The health-check messages in validateLocalProvider() are
inconsistent with the actual probe target returned by
getLocalProviderHealthEndpoint(); update validateLocalProvider() to derive the
host/URL from getLocalProviderHealthEndpoint(provider) (or from
getLocalProviderHealthCheck(provider)'s command) and interpolate that value into
the failure messages for vllm-local and ollama-local instead of hardcoding
"127.0.0.1" (or alternatively change getLocalProviderHealthEndpoint() to return
127.0.0.1 if you intend to probe that address); adjust the
container-reachability messages similarly so they reference the same computed
host/URL used by getLocalProviderContainerReachabilityCheck(provider) and
runCapture.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8899ac8e-04c9-4cef-aa09-d5fa939b53b1

📥 Commits

Reviewing files that changed from the base of the PR and between d2b1640 and d3ace3c.

📒 Files selected for processing (4)
  • src/lib/local-inference.test.ts
  • src/lib/local-inference.ts
  • src/lib/onboard.ts
  • test/onboard-selection.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/onboard-selection.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/lib/local-inference.test.ts
  • src/lib/onboard.ts

@ahunnargikar-nvidia ahunnargikar-nvidia added platform: ubuntu Affects Ubuntu Linux environments security Potential vulnerability, unsafe behavior, or access risk labels Apr 14, 2026
@wscurran

Copy link
Copy Markdown
Contributor

Thanks for this fix — binding local inference to 127.0.0.1 instead of localhost is a meaningful correctness improvement since localhost resolution may hit IPv6 on some Ubuntu configurations and cause connection failures. The codebase has changed significantly since this was opened, including a TypeScript migration (#1673) that restructured the onboarding and inference routing layers. Could you rebase onto the current main?

prekshivyas added a commit that referenced this pull request Apr 16, 2026
…detection

Using `localhost` can resolve to IPv6 (::1) on some systems, causing
health checks and probe commands to fail. Replace all host-side
localhost references with explicit 127.0.0.1 for deterministic IPv4
behavior.

Changes:
- getLocalProviderValidationBaseUrl: localhost → 127.0.0.1
- getLocalProviderHealthEndpoint: localhost → 127.0.0.1
- validateLocalProvider error messages: localhost → 127.0.0.1
- getOllamaModelOptions curl: localhost → 127.0.0.1
- getOllamaWarmupCommand: localhost → 127.0.0.1
- getOllamaProbeCommand: localhost → 127.0.0.1
- onboard.ts detection curls: localhost → 127.0.0.1
- All test expectations updated to match

Based on the original work in #1716 by @chandler-barlow, re-applied on
the current TypeScript codebase after the JS→TS migration made the
original PR unrebaseable.

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Chandler Barlow <cbarlow@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@prekshivyas

Copy link
Copy Markdown
Contributor

Thanks for this fix @chandler-barlow — the localhost127.0.0.1 change is the right call for IPv6 consistency.

The codebase has changed significantly since this was opened (JS→TS migration in #1673, auth proxy in #1922), making a rebase impractical. We've re-applied your change on current main in #1978 with full attribution (Co-Authored-By: Chandler Barlow <cbarlow@nvidia.com>). The new PR covers the same files and the same localhost127.0.0.1 replacements, updated to match the current TypeScript codebase.

Closing this one in favor of #1978. Appreciate the contribution!

brandonpelfrey pushed a commit that referenced this pull request Apr 16, 2026
…detection (#1978)

## Summary

Supersedes #1716.

Using `localhost` can resolve to IPv6 (`::1`) on some systems, causing
local inference health checks, probe commands, and model queries to fail
silently. This replaces all host-side `localhost` references with
explicit `127.0.0.1` for deterministic IPv4 behavior.

Based on the original work in #1716 by @chandler-barlow — the JS→TS
migration (#1673) made the original PR unrebaseable, so the change has
been re-applied on the current codebase with attribution.

## Changes

- `src/lib/local-inference.ts`: validation URLs, health endpoints, error
messages, model queries, warmup/probe commands
- `src/lib/nim.ts`: NIM health check curls
- `src/lib/onboard.ts`: Ollama and vLLM detection curls
- `src/lib/local-inference.test.ts`: updated expectations
- `src/lib/nim.test.ts`: updated mock matchers
- `test/cli.test.ts`: updated status output assertion
- `test/onboard-selection.test.ts`: updated mock matchers
- `test/e2e/test-gpu-e2e.sh`: host-side Ollama curls

## Test plan

- [x] 50 unit tests passing (local-inference + nim)
- [x] Zero `http://localhost:11434` or `http://localhost:8000`
references remain in src/ test/ scripts/
- [x] Dashboard URLs intentionally unchanged (different scope)

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved reliability of local inference and onboarding checks by
standardizing local service addresses to the loopback IP.

* **Tests**
* Updated unit, integration, and end-to-end test expectations and
scripts to match the adjusted loopback addressing and health-check
behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-authored-by: Chandler Barlow <cbarlow@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wscurran wscurran added the area: local-models Local model providers, downloads, launch, or connectivity label Jun 3, 2026
@wscurran wscurran added area: providers Inference provider integrations and provider behavior bug-fix PR fixes a bug or regression needs: rebase PR needs rebase or conflict resolution and removed Local Models bug Something fails against expected or documented behavior labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: local-models Local model providers, downloads, launch, or connectivity area: providers Inference provider integrations and provider behavior bug-fix PR fixes a bug or regression needs: rebase PR needs rebase or conflict resolution platform: ubuntu Affects Ubuntu Linux environments security Potential vulnerability, unsafe behavior, or access risk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants