fix: add local-inference policy preset for Ollama/vLLM host access (Fixes #693) by ericksoa · Pull Request #2000 · NVIDIA/NemoClaw

ericksoa · 2026-04-17T03:49:05Z

Summary

New local-inference.yaml preset allows host.openshell.internal on ports 11434 (Ollama) and 8000 (vLLM) with binary restrictions (openclaw, claude).
Onboard auto-suggests this preset when a local provider (ollama-local, vllm-local) is selected.
Supersedes fix: add local-inference policy preset for Ollama/vLLM host access (#693) #781 with improvements listed below.

Improvements over #781

Shared constant: LOCAL_INFERENCE_PROVIDERS used in both getSuggestedPolicyPresets() and setupPoliciesWithSelection() — no duplicated arrays that can drift.
Removed nim-local: Not in the approved providers list in inference-config.ts.
Content-level preset tests: Assert actual hosts, ports, and binary restrictions (not just count/name).
Auto-suggestion tests: Verify local providers get the preset suggested, cloud providers don't.

Credit

Based on work by @cluster2600 in #781. The preset YAML and auto-suggest approach are theirs; this PR refactors and adds test coverage.

Test plan

Preset content tests (hosts, ports, binaries)
Auto-suggestion tests for ollama-local and vllm-local
Negative tests for cloud providers
All 200 existing policy + onboard tests pass
Manual: onboard with ollama-local, verify local-inference is pre-selected

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Added local-inference policy preset with network policies for HTTP(S) endpoints supporting GET and POST requests on ports 11434 and 8000.
- Onboarding flow now automatically suggests local-inference policy when using local model providers.

@cluster2600

…ixes #693) New `local-inference.yaml` preset allows `host.openshell.internal` on ports 11434 (Ollama) and 8000 (vLLM) with binary restrictions. Onboard auto-suggests this preset when a local provider is selected. Improvements over #781: - Shared LOCAL_INFERENCE_PROVIDERS constant (no duplicated arrays) - Removed `nim-local` (not in the approved providers list) - Content-level tests for preset hosts, ports, and binary restrictions - Auto-suggestion tests for local vs cloud providers Based on work by @cluster2600 in #781. Co-Authored-By: cluster2600 <cluster2600@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-17T03:49:20Z

📝 Walkthrough

Walkthrough

A new "local-inference" policy preset is introduced that permits controlled network access to local LLM inference services (Ollama on port 11434 and vLLM on port 8000) running on host.openshell.internal. The onboarding flow is extended to automatically suggest this preset when a local inference provider is selected.

Changes

Cohort / File(s)	Summary
New Policy Preset `nemoclaw-blueprint/policies/presets/local-inference.yaml`	Defines a new `local-inference` preset with network policy rules allowing `GET` and `POST` requests to Ollama (port 11434) and vLLM (port 8000) endpoints, and associates binaries `/usr/local/bin/openclaw` and `/usr/local/bin/claude`.
Onboarding Logic `src/lib/onboard.ts`	Adds `LOCAL_INFERENCE_PROVIDERS` constant and extends `getSuggestedPolicyPresets()` and `setupPoliciesWithSelection()` to accept an optional `provider` parameter; automatically includes `"local-inference"` preset when provider is a local inference type.
Test Coverage `test/onboard.test.ts`, `test/policies.test.ts`	Adds unit tests validating that `"local-inference"` preset is suggested for local providers (`"ollama-local"`, `"vllm-local"`) and excluded for cloud/non-local providers; updates preset inventory tests to verify the new preset exists with correct endpoints and binaries.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A whisker-twitch of joy I feel,
Local models now have seal!
Through preset paths and onboard flow,
Ollama and vLLM steal the show! 🚀
Endpoints guarded, binaries blessed,
Inference dreams put to the test!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: adding a new local-inference policy preset for Ollama/vLLM host access, which aligns with all four modified files' primary purpose.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/693-local-inference-preset

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

test/onboard.test.ts (1)

157-162: Add an explicit empty-string provider assertion.

Line 161 checks missing provider ({}), but not provider: "". Adding it would lock in the null/empty-input behavior explicitly.

♻️ Suggested test hardening

   it("does not suggest local-inference for cloud providers", () => {
     expect(getSuggestedPolicyPresets({ provider: "nvidia-prod" })).not.toContain("local-inference");
     expect(getSuggestedPolicyPresets({ provider: "openai-api" })).not.toContain("local-inference");
     expect(getSuggestedPolicyPresets({ provider: null })).not.toContain("local-inference");
+    expect(getSuggestedPolicyPresets({ provider: "" })).not.toContain("local-inference");
     expect(getSuggestedPolicyPresets({})).not.toContain("local-inference");
   });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 157 - 162, The test "does not suggest
local-inference for cloud providers" is missing an explicit assertion for an
empty-string provider; update the test containing getSuggestedPolicyPresets to
also assert that getSuggestedPolicyPresets({ provider: "" }) does not contain
"local-inference" so the behavior for empty-string inputs is locked in (add the
expect call alongside the existing expectations referencing
getSuggestedPolicyPresets).

test/policies.test.ts (1)

99-102: Avoid duplicating the preset count and name source of truth.

The explicit 12 can drift from the expected names array. You can derive count from expected.length in the same assertion block for easier maintenance.

Also applies to: 116-130
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/policies.test.ts` around lines 99 - 102, The test for
policies.listPresets() currently asserts a hardcoded count (12); replace the
magic number with a derived value from the expected names array (use
expected.length) so the length assertion and the name assertions use the same
single source of truth (referencing policies.listPresets() and the expected
array used later in the test).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/onboard.test.ts`:
- Around line 157-162: The test "does not suggest local-inference for cloud
providers" is missing an explicit assertion for an empty-string provider; update
the test containing getSuggestedPolicyPresets to also assert that
getSuggestedPolicyPresets({ provider: "" }) does not contain "local-inference"
so the behavior for empty-string inputs is locked in (add the expect call
alongside the existing expectations referencing getSuggestedPolicyPresets).

In `@test/policies.test.ts`:
- Around line 99-102: The test for policies.listPresets() currently asserts a
hardcoded count (12); replace the magic number with a derived value from the
expected names array (use expected.length) so the length assertion and the name
assertions use the same single source of truth (referencing
policies.listPresets() and the expected array used later in the test).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0ed0bf61-6f4a-4f61-842c-8f5a1e63930b

📥 Commits

Reviewing files that changed from the base of the PR and between c133bab and f7bbcbd.

📒 Files selected for processing (4)

nemoclaw-blueprint/policies/presets/local-inference.yaml
src/lib/onboard.ts
test/onboard.test.ts
test/policies.test.ts

## Summary Fixes the remaining open issue in #709 — local Ollama inference from inside sandbox containers returns HTTP 403/401 even with `local-inference` policy enabled. **Root cause:** PR #1922 introduced an authenticated reverse proxy on port **11435**, but PR #2000's `local-inference` policy preset only allows port **11434** (direct Ollama) and **8000** (vLLM). On non-WSL Linux systems, container traffic is routed to port 11435 (`src/lib/local-inference.ts:21`), which the policy blocks. **Changes:** - **`local-inference.yaml`**: Add port 11435 (auth proxy) endpoint so containers can reach the proxy on non-WSL systems - **`onboard.ts`**: Upgrade proxy startup failure from a soft warning to a hard error with actionable diagnostics — prevents onboarding from completing with a broken provider config - **`policies.test.ts`**: Assert port 11435 is present in the preset to prevent regression ## Test plan - [x] All 79 policy tests pass (including updated port assertion) - [x] All 34 local-inference unit tests pass - [x] 125/129 onboard tests pass (4 pre-existing timeout failures in unrelated `--from` test) - [x] Ollama proxy recovery tests pass - [ ] Manual: onboard with Local Ollama on Linux Docker CE, verify inference works through proxy  ## Summary by CodeRabbit * **New Features** * Added support for an additional local inference network endpoint (host internal, port 11435). * **Bug Fixes** * Onboarding now fails gracefully when the local auth proxy cannot start and logs actionable diagnostic hints. * **Tests** * Updated tests and mocks to validate the new endpoint and the revised onboarding behavior.

## Summary Updates the shared NemoClaw maintainer policy references used by triage agents and downstream consumers. This PR keeps the policy package as the source of truth and tightens the taxonomy/routing guidance needed for the nvoss-velocity shared-policy migration: - clarifies that normal initial triage should not newly add `needs: triage` - narrows `needs:*` guidance to blocking action queues, especially `needs: info` and `needs: design` - adds PR type evidence from conventional commit prefixes - tightens specific integration routing so named Hermes/OpenClaw/etc. evidence gets the matching `integration:*` label - clarifies `area: ci` versus `area: e2e` so CI is not added just because an e2e failure was observed in CI - improves Windows ARM / WSL platform routing signals - adds concrete shared-policy examples for named integrations, OpenClaw e2e failures, and Windows ARM install failures No application code changes are included here. ## Evaluation This policy version was evaluated in the nvoss-velocity migration experiment using a deterministic 100-item NemoClaw fixture: 50 issues and 50 PRs, all numbered `#2000+`. The evaluation compares recommendations after translating legacy labels into the new shared-policy concepts, rather than raw old-label equality. Baseline, current Llama with this v2 policy: - Label F1: `0.485` - Label precision: `0.402` - Label recall: `0.611` - Issue Type match: `0.976` - Project field match: `1.000` - Parse failures: `0` Best Nemotron Ultra Preview candidate used the same policy files: - Label F1: `0.493` - Label precision: `0.392` - Label recall: `0.663` - Issue Type match: `0.976` - Project field match: `1.000` - Parse failures: `0` The Ultra result did not require separate policy-file changes; it used the same v2 shared policy plus model-specific runtime controls in nvoss-velocity. ## Validation - Parsed `.agents/skills/nemoclaw-maintainer-policies/references/label-taxonomy.json` with `python3 -m json.tool` - Ran `git diff --check` - Confirmed this branch starts from current `origin/main` - Confirmed only the shared policy reference files are modified ## Notes This is the policy-only first step. A follow-up nvoss-velocity PR will consume this shared policy package and switch normal triage runtime behavior to the selected model profile.  ## Summary by CodeRabbit * **Documentation** * Refined issue triage and labeling guidance with clarified rules for label taxonomy, improved concrete examples, and tightened requirements for maintainer workflows to ensure better issue categorization and reduce placeholder labeling.

ericksoa self-assigned this Apr 17, 2026

coderabbitai Bot reviewed Apr 17, 2026

View reviewed changes

cv approved these changes Apr 17, 2026

View reviewed changes

ericksoa merged commit c7e7527 into main Apr 17, 2026
15 checks passed

ericksoa mentioned this pull request Apr 17, 2026

fix: add local-inference policy preset for Ollama/vLLM host access (#693) #781

Closed

2 tasks

JoyceChenNV mentioned this pull request Apr 20, 2026

[Ubuntu + Docker CE] Linux onboarding for Ollama does not explain required container-reachable bind address #709

Closed

2 tasks

ericksoa mentioned this pull request Apr 20, 2026

fix: add auth proxy port to local-inference policy (#709) #2114

Merged

5 tasks

This was referenced Apr 21, 2026

fix: macOS local inference DNS resolution + oMLX provider #464

Closed

fix: resolve inference.local DNS inside sandbox on macOS #471

Closed

cv mentioned this pull request May 14, 2026

[macOS][Ollama] Local Ollama inference hangs through NemoClaw/OpenShell routing; sandbox readiness/status becomes unstable #1410

Closed

github-actions Bot mentioned this pull request Jun 8, 2026

docs(policy): refine shared triage taxonomy #4935

Merged

wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: add local-inference policy preset for Ollama/vLLM host access (Fixes #693)#2000

fix: add local-inference policy preset for Ollama/vLLM host access (Fixes #693)#2000
ericksoa merged 1 commit into
mainfrom
fix/693-local-inference-preset

ericksoa commented Apr 17, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 17, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ericksoa commented Apr 17, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Improvements over #781

Credit

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ericksoa commented Apr 17, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 17, 2026 •

edited

Loading