Skip to content

fix: add local-inference policy preset for Ollama/vLLM host access (Fixes #693)#2000

Merged
ericksoa merged 1 commit into
mainfrom
fix/693-local-inference-preset
Apr 17, 2026
Merged

fix: add local-inference policy preset for Ollama/vLLM host access (Fixes #693)#2000
ericksoa merged 1 commit into
mainfrom
fix/693-local-inference-preset

Conversation

@ericksoa

@ericksoa ericksoa commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

Summary

Improvements over #781

  • Shared constant: LOCAL_INFERENCE_PROVIDERS used in both getSuggestedPolicyPresets() and setupPoliciesWithSelection() — no duplicated arrays that can drift.
  • Removed nim-local: Not in the approved providers list in inference-config.ts.
  • Content-level preset tests: Assert actual hosts, ports, and binary restrictions (not just count/name).
  • Auto-suggestion tests: Verify local providers get the preset suggested, cloud providers don't.

Credit

Based on work by @cluster2600 in #781. The preset YAML and auto-suggest approach are theirs; this PR refactors and adds test coverage.

Test plan

  • Preset content tests (hosts, ports, binaries)
  • Auto-suggestion tests for ollama-local and vllm-local
  • Negative tests for cloud providers
  • All 200 existing policy + onboard tests pass
  • Manual: onboard with ollama-local, verify local-inference is pre-selected

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features
    • Added local-inference policy preset with network policies for HTTP(S) endpoints supporting GET and POST requests on ports 11434 and 8000.
    • Onboarding flow now automatically suggests local-inference policy when using local model providers.

…ixes #693)

New `local-inference.yaml` preset allows `host.openshell.internal` on
ports 11434 (Ollama) and 8000 (vLLM) with binary restrictions. Onboard
auto-suggests this preset when a local provider is selected.

Improvements over #781:
- Shared LOCAL_INFERENCE_PROVIDERS constant (no duplicated arrays)
- Removed `nim-local` (not in the approved providers list)
- Content-level tests for preset hosts, ports, and binary restrictions
- Auto-suggestion tests for local vs cloud providers

Based on work by @cluster2600 in #781.

Co-Authored-By: cluster2600 <cluster2600@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 17, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

A new "local-inference" policy preset is introduced that permits controlled network access to local LLM inference services (Ollama on port 11434 and vLLM on port 8000) running on host.openshell.internal. The onboarding flow is extended to automatically suggest this preset when a local inference provider is selected.

Changes

Cohort / File(s) Summary
New Policy Preset
nemoclaw-blueprint/policies/presets/local-inference.yaml
Defines a new local-inference preset with network policy rules allowing GET and POST requests to Ollama (port 11434) and vLLM (port 8000) endpoints, and associates binaries /usr/local/bin/openclaw and /usr/local/bin/claude.
Onboarding Logic
src/lib/onboard.ts
Adds LOCAL_INFERENCE_PROVIDERS constant and extends getSuggestedPolicyPresets() and setupPoliciesWithSelection() to accept an optional provider parameter; automatically includes "local-inference" preset when provider is a local inference type.
Test Coverage
test/onboard.test.ts, test/policies.test.ts
Adds unit tests validating that "local-inference" preset is suggested for local providers ("ollama-local", "vllm-local") and excluded for cloud/non-local providers; updates preset inventory tests to verify the new preset exists with correct endpoints and binaries.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A whisker-twitch of joy I feel,
Local models now have seal!
Through preset paths and onboard flow,
Ollama and vLLM steal the show! 🚀
Endpoints guarded, binaries blessed,
Inference dreams put to the test!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding a new local-inference policy preset for Ollama/vLLM host access, which aligns with all four modified files' primary purpose.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/693-local-inference-preset

Comment @coderabbitai help to get the list of available commands and usage tips.

@ericksoa ericksoa self-assigned this Apr 17, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
test/onboard.test.ts (1)

157-162: Add an explicit empty-string provider assertion.

Line 161 checks missing provider ({}), but not provider: "". Adding it would lock in the null/empty-input behavior explicitly.

♻️ Suggested test hardening
   it("does not suggest local-inference for cloud providers", () => {
     expect(getSuggestedPolicyPresets({ provider: "nvidia-prod" })).not.toContain("local-inference");
     expect(getSuggestedPolicyPresets({ provider: "openai-api" })).not.toContain("local-inference");
     expect(getSuggestedPolicyPresets({ provider: null })).not.toContain("local-inference");
+    expect(getSuggestedPolicyPresets({ provider: "" })).not.toContain("local-inference");
     expect(getSuggestedPolicyPresets({})).not.toContain("local-inference");
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/onboard.test.ts` around lines 157 - 162, The test "does not suggest
local-inference for cloud providers" is missing an explicit assertion for an
empty-string provider; update the test containing getSuggestedPolicyPresets to
also assert that getSuggestedPolicyPresets({ provider: "" }) does not contain
"local-inference" so the behavior for empty-string inputs is locked in (add the
expect call alongside the existing expectations referencing
getSuggestedPolicyPresets).
test/policies.test.ts (1)

99-102: Avoid duplicating the preset count and name source of truth.

The explicit 12 can drift from the expected names array. You can derive count from expected.length in the same assertion block for easier maintenance.

Also applies to: 116-130

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/policies.test.ts` around lines 99 - 102, The test for
policies.listPresets() currently asserts a hardcoded count (12); replace the
magic number with a derived value from the expected names array (use
expected.length) so the length assertion and the name assertions use the same
single source of truth (referencing policies.listPresets() and the expected
array used later in the test).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/onboard.test.ts`:
- Around line 157-162: The test "does not suggest local-inference for cloud
providers" is missing an explicit assertion for an empty-string provider; update
the test containing getSuggestedPolicyPresets to also assert that
getSuggestedPolicyPresets({ provider: "" }) does not contain "local-inference"
so the behavior for empty-string inputs is locked in (add the expect call
alongside the existing expectations referencing getSuggestedPolicyPresets).

In `@test/policies.test.ts`:
- Around line 99-102: The test for policies.listPresets() currently asserts a
hardcoded count (12); replace the magic number with a derived value from the
expected names array (use expected.length) so the length assertion and the name
assertions use the same single source of truth (referencing
policies.listPresets() and the expected array used later in the test).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0ed0bf61-6f4a-4f61-842c-8f5a1e63930b

📥 Commits

Reviewing files that changed from the base of the PR and between c133bab and f7bbcbd.

📒 Files selected for processing (4)
  • nemoclaw-blueprint/policies/presets/local-inference.yaml
  • src/lib/onboard.ts
  • test/onboard.test.ts
  • test/policies.test.ts

@ericksoa ericksoa merged commit c7e7527 into main Apr 17, 2026
15 checks passed
ericksoa added a commit that referenced this pull request Apr 21, 2026
## Summary

Fixes the remaining open issue in #709 — local Ollama inference from
inside sandbox containers returns HTTP 403/401 even with
`local-inference` policy enabled.

**Root cause:** PR #1922 introduced an authenticated reverse proxy on
port **11435**, but PR #2000's `local-inference` policy preset only
allows port **11434** (direct Ollama) and **8000** (vLLM). On non-WSL
Linux systems, container traffic is routed to port 11435
(`src/lib/local-inference.ts:21`), which the policy blocks.

**Changes:**
- **`local-inference.yaml`**: Add port 11435 (auth proxy) endpoint so
containers can reach the proxy on non-WSL systems
- **`onboard.ts`**: Upgrade proxy startup failure from a soft warning to
a hard error with actionable diagnostics — prevents onboarding from
completing with a broken provider config
- **`policies.test.ts`**: Assert port 11435 is present in the preset to
prevent regression

## Test plan

- [x] All 79 policy tests pass (including updated port assertion)
- [x] All 34 local-inference unit tests pass
- [x] 125/129 onboard tests pass (4 pre-existing timeout failures in
unrelated `--from` test)
- [x] Ollama proxy recovery tests pass
- [ ] Manual: onboard with Local Ollama on Linux Docker CE, verify
inference works through proxy

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added support for an additional local inference network endpoint (host
internal, port 11435).

* **Bug Fixes**
* Onboarding now fails gracefully when the local auth proxy cannot start
and logs actionable diagnostic hints.

* **Tests**
* Updated tests and mocks to validate the new endpoint and the revised
onboarding behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
cv pushed a commit that referenced this pull request Jun 8, 2026
## Summary

Updates the shared NemoClaw maintainer policy references used by triage
agents and downstream consumers.

This PR keeps the policy package as the source of truth and tightens the
taxonomy/routing guidance needed for the nvoss-velocity shared-policy
migration:

- clarifies that normal initial triage should not newly add `needs:
triage`
- narrows `needs:*` guidance to blocking action queues, especially
`needs: info` and `needs: design`
- adds PR type evidence from conventional commit prefixes
- tightens specific integration routing so named Hermes/OpenClaw/etc.
evidence gets the matching `integration:*` label
- clarifies `area: ci` versus `area: e2e` so CI is not added just
because an e2e failure was observed in CI
- improves Windows ARM / WSL platform routing signals
- adds concrete shared-policy examples for named integrations, OpenClaw
e2e failures, and Windows ARM install failures

No application code changes are included here.

## Evaluation

This policy version was evaluated in the nvoss-velocity migration
experiment using a deterministic 100-item NemoClaw fixture: 50 issues
and 50 PRs, all numbered `#2000+`.

The evaluation compares recommendations after translating legacy labels
into the new shared-policy concepts, rather than raw old-label equality.

Baseline, current Llama with this v2 policy:

- Label F1: `0.485`
- Label precision: `0.402`
- Label recall: `0.611`
- Issue Type match: `0.976`
- Project field match: `1.000`
- Parse failures: `0`

Best Nemotron Ultra Preview candidate used the same policy files:

- Label F1: `0.493`
- Label precision: `0.392`
- Label recall: `0.663`
- Issue Type match: `0.976`
- Project field match: `1.000`
- Parse failures: `0`

The Ultra result did not require separate policy-file changes; it used
the same v2 shared policy plus model-specific runtime controls in
nvoss-velocity.

## Validation

- Parsed
`.agents/skills/nemoclaw-maintainer-policies/references/label-taxonomy.json`
with `python3 -m json.tool`
- Ran `git diff --check`
- Confirmed this branch starts from current `origin/main`
- Confirmed only the shared policy reference files are modified

## Notes

This is the policy-only first step. A follow-up nvoss-velocity PR will
consume this shared policy package and switch normal triage runtime
behavior to the selected model profile.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Refined issue triage and labeling guidance with clarified rules for
label taxonomy, improved concrete examples, and tightened requirements
for maintainer workflows to ensure better issue categorization and
reduce placeholder labeling.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants