fix: add local-inference policy preset for Ollama/vLLM host access (Fixes #693)#2000
Conversation
…ixes #693) New `local-inference.yaml` preset allows `host.openshell.internal` on ports 11434 (Ollama) and 8000 (vLLM) with binary restrictions. Onboard auto-suggests this preset when a local provider is selected. Improvements over #781: - Shared LOCAL_INFERENCE_PROVIDERS constant (no duplicated arrays) - Removed `nim-local` (not in the approved providers list) - Content-level tests for preset hosts, ports, and binary restrictions - Auto-suggestion tests for local vs cloud providers Based on work by @cluster2600 in #781. Co-Authored-By: cluster2600 <cluster2600@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughA new "local-inference" policy preset is introduced that permits controlled network access to local LLM inference services (Ollama on port 11434 and vLLM on port 8000) running on Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/onboard.test.ts (1)
157-162: Add an explicit empty-string provider assertion.Line 161 checks missing provider (
{}), but notprovider: "". Adding it would lock in the null/empty-input behavior explicitly.♻️ Suggested test hardening
it("does not suggest local-inference for cloud providers", () => { expect(getSuggestedPolicyPresets({ provider: "nvidia-prod" })).not.toContain("local-inference"); expect(getSuggestedPolicyPresets({ provider: "openai-api" })).not.toContain("local-inference"); expect(getSuggestedPolicyPresets({ provider: null })).not.toContain("local-inference"); + expect(getSuggestedPolicyPresets({ provider: "" })).not.toContain("local-inference"); expect(getSuggestedPolicyPresets({})).not.toContain("local-inference"); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.ts` around lines 157 - 162, The test "does not suggest local-inference for cloud providers" is missing an explicit assertion for an empty-string provider; update the test containing getSuggestedPolicyPresets to also assert that getSuggestedPolicyPresets({ provider: "" }) does not contain "local-inference" so the behavior for empty-string inputs is locked in (add the expect call alongside the existing expectations referencing getSuggestedPolicyPresets).test/policies.test.ts (1)
99-102: Avoid duplicating the preset count and name source of truth.The explicit
12can drift from the expected names array. You can derive count fromexpected.lengthin the same assertion block for easier maintenance.Also applies to: 116-130
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/policies.test.ts` around lines 99 - 102, The test for policies.listPresets() currently asserts a hardcoded count (12); replace the magic number with a derived value from the expected names array (use expected.length) so the length assertion and the name assertions use the same single source of truth (referencing policies.listPresets() and the expected array used later in the test).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/onboard.test.ts`:
- Around line 157-162: The test "does not suggest local-inference for cloud
providers" is missing an explicit assertion for an empty-string provider; update
the test containing getSuggestedPolicyPresets to also assert that
getSuggestedPolicyPresets({ provider: "" }) does not contain "local-inference"
so the behavior for empty-string inputs is locked in (add the expect call
alongside the existing expectations referencing getSuggestedPolicyPresets).
In `@test/policies.test.ts`:
- Around line 99-102: The test for policies.listPresets() currently asserts a
hardcoded count (12); replace the magic number with a derived value from the
expected names array (use expected.length) so the length assertion and the name
assertions use the same single source of truth (referencing
policies.listPresets() and the expected array used later in the test).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 0ed0bf61-6f4a-4f61-842c-8f5a1e63930b
📒 Files selected for processing (4)
nemoclaw-blueprint/policies/presets/local-inference.yamlsrc/lib/onboard.tstest/onboard.test.tstest/policies.test.ts
## Summary Fixes the remaining open issue in #709 — local Ollama inference from inside sandbox containers returns HTTP 403/401 even with `local-inference` policy enabled. **Root cause:** PR #1922 introduced an authenticated reverse proxy on port **11435**, but PR #2000's `local-inference` policy preset only allows port **11434** (direct Ollama) and **8000** (vLLM). On non-WSL Linux systems, container traffic is routed to port 11435 (`src/lib/local-inference.ts:21`), which the policy blocks. **Changes:** - **`local-inference.yaml`**: Add port 11435 (auth proxy) endpoint so containers can reach the proxy on non-WSL systems - **`onboard.ts`**: Upgrade proxy startup failure from a soft warning to a hard error with actionable diagnostics — prevents onboarding from completing with a broken provider config - **`policies.test.ts`**: Assert port 11435 is present in the preset to prevent regression ## Test plan - [x] All 79 policy tests pass (including updated port assertion) - [x] All 34 local-inference unit tests pass - [x] 125/129 onboard tests pass (4 pre-existing timeout failures in unrelated `--from` test) - [x] Ollama proxy recovery tests pass - [ ] Manual: onboard with Local Ollama on Linux Docker CE, verify inference works through proxy <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added support for an additional local inference network endpoint (host internal, port 11435). * **Bug Fixes** * Onboarding now fails gracefully when the local auth proxy cannot start and logs actionable diagnostic hints. * **Tests** * Updated tests and mocks to validate the new endpoint and the revised onboarding behavior. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Updates the shared NemoClaw maintainer policy references used by triage agents and downstream consumers. This PR keeps the policy package as the source of truth and tightens the taxonomy/routing guidance needed for the nvoss-velocity shared-policy migration: - clarifies that normal initial triage should not newly add `needs: triage` - narrows `needs:*` guidance to blocking action queues, especially `needs: info` and `needs: design` - adds PR type evidence from conventional commit prefixes - tightens specific integration routing so named Hermes/OpenClaw/etc. evidence gets the matching `integration:*` label - clarifies `area: ci` versus `area: e2e` so CI is not added just because an e2e failure was observed in CI - improves Windows ARM / WSL platform routing signals - adds concrete shared-policy examples for named integrations, OpenClaw e2e failures, and Windows ARM install failures No application code changes are included here. ## Evaluation This policy version was evaluated in the nvoss-velocity migration experiment using a deterministic 100-item NemoClaw fixture: 50 issues and 50 PRs, all numbered `#2000+`. The evaluation compares recommendations after translating legacy labels into the new shared-policy concepts, rather than raw old-label equality. Baseline, current Llama with this v2 policy: - Label F1: `0.485` - Label precision: `0.402` - Label recall: `0.611` - Issue Type match: `0.976` - Project field match: `1.000` - Parse failures: `0` Best Nemotron Ultra Preview candidate used the same policy files: - Label F1: `0.493` - Label precision: `0.392` - Label recall: `0.663` - Issue Type match: `0.976` - Project field match: `1.000` - Parse failures: `0` The Ultra result did not require separate policy-file changes; it used the same v2 shared policy plus model-specific runtime controls in nvoss-velocity. ## Validation - Parsed `.agents/skills/nemoclaw-maintainer-policies/references/label-taxonomy.json` with `python3 -m json.tool` - Ran `git diff --check` - Confirmed this branch starts from current `origin/main` - Confirmed only the shared policy reference files are modified ## Notes This is the policy-only first step. A follow-up nvoss-velocity PR will consume this shared policy package and switch normal triage runtime behavior to the selected model profile. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Documentation** * Refined issue triage and labeling guidance with clarified rules for label taxonomy, improved concrete examples, and tightened requirements for maintainer workflows to ensure better issue categorization and reduce placeholder labeling. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
local-inference.yamlpreset allowshost.openshell.internalon ports 11434 (Ollama) and 8000 (vLLM) with binary restrictions (openclaw,claude).ollama-local,vllm-local) is selected.Improvements over #781
LOCAL_INFERENCE_PROVIDERSused in bothgetSuggestedPolicyPresets()andsetupPoliciesWithSelection()— no duplicated arrays that can drift.nim-local: Not in the approved providers list ininference-config.ts.Credit
Based on work by @cluster2600 in #781. The preset YAML and auto-suggest approach are theirs; this PR refactors and adds test coverage.
Test plan
ollama-localandvllm-localollama-local, verifylocal-inferenceis pre-selected🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes