fix(security): add authenticated reverse proxy for local Ollama by prekshivyas · Pull Request #1922 · NVIDIA/NemoClaw

prekshivyas · 2026-04-15T21:25:42Z

Summary

Closes #709. Supersedes #1140.

Ollama has no built-in auth. Today onboarding tells users to bind 0.0.0.0:11434, which exposes Ollama to the entire network with no authentication (CWE-668, flagged in #1140). This reimplements the auth proxy approach from #679 against the current TypeScript codebase.

Fix — keep Ollama on localhost, add an authenticated proxy:

Auth proxy (scripts/ollama-auth-proxy.js): Lightweight Node.js reverse proxy on 0.0.0.0:11435 that validates a per-instance Bearer token before forwarding to Ollama on 127.0.0.1:11434. Uses crypto.timingSafeEqual for timing-safe comparison. GET /api/tags is exempt for container health checks.
Onboard integration: Ollama binds to 127.0.0.1 instead of 0.0.0.0. Proxy starts automatically after Ollama with a random 24-byte token. Stale proxies from previous runs are cleaned up. Startup is verified before proceeding.
Inference routing: Sandbox provider uses proxy port (11435) with the generated token as the OPENAI_API_KEY credential. OpenShell's L7 proxy injects the token at egress.
Platform fix: macOS hint ("On macOS, local inference also depends on OpenShell host routing support.") now gated on process.platform === "darwin" instead of always shown.
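
The token check described above can be sketched as follows. This is a minimal illustration, not the code in scripts/ollama-auth-proxy.js: the helper name `isAuthorized` and the hex encoding of the 24-byte token are assumptions. The only library calls used are `crypto.randomBytes` and `crypto.timingSafeEqual`.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper (not from the PR): returns true only when the
// Authorization header is exactly "Bearer <expectedToken>".
function isAuthorized(authorizationHeader, expectedToken) {
  if (typeof authorizationHeader !== "string") return false;
  const prefix = "Bearer ";
  if (!authorizationHeader.startsWith(prefix)) return false;

  const presented = Buffer.from(authorizationHeader.slice(prefix.length), "utf8");
  const expected = Buffer.from(expectedToken, "utf8");

  // timingSafeEqual throws if the buffers differ in length, so reject early.
  if (presented.length !== expected.length) return false;
  return crypto.timingSafeEqual(presented, expected);
}

// A random 24-byte token; hex encoding is an assumption made for this sketch.
const token = crypto.randomBytes(24).toString("hex");

console.log(isAuthorized(`Bearer ${token}`, token));
console.log(isAuthorized("Bearer not-the-token", token));
```

The early length check reveals only whether the lengths match, which is usually acceptable for fixed-length random tokens.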

Test plan

7 e2e tests (mock backend, no real Ollama needed): unauthenticated rejection, wrong token rejection, correct token proxying, health check exemption, POST-to-health-check requires auth
34 unit tests passing (updated port expectations and error messages)
All pre-commit hooks pass (gitleaks, shellcheck, ESLint, commitlint, TypeScript)
Manual: nemoclaw onboard with Local Ollama, verify inference works through proxy

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Token-gated local authentication proxy for Ollama with persisted credential, automatic lifecycle management, and container-access routing; unauthenticated GET /api/tags remains allowed.
- Host-side connect now ensures the proxy is running after reconnects.
Tests
- Added end-to-end tests validating proxy routing and authentication behavior.
- Updated local tests to expect proxy-mediated access.
Chores
- CI job added to run proxy e2e tests; minor devDependency reordering.

Ollama has no built-in auth and binding to 0.0.0.0 exposes it to the network (CWE-668, #1140). This adds an authenticated reverse proxy so Ollama stays on localhost while containers can still reach it.

- Add scripts/ollama-auth-proxy.js: Node.js proxy on 0.0.0.0:11435 that validates a per-instance Bearer token before forwarding to Ollama on 127.0.0.1:11434. Health check (GET /api/tags) is exempt. Uses crypto.timingSafeEqual for timing-safe token comparison.
- Bind Ollama to 127.0.0.1 instead of 0.0.0.0 during onboard
- Start the auth proxy after Ollama, with stale proxy cleanup and startup verification
- Route sandbox inference through proxy port (11435) with the generated token as the OpenAI API key credential
- Gate macOS hint on process.platform === "darwin"
- Add OLLAMA_PROXY_PORT (11435) to ports.ts
- Add 7 e2e tests and CI job for the proxy
- Update unit tests for new port and error messages

Reimplements the approach from #679 (closed in favor of #1104) against the current TypeScript codebase, addressing CodeRabbit findings from the original PR (timing-safe comparison, stale proxy cleanup, startup verification).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-15T21:25:54Z

Walkthrough

Adds an Ollama authentication reverse proxy, integrates it into onboarding and runtime to start/ensure/use the proxy and token, routes container reachability checks through the proxy, updates tests and adds an e2e proxy test plus a CI job to run that test.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Proxy Server Implementation | `scripts/ollama-auth-proxy.js` | New Node.js reverse proxy requiring `OLLAMA_PROXY_TOKEN` (timing-safe compare), forwards to backend at `127.0.0.1:${OLLAMA_BACKEND_PORT}`, strips `authorization` and `host`, allows unauthenticated `GET /api/tags`, returns 401/502 on auth/backend failures. |
| Onboarding & Runtime | `src/lib/onboard.ts`, `src/nemoclaw.ts` | Adds proxy token persistence (`~/.nemoclaw/ollama-proxy-token`), start/kill/ensure helpers for the auth proxy, changes non-WSL Ollama binding to `127.0.0.1`, uses proxy token as `OPENAI_API_KEY` for non-WSL `ollama-local`, exports `ensureOllamaAuthProxy`, and calls it during sandbox connect. |
| Local Inference Routing & Tests | `src/lib/local-inference.ts`, `src/lib/local-inference.test.ts` | Introduces `OLLAMA_CONTAINER_PORT` (uses `OLLAMA_PROXY_PORT` when not WSL); updates base URL and container reachability check to `host.openshell.internal:${OLLAMA_CONTAINER_PORT}`; tests parameterized for the port and expect proxy-related failure text. |
| Port Configuration | `src/lib/ports.ts` | Added exported `OLLAMA_PROXY_PORT` (env `NEMOCLAW_OLLAMA_PROXY_PORT`, default `11435`). |
| E2E Test Script & CI | `test/e2e-ollama-proxy.sh`, `.github/workflows/pr.yaml` | Adds e2e script that launches a mock backend and the auth proxy to validate auth success/failure and health-check exemption; adds `test-e2e-ollama-proxy` job to PR workflow to run the script when code changes. |
| Miscellaneous | `package.json` | Reordered a devDependency entry for `ajv` (no version change). |
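
The port selection summarized in the table could look roughly like the sketch below. The environment variable name and the defaults (11434, 11435) come from the summary above; the helper names `parsePort` and `resolveContainerPort`, and the strict validation of the override, are assumptions made for illustration only.

```javascript
const OLLAMA_PORT = 11434; // Ollama's own port, bound to 127.0.0.1
const DEFAULT_PROXY_PORT = 11435; // default for NEMOCLAW_OLLAMA_PROXY_PORT

// Hypothetical helper: parse an optional port override, falling back to a default.
function parsePort(raw, fallback) {
  if (raw === undefined || raw === "") return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}

// Hypothetical helper: containers talk to the proxy port, except on WSL,
// where (per the PR) the proxy is skipped and Ollama is reached directly.
function resolveContainerPort(env, isWsl) {
  const proxyPort = parsePort(env.NEMOCLAW_OLLAMA_PROXY_PORT, DEFAULT_PROXY_PORT);
  return isWsl ? OLLAMA_PORT : proxyPort;
}

console.log(resolveContainerPort({}, false));
console.log(resolveContainerPort({}, true));
```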

Sequence Diagram(s)

sequenceDiagram
    participant Container as Client Container
    participant Proxy as Ollama Auth Proxy\n(Port 11435)
    participant Backend as Ollama Backend\n(Port 11434)

    rect rgba(0, 200, 100, 0.5)
    Note over Container,Backend: Authenticated Request Flow
    Container->>Proxy: POST /api/generate\nAuthorization: Bearer {token}
    Proxy->>Proxy: Validate token (timing-safe)
    Proxy->>Backend: Forward request (strip auth/host)
    Backend->>Proxy: Response stream
    Proxy->>Container: Response stream
    end

    rect rgba(200, 100, 0, 0.5)
    Note over Container,Proxy: Unauthorized Request Flow
    Container->>Proxy: POST /api/generate (no/invalid token)
    Proxy->>Container: HTTP 401 Unauthorized
    end

    rect rgba(100, 150, 200, 0.5)
    Note over Container,Backend: Health Check (Exempted)
    Container->>Proxy: GET /api/tags (no auth)
    Proxy->>Backend: Forward health check
    Backend->>Proxy: Model list
    Proxy->>Container: Model list
    end
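
The three flows in the diagram reduce to one decision per request plus a header filter before forwarding. The sketch below assumes hypothetical function names (`isHealthCheck`, `decide`, `stripProxyHeaders`) and assumes the query string is ignored when matching the health-check path; neither is confirmed by the PR.

```javascript
// Only GET /api/tags is exempt; a POST to the same path still needs a token.
function isHealthCheck(method, url) {
  const pathname = url.split("?")[0]; // assumption: query string is ignored
  return method === "GET" && pathname === "/api/tags";
}

// Decide what to do with a request, given whether its token already validated.
function decide(method, url, tokenIsValid) {
  if (isHealthCheck(method, url) || tokenIsValid) {
    return { action: "forward" };
  }
  return { action: "reject", status: 401 };
}

// Drop the headers the proxy should not pass on to the Ollama backend.
function stripProxyHeaders(headers) {
  const forwarded = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (lower === "authorization" || lower === "host") continue;
    forwarded[name] = value;
  }
  return forwarded;
}

console.log(decide("POST", "/api/generate", false));
console.log(decide("GET", "/api/tags", false));
console.log(stripProxyHeaders({ host: "x", authorization: "Bearer t", accept: "*/*" }));
```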

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I guard the tunnel on eleven-three-five,

Tokens snug while health checks may thrive.
I strip the headers, forward the call,
Reject the pretenders, let models stand tall.
A hopping proxy, so onboarding won't stall.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding an authenticated reverse proxy for local Ollama to improve security and container reachability. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue `#709`'s objective of enabling containerized components to reach local Ollama without binding to 0.0.0.0 via a lightweight authenticated proxy. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the authenticated proxy implementation: proxy script, onboarding integration, inference routing updates, tests, and CI job. No unrelated modifications. |


Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

wscurran · 2026-04-15T21:36:54Z

Related open issues:

#709 [NeMoClaw][Ubuntu + Docker CE] Linux onboarding for Ollama does not explain required container-reachable bind address


coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

.github/workflows/pr.yaml (1)
54-65: Gate this E2E job behind the existing changes filter.

Right now this runs on docs-only PRs too, so it bypasses the fast path the workflow already has for expensive jobs. If that is not intentional, add the same needs: [checks, changes] / if: needs.changes.outputs.code == 'true' guard used by sandbox-images-and-e2e.
♻️ Suggested change
  test-e2e-ollama-proxy:
+   needs: [checks, changes]
+   if: needs.changes.outputs.code == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 5
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/pr.yaml around lines 54 - 65, The E2E job
"test-e2e-ollama-proxy" should be gated by the existing changes filter: add the
same needs and conditional used by "sandbox-images-and-e2e" by adding needs:
[checks, changes] to the job definition and adding if:
needs.changes.outputs.code == 'true' so the job only runs when code changes are
present.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1487-1522: Persist the generated ollamaProxyToken into the
onboarding session and handle resume: when startOllamaAuthProxy generates the
token, save it to onboardSession.ollamaProxyToken (or a similarly named field)
and reuse that value in getOllamaProxyToken instead of only the in-memory
variable; when resuming (e.g. in setupInference / after setupNim is skipped)
detect if the session provider is "ollama-local" and if
onboardSession.ollamaProxyToken exists reuse it and ensure the proxy is running,
otherwise call startOllamaAuthProxy to recreate the proxy and update
onboardSession.ollamaProxyToken so resumed runs always have a valid token and
proxy.
- Around line 1491-1499: The cleanup currently kills any PID from
runCapture(`lsof -ti :${OLLAMA_PROXY_PORT}`), which may terminate unrelated
services; instead track and validate the proxy process before killing: when you
spawn the proxy (where the proxy is started), record its PID (or write it to a
known file) and on cleanup read that PID and verify its command line contains
the proxy marker (e.g., "ollama-auth-proxy.js") using a ps lookup (or filter
lsof output by command) before calling run(`kill ...`); update the cleanup block
(the code using run, runCapture and OLLAMA_PROXY_PORT) to only kill the verified
PID and fall back to no-op if verification fails.

In `@test/e2e-ollama-proxy.sh`:
- Around line 17-20: The cleanup function and EXIT trap must guard against unset
PID variables to avoid unbound-variable errors under set -u; update cleanup (and
any EXIT trap invocation) to safely reference MOCK_PID and PROXY_PID using
parameter expansion or existence checks (e.g. ${MOCK_PID-} / ${PROXY_PID-} or if
[ -n "${MOCK_PID-}" ] ) before calling kill, so cleanup() only attempts kill
when the PID variables are set and non-empty.


📥 Commits

Reviewing files that changed from the base of the PR and between f079a37 and 2f505e1.

📒 Files selected for processing (7)

.github/workflows/pr.yaml
scripts/ollama-auth-proxy.js
src/lib/local-inference.test.ts
src/lib/local-inference.ts
src/lib/onboard.ts
src/lib/ports.ts
test/e2e-ollama-proxy.sh

- Persist proxy token to ~/.nemoclaw/ollama-proxy-token (mode 0600) so it survives process restarts and onboard --resume
- Add ensureOllamaAuthProxy() called on sandbox connect to auto-restart the proxy after host reboots
- Restore WSL2 compatibility: skip proxy on WSL2 where Docker reaches the host directly (#1104), use OLLAMA_CONTAINER_PORT that adapts per platform
- Fix e2e test: replace invalid 0.0.0.0 reachability test with localhost liveness check (0.0.0.0 as destination routes to loopback on both Linux and macOS)

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- killStaleProxy() now verifies process command contains "ollama-auth-proxy" before killing, avoiding termination of unrelated services on the same port
- Initialize MOCK_PID/PROXY_PID and guard EXIT trap with ${:-} to prevent unbound-variable errors under set -u

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
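
The identity check before killing a stale proxy can be illustrated with a small filter. This is a sketch only: the function name `pidsSafeToKill` and the input shape (a list of `{ pid, command }` objects, however they were collected) are assumptions. The "ollama-auth-proxy" marker is the one named in the commit message above.

```javascript
// Marker taken from the commit message; everything else here is hypothetical.
const PROXY_MARKER = "ollama-auth-proxy";

// Given the processes found listening on the proxy port, return only the
// PIDs whose command line identifies them as the auth proxy.
function pidsSafeToKill(listeners) {
  return listeners
    .filter((p) => typeof p.command === "string" && p.command.includes(PROXY_MARKER))
    .map((p) => p.pid);
}

const example = pidsSafeToKill([
  { pid: 4242, command: "node scripts/ollama-auth-proxy.js" },
  { pid: 5150, command: "python3 -m http.server 11435" },
]);
console.log(example);
```

An unrelated service that happens to occupy the port is left alone, which is the behavior the review asked for.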

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

test/e2e-ollama-proxy.sh (1)

17-20: ⚠️ Potential issue | 🟡 Minor

Guard cleanup against unset PIDs under set -u.

Lines [17]-[20] can throw an unbound-variable error if the script exits before Lines [47] or [55], which obscures the real failure.

🐚 Suggested fix

+MOCK_PID=""
+PROXY_PID=""
+
 cleanup() {
-  kill "$MOCK_PID" 2>/dev/null || true
-  kill "$PROXY_PID" 2>/dev/null || true
+  [ -n "${MOCK_PID:-}" ] && kill "$MOCK_PID" 2>/dev/null || true
+  [ -n "${PROXY_PID:-}" ] && kill "$PROXY_PID" 2>/dev/null || true
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/e2e-ollama-proxy.sh` around lines 17 - 20, The cleanup function calls
kill on MOCK_PID and PROXY_PID which will cause an unbound-variable error under
set -u if those PIDs were never set; update cleanup to test each PID variable
safely (e.g. use parameter expansion like "${MOCK_PID-}" / "${PROXY_PID-}" or an
explicit [ -n "${MOCK_PID-}" ] check) before calling kill, and ensure you only
attempt kill when the variable is non-empty to avoid masking the original error
and to clean up processes reliably.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e-ollama-proxy.sh`:
- Around line 89-96: The test currently calls the auth-exempt endpoint /api/tags
so it doesn't validate auth; change the request in test/e2e-ollama-proxy.sh that
builds BODY (using CORRECT_AUTH, TOKEN, PROXY_PORT) to call a protected endpoint
such as /api/generate instead of /api/tags, use the correct HTTP method and JSON
payload required by /api/generate in the curl invocation (keep the -H
"Authorization: $CORRECT_AUTH" header), then assert the response BODY contains
the expected generation marker and keep the same pass/fail logic.


📥 Commits

Reviewing files that changed from the base of the PR and between 8ba4eca and 1916df2.

📒 Files selected for processing (6)

package.json
src/lib/local-inference.test.ts
src/lib/local-inference.ts
src/lib/onboard.ts
src/nemoclaw.ts
test/e2e-ollama-proxy.sh

✅ Files skipped from review due to trivial changes (1)

package.json

🚧 Files skipped from review as they are similar to previous changes (3)

src/lib/local-inference.test.ts
src/lib/onboard.ts
src/lib/local-inference.ts

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1535-1536: The code currently generates ollamaProxyToken in
startOllamaAuthProxy() and immediately calls persistProxyToken(ollamaProxyToken)
which can leave a stale token if the user cancels or chooses a different
provider; change this so the token is only persisted after the provider
selection/validation is committed for the Ollama branch (i.e., persist inside
the final path that sets provider to "ollama-local" / after model selection
succeeds), or alternatively add a cleanup path that clears the persisted token
when the onboarding flow leaves the Ollama branch; update references to
ensureOllamaAuthProxy() usage so it only finds a persisted token when Ollama was
actually chosen and ensure ollamaProxyToken is persisted/cleared consistently
with that flow.
- Around line 1546-1579: The current readiness check in ensureOllamaAuthProxy
uses an unauthenticated GET /api/tags (via runCapture) which can pass against an
unrelated listener; replace it with a probe that verifies the specific proxy
instance — either (A) perform an authenticated probe using the persisted token
(call runCapture with the Authorization header / token-bound endpoint so the
request only succeeds against a proxy that accepts that token, using
loadPersistedProxyToken before probing and set ollamaProxyToken accordingly) or
(B) verify the actual proxy process identity (store/read the proxy PID/command
when starting in run(...) and in ensureOllamaAuthProxy validate that PID is
running and matches the expected command). Update ensureOllamaAuthProxy to
return only when the authenticated/token-bound probe or PID/command check
succeeds; use runCapture/run and the existing OLLAMA_PROXY_PORT,
ollamaProxyToken, loadPersistedProxyToken, and scripts/ollama-auth-proxy.js
symbols to locate and change the logic.
- Around line 1493-1499: persistProxyToken currently relies on
fs.writeFileSync(..., { mode: 0o600 }) which only sets permissions on new files;
update persistProxyToken to explicitly reset permissions after writing by
calling fs.chmodSync(PROXY_TOKEN_PATH, 0o600) following the write to ensure the
token file at PROXY_TOKEN_PATH always has restrictive 0600 permissions
regardless of prior file state.


📥 Commits

Reviewing files that changed from the base of the PR and between 1916df2 and adfb16e.

📒 Files selected for processing (2)

src/lib/onboard.ts
test/e2e-ollama-proxy.sh

✅ Files skipped from review due to trivial changes (1)

test/e2e-ollama-proxy.sh

- Test 5 now uses POST /api/generate (protected) instead of GET /api/tags (auth-exempt) to verify token auth actually works
- Gate test-e2e-ollama-proxy CI job behind changes filter to skip on docs-only PRs

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/pr.yaml:
- Around line 62-63: Replace the mutable ref for the setup-node action: locate
the uses entry "uses: actions/setup-node@v6" in the workflow and change it to
the equivalent immutable commit SHA (e.g., "uses:
actions/setup-node@<commit-sha> # v6") so the action is pinned to a specific
commit; ensure the chosen SHA matches the v6 release you intend to track.


📥 Commits

Reviewing files that changed from the base of the PR and between adfb16e and 962caa6.

📒 Files selected for processing (2)

.github/workflows/pr.yaml
test/e2e-ollama-proxy.sh

✅ Files skipped from review due to trivial changes (1)

test/e2e-ollama-proxy.sh

…, proxy identity check

- Delay persisting proxy token to disk until ollama-local is confirmed in setupInference, so backing out to another provider doesn't leave a stale token that resurrects the proxy on non-Ollama sandboxes
- ensureOllamaAuthProxy now verifies the proxy accepts our token via an authenticated POST (not auth-exempt GET /api/tags), preventing false positives from stale proxies or unrelated listeners
- Add chmodSync after writeFileSync to ensure 0600 on existing files (Node.js mode option only applies on file creation)

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
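
The permission fix in the last bullet can be sketched as below. The function name `persistToken` is hypothetical and the demo writes to a temporary directory rather than ~/.nemoclaw. The sketch relies on the fact that the `mode` option of `fs.writeFileSync` only takes effect when the file is created, so an explicit `fs.chmodSync` is needed for a file that already exists.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: write the token and force 0600 even if the file
// already existed with looser permissions.
function persistToken(tokenPath, token) {
  fs.mkdirSync(path.dirname(tokenPath), { recursive: true });
  fs.writeFileSync(tokenPath, token, { mode: 0o600 }); // mode applies on creation only
  fs.chmodSync(tokenPath, 0o600); // tighten a pre-existing file too
}

// Demo in a throwaway directory: start from a loosely-permissioned file.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "proxy-token-demo-"));
const demoPath = path.join(demoDir, "ollama-proxy-token");
fs.writeFileSync(demoPath, "old-token", { mode: 0o644 });

persistToken(demoPath, "example-token");
const finalMode = fs.statSync(demoPath).mode & 0o777; // permission bits (POSIX)
console.log(finalMode.toString(8));
```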

Comprehensive e2e test using real Ollama (not mocks):

- Install Ollama, pull qwen2.5:0.5b (small CPU model)
- Start Ollama on 127.0.0.1 only, start auth proxy on 0.0.0.0:11435
- Token auth: reject unauthenticated, reject wrong token, accept correct
- Real inference: /v1/chat/completions and /api/generate through proxy
- Token persistence: file exists, 0600 permissions, content matches
- Proxy recovery: kill, verify dead, restart from persisted token, verify inference works after restart
- Container reachability: Docker container can reach proxy at host.openshell.internal:11435, cannot reach Ollama directly on 11434

Triggered via workflow_dispatch (manual), not on every PR, due to Ollama download and CPU inference time (~3-4 min). Also updates gpu-e2e test comments for auth proxy (127.0.0.1 binding).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Addresses Aaron's review: "Add an end-to-end onboard/connect regression test that provisions ollama-local, persists the token, restarts/ensures the proxy, and verifies container-reachable inference through 11435."

Adds Phase 4.5 to the existing gpu-e2e test (runs after onboard, before inference):

- Token file persisted at ~/.nemoclaw/ollama-proxy-token
- Token file permissions 600
- Auth proxy running on :11435
- Proxy rejects unauthenticated POST (401)
- Proxy accepts persisted token
- Container reaches proxy at host.openshell.internal:11435
- Proxy recovery: kill proxy, verify restart via ensureOllamaAuthProxy

Updates Phase 5 comments: inference path now goes through auth proxy (:11435) → Ollama (:11434).

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The previous recovery test called `nemoclaw status` to trigger ensureOllamaAuthProxy, but that function is only called on `sandboxConnect`, not `status`. Fix by restarting the proxy directly from the persisted token (simulating what ensureOllamaAuthProxy does on connect after a reboot). Also adds verification that the recovered proxy accepts the original persisted token. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Prevents lingering proxy process on reused runners after Phase 4.5g manually restarts the proxy for recovery testing. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Refresh user-facing docs against the 34 commits merged between v0.0.17 and v0.0.18.

Highlights:

- Replace the Ollama 0.0.0.0 binding guidance with the new authenticated reverse proxy on 127.0.0.1:11435 (#1922).
- Document the compatible-endpoint provider defaulting to /v1/chat/completions and the NEMOCLAW_PREFERRED_API=openai-responses opt-in (#1984).
- Add the new nemoclaw upgrade-sandboxes command with --check, --auto, and --yes flags (#1943).
- Note the cross-sandbox messaging overlap warning and 409 detection in nemoclaw <name> status (#1953).
- Document the messaging-token rotation auto-rebuild flow (#1967).
- Cover new troubleshooting entries for the Ollama auth proxy, IPv6 localhost resolution, orphan SSH port-forward cleanup on re-onboard, and rotated messaging credentials (#1978, #1950).
- Note tar failure exit code for nemoclaw debug --output (#1770) and the orphaned openshell process cleanup in nemoclaw uninstall (#1940).

Also:

- Extend docs/.docs-skip to exclude the experimental sandbox-mgmt shields and config commands (#1976).
- Fix a sphinx-autobuild infinite rebuild loop in docs/conf.py by writing docs/project.json only when its contents change.
- Bump the docs version switcher preferred entry to 0.0.18.
- Regenerate nemoclaw-user-* agent skills from docs/.

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Made-with: Cursor

## Summary

Refresh user-facing documentation against the 34 commits merged between v0.0.17 and v0.0.18, bump the docs version switcher to v0.0.18, and fix a `sphinx-autobuild` infinite-rebuild loop triggered by `docs/conf.py`.

## Changes

- **Ollama authenticated reverse proxy** (#1922): Replace the `0.0.0.0:11434` guidance in `docs/inference/use-local-inference.md` with the new token-gated proxy on `127.0.0.1:11435`, including persisted token, health-check exemption, and sandbox provider wiring. Replace the matching troubleshooting entry in `docs/reference/troubleshooting.md`.
- **Compatible-endpoint default API path** (#1984): Document that the compatible-endpoint provider now defaults to `/v1/chat/completions` and update `NEMOCLAW_PREFERRED_API` to describe `openai-responses` as the opt-in instead of `openai-completions`. Updates in `use-local-inference.md`, `switch-inference-providers.md`, and `troubleshooting.md`.
- **`nemoclaw upgrade-sandboxes` command** (#1943): Add a new reference entry in `docs/reference/commands.md` covering `--check`, `--auto`, and `--yes` flags.
- **Messaging token rotation auto-rebuild** (#1967, #1953): Note the automatic rebuild behavior and cross-sandbox overlap warning in `docs/deployment/set-up-telegram-bridge.md`, `commands.md`, and `troubleshooting.md`.
- **Other troubleshooting additions**:
  - `localhost` → `127.0.0.1` IPv6 note (#1978)
  - Orphan SSH port-forward cleanup on re-onboard (#1950)
  - Orphan `openshell` process cleanup in `nemoclaw uninstall` (#1940)
  - Non-zero exit on tar failure in `nemoclaw debug --output` (#1770)
- **Skip list**: Extend `docs/.docs-skip` to exclude the experimental sandbox-mgmt shields and config commands feature (#1976), which was explicitly merged as not-yet-documented.
- **Build stability**: `docs/conf.py` now writes `docs/project.json` only when contents change, so `make docs-live` / `sphinx-autobuild` no longer detects its own generated file as a source change and enters an infinite rebuild loop.
- **Version switcher**: Bump `docs/versions1.json` and `docs/project.json` preferred entry to v0.0.18 so this refresh renders under the new version.
- **Agent skills**: Regenerate `nemoclaw-user-*` skills from `docs/` with `scripts/docs-to-skills.py`.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes (ran via pre-commit hook on staged files)
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

## AI Disclosure

- [x] AI-assisted, tool: Cursor

---

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
Made with [Cursor](https://cursor.com)

## Summary by CodeRabbit

## Release Notes

* **New Features**
  * Added `nemoclaw upgrade-sandboxes` command to rebuild sandboxes when base-image digests change.
  * Introduced authenticated reverse proxy for local Ollama inference with token-based access control.
  * Automatic sandbox backup, recreation, and restore when messaging credentials are updated.
  * Cross-sandbox messaging token overlap detection with status warnings.
* **Improvements**
  * Compatible-endpoint provider now defaults to `/v1/chat/completions` API path.
  * Enhanced troubleshooting documentation with new diagnostics sections.
* **Documentation**
  * Updated onboarding and configuration guides.
  * Expanded version documentation to 0.0.18.

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

The argv migration inadvertently changed OLLAMA_HOST from 127.0.0.1 to 0.0.0.0, reverting the security fix in PR #1922 which moved Ollama to localhost and added an authenticated reverse proxy on 0.0.0.0:11435. Restore 127.0.0.1 on both the non-WSL Linux and macOS install paths.

## Summary

Fixes the remaining open issue in #709: local Ollama inference from inside sandbox containers returns HTTP 403/401 even with `local-inference` policy enabled.

**Root cause:** PR #1922 introduced an authenticated reverse proxy on port **11435**, but PR #2000's `local-inference` policy preset only allows port **11434** (direct Ollama) and **8000** (vLLM). On non-WSL Linux systems, container traffic is routed to port 11435 (`src/lib/local-inference.ts:21`), which the policy blocks.

**Changes:**

- **`local-inference.yaml`**: Add port 11435 (auth proxy) endpoint so containers can reach the proxy on non-WSL systems
- **`onboard.ts`**: Upgrade proxy startup failure from a soft warning to a hard error with actionable diagnostics, which prevents onboarding from completing with a broken provider config
- **`policies.test.ts`**: Assert port 11435 is present in the preset to prevent regression

## Test plan

- [x] All 79 policy tests pass (including updated port assertion)
- [x] All 34 local-inference unit tests pass
- [x] 125/129 onboard tests pass (4 pre-existing timeout failures in unrelated `--from` test)
- [x] Ollama proxy recovery tests pass
- [ ] Manual: onboard with Local Ollama on Linux Docker CE, verify inference works through proxy

## Summary by CodeRabbit

* **New Features**
  * Added support for an additional local inference network endpoint (host internal, port 11435).
* **Bug Fixes**
  * Onboarding now fails gracefully when the local auth proxy cannot start and logs actionable diagnostic hints.
* **Tests**
  * Updated tests and mocks to validate the new endpoint and the revised onboarding behavior.

fix(ci): use actions/setup-node@v6 for ollama proxy e2e job

8ba4eca

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

wscurran added security Potential vulnerability, unsafe behavior, or access risk Local Models labels Apr 15, 2026

coderabbitai Bot reviewed Apr 15, 2026

Comment thread src/lib/onboard.ts

Comment thread src/lib/onboard.ts Outdated

Comment thread test/e2e-ollama-proxy.sh

prekshivyas and others added 2 commits April 15, 2026 14:49

coderabbitai Bot reviewed Apr 15, 2026

Comment thread test/e2e-ollama-proxy.sh Outdated

coderabbitai Bot reviewed Apr 15, 2026

Comment thread src/lib/onboard.ts Outdated

Comment thread src/lib/onboard.ts Outdated

Comment thread src/lib/onboard.ts Outdated

prekshivyas assigned cv and brandonpelfrey Apr 15, 2026

coderabbitai Bot reviewed Apr 15, 2026

Comment thread .github/workflows/pr.yaml Outdated

prekshivyas assigned prekshivyas and unassigned cv and brandonpelfrey Apr 15, 2026

prekshivyas requested a review from cv April 15, 2026 22:41

prekshivyas assigned brandonpelfrey Apr 15, 2026

prekshivyas requested a review from brandonpelfrey April 15, 2026 22:41

prekshivyas unassigned brandonpelfrey Apr 15, 2026

Merge branch 'main' into fix/ollama-auth-proxy

242c731

prekshivyas and others added 7 commits April 16, 2026 08:53

Merge branch 'main' into fix/ollama-auth-proxy

4f2edaf

fix(test): kill auth proxy in gpu-e2e cleanup

8aae2e2

Prevents lingering proxy process on reused runners after Phase 4.5g manually restarts the proxy for recovery testing. Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge branch 'main' into fix/ollama-auth-proxy

43d9212

Merge branch 'main' into fix/ollama-auth-proxy

a0d1ac8

brandonpelfrey approved these changes Apr 16, 2026

brandonpelfrey merged commit 4f30b9d into main Apr 16, 2026
9 checks passed

prekshivyas deleted the fix/ollama-auth-proxy branch April 16, 2026 17:36

prekshivyas mentioned this pull request Apr 16, 2026

fix(onboard): use 127.0.0.1 instead of localhost for local inference … #1716

Closed

3 tasks

miyoungc mentioned this pull request Apr 17, 2026

docs: catch up documentation for v0.0.18 changes #2033

Merged

13 tasks

ericksoa mentioned this pull request Apr 20, 2026

fix(security): migrate remaining shell-string callsites to argv arrays #1915

Merged

ericksoa mentioned this pull request Apr 20, 2026

fix: add auth proxy port to local-inference policy (#709) #2114

Merged

5 tasks

This was referenced Apr 21, 2026

test(e2e): add Brev-specific Ollama reachability suite (#1924) #2183

Closed

[brev] local ollama setup fails #1924

Closed

cv mentioned this pull request May 14, 2026

[macOS][Ollama] Local Ollama inference hangs through NemoClaw/OpenShell routing; sandbox readiness/status becomes unstable #1410

Closed

coderabbitai Bot mentioned this pull request May 19, 2026

fix(onboard): front Ollama with the auth proxy on WSL native Docker too #3732

Merged

12 tasks

5 participants
