fix(onboard): gate host-network GPU local inference reachability (#4509) by yimoj · Pull Request #4609 · NVIDIA/NemoClaw

yimoj · 2026-06-01T08:11:07Z

Summary

The Docker-driver GPU host-network path recreates the sandbox with --network host and wires OpenClaw to the direct 127.0.0.1 Ollama/vLLM URL, but onboarding declared success without proving the recreated container could actually reach that endpoint. A failed host-network recreate, an unexpected non-host network mode, or a host provider binding/state problem only surfaced later as an opaque ECONNREFUSED during the first agent prompt. This adds a post-recreate reachability gate so onboarding fails early with actionable output.

Related Issue

Fixes #4509

Changes

Add verifyDockerGpuHostNetworkLocalInference in src/lib/onboard/docker-gpu-local-inference.ts: on the Docker-driver GPU host-network local-inference path it resolves the recreated OpenShell-managed container, asserts HostConfig.NetworkMode is host, and runs a bounded docker exec curl probe against the direct loopback health endpoint (/api/tags for Ollama, /v1/models for vLLM).
On failure, surface the selected endpoint, network mode, container id, and short recovery hints, then fail onboarding (no silent continue). The gate self-skips when the patch is opted out (NEMOCLAW_DOCKER_GPU_PATCH=0), when the network mode is not host, or — to avoid false negatives — when a minimal/custom image lacks curl (soft-skip with a warning).
Orchestrate via verifyGpuSandboxAfterReady so src/lib/onboard.ts stays net-neutral per the codebase-growth guardrail (logic lives under src/lib/onboard/).
Extend test/e2e/test-gpu-e2e.sh to assert the reachability proof when the direct sandbox URL is active, instead of only discovering failure during the agent prompt.
Add focused unit tests for the container-inspection / probe decision logic and the orchestrator.
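
The decision flow in the list above can be pictured as a small function with injected Docker adapters. This is a minimal sketch, not the code in `src/lib/onboard/docker-gpu-local-inference.ts`: the `GateDeps` interface, the `verifyHostNetworkGate` name, the retry count, and the hint strings are hypothetical, and the ports are assumed to be the providers' usual defaults (11434 for Ollama, 8000 for vLLM).

```typescript
// Hypothetical shapes for illustration; the real module's types may differ.
type Provider = "ollama" | "vllm";

type GateResult =
  | { status: "ok" }
  | { status: "skipped"; reason: string }
  | { status: "failed"; reason: string; hints: string[] };

interface GateDeps {
  // Returns HostConfig.NetworkMode for the container, or null if it cannot be inspected.
  inspectNetworkMode(containerId: string): string | null;
  // True when curl is available inside the container.
  hasCurl(containerId: string): boolean;
  // Runs one bounded probe inside the container; true when the endpoint answered.
  probe(containerId: string, url: string): boolean;
}

// Health paths come from the PR description; the ports are assumed defaults.
const HEALTH_URL: Record<Provider, string> = {
  ollama: "http://127.0.0.1:11434/api/tags",
  vllm: "http://127.0.0.1:8000/v1/models",
};

function verifyHostNetworkGate(
  containerId: string,
  provider: Provider,
  patchActive: boolean,
  deps: GateDeps,
  maxAttempts = 3, // assumed retry budget
): GateResult {
  // Opted-out or non-host patch configuration: nothing to verify.
  if (!patchActive) return { status: "skipped", reason: "patch-inactive" };

  // The recreated container must really be on the host network.
  const mode = deps.inspectNetworkMode(containerId);
  if (mode !== "host") {
    return {
      status: "failed",
      reason: `network mode is ${mode ?? "unknown"}, expected host`,
      hints: ["Re-run onboarding so the sandbox is recreated with --network host."],
    };
  }

  // Minimal images without curl soft-skip instead of reporting a false negative.
  if (!deps.hasCurl(containerId)) {
    return { status: "skipped", reason: "probe-tool-unavailable" };
  }

  const url = HEALTH_URL[provider];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (deps.probe(containerId, url)) return { status: "ok" };
  }
  return {
    status: "failed",
    reason: `${url} unreachable after ${maxAttempts} attempts`,
    hints: ["Check that the provider is running and bound to 127.0.0.1."],
  };
}
```

Injecting `GateDeps` keeps the decision logic testable without a Docker daemon, in the same spirit as the mocked adapters mentioned under Verification.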

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npm test passes (unrelated e2e-scenario framework tests flaked on the shared host's default 5s timeout under concurrent load; they pass green at a 30s timeout in isolation)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
npm run build:cli and npm run typecheck:cli clean; biome check clean

Hardware-gated E2E gap

The full host-network proof requires an Ubuntu 24.04 + NVIDIA GPU + native Docker environment, which the triage host does not have. The container-inspection and probe decision logic is covered by unit tests with mocked Docker/OpenShell adapters; the live GPU host-network proof is exercised by test/e2e/test-gpu-e2e.sh on GPU hardware.

Signed-off-by: Yimo Jiang yimoj@nvidia.com

Summary by CodeRabbit

New Features
- GPU sandbox verification now runs an additional host-network reachability gate when applicable, with retries and clear, actionable failure diagnostics; onboarding may stop if this gate fails.
Tests
- Added comprehensive unit tests for host-network GPU local inference scenarios.
- Enhanced GPU end-to-end tests to validate host-network reachability when the direct sandbox URL path is used.

…DIA#4509)

The Docker-driver GPU host-network path recreates the sandbox with --network host and wires OpenClaw to the direct 127.0.0.1 Ollama/vLLM URL, but onboarding declared success without proving the real container could reach that endpoint. A failed host-network recreate, an unexpected non-host network mode, or a host provider binding problem only surfaced later as an opaque ECONNREFUSED during an agent prompt.

Add a post-recreate verification gate (verifyDockerGpuHostNetworkLocalInference) that, only on the Docker-driver GPU host-network local inference path, resolves the recreated OpenShell-managed container, asserts HostConfig.NetworkMode is host, and runs a bounded docker exec curl probe against the direct loopback health endpoint (/api/tags for Ollama, /v1/models for vLLM). On failure it surfaces the endpoint, network mode, container id, and recovery hints, then fails onboarding early. Minimal/custom images lacking curl soft-skip with a warning instead of a false negative.

The orchestration lives in src/lib/onboard/docker-gpu-local-inference.ts (verifyGpuSandboxAfterReady) so onboard.ts stays net-neutral per the codebase-growth guardrail. Extends test/e2e/test-gpu-e2e.sh to assert the reachability proof when the direct sandbox URL is active.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

coderabbitai · 2026-06-01T08:11:18Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dee57132-475b-4427-863f-2c97636decca

📥 Commits

Reviewing files that changed from the base of the PR and between 7f0388d and 28ce188.

📒 Files selected for processing (2)

src/lib/onboard/docker-gpu-local-inference.test.ts
src/lib/onboard/docker-gpu-local-inference.ts

🚧 Files skipped from review as they are similar to previous changes (2)

src/lib/onboard/docker-gpu-local-inference.test.ts
src/lib/onboard/docker-gpu-local-inference.ts

📝 Walkthrough

The PR adds a post-ready reachability verification gate for GPU sandboxes using Docker host-network patching. Onboarding now runs the GPU proof and, when applicable, verifies the recreated container can reach the provider's local inference endpoint via curl probing with retries, emitting diagnostics or exiting on failure.

Changes

Docker GPU host-network reachability verification

| Layer / File(s) | Summary |
| --- | --- |
| Host-network verification infrastructure `src/lib/onboard/docker-gpu-local-inference.ts` | Reworked imports and added timeout/retry constants, extended `DockerGpuLocalInferenceOptions` with optional `platform` parameter, and refactored `shouldUseDockerGpuPatchHostNetwork` to pass environment and platform context to the patch decision helper. Exported types for verification dependencies and results. |
| Host-network inference verification implementation `src/lib/onboard/docker-gpu-local-inference.ts` | Implemented `verifyDockerGpuHostNetworkLocalInference` as a gated verification step: skips when patch is inactive or provider is non-local, resolves the recreated container, validates host-network mode, probes reachability with curl and structured retries, soft-skips if curl is missing, otherwise returns success or failure with recovery guidance. Added `printDockerGpuHostNetworkInferenceVerificationFailure` to emit formatted diagnostics including container, network mode, endpoint, and recovery steps. |
| GPU sandbox post-ready orchestrator `src/lib/onboard/docker-gpu-local-inference.ts` | Implemented `verifyGpuSandboxAfterReady` to orchestrate GPU proof and optional host-network reachability gate: runs direct GPU proof first, skips host-network verification if patch is inactive, otherwise runs reachability verifier, logs success or prints diagnostics and exits with code 1 on failure. |
| Onboard flow GPU verification integration `src/lib/onboard.ts` | Replaced direct `verifyDirectSandboxGpu(sandboxName)` try/catch with `dockerGpuLocalInference.verifyGpuSandboxAfterReady(...)`, passing sandbox name, gateway, patch state, verifier, selected mode, and `runCaptureOpenshell` context to coordinate both GPU proof and host-network readiness gating. |
| Host-network verification test suite `src/lib/onboard/docker-gpu-local-inference.test.ts` | Added comprehensive Vitest coverage: `shouldUseDockerGpuPatchHostNetwork` behavior for Linux Docker-driver host-network path only; `verifyDockerGpuHostNetworkLocalInference` skip conditions (inactive patch, non-local provider, missing patch), successful host-network reconciliation with endpoint probe, failure cases for missing/incorrect-mode containers, probe retry behavior with sleep timing, soft-skip when curl is unavailable; `verifyGpuSandboxAfterReady` orchestration with GPU proof + inference gate, failure routing through error sink + exit, and gate bypass when patch is inactive; `printDockerGpuHostNetworkInferenceVerificationFailure` formatted output with container, network mode, endpoint, and recovery hints. |
| E2E GPU host-network verification gate `test/e2e/test-gpu-e2e.sh` | Added conditional log-based validation that checks for the direct sandbox URL onboarding message and requires a corresponding host-network local inference reachability confirmation, failing the test if the direct URL path is active but reachability is not proven. |
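
The orchestrator row above describes a fixed order: GPU proof first, then the reachability gate only when the host-network patch is active, then exit code 1 on failure. A minimal sketch of that ordering follows, assuming hypothetical names throughout (`OrchestratorDeps`, `verifyGpuSandboxAfterReadySketch`, and the message strings are not taken from the repository) and leaving GPU-proof error handling to the injected verifier.

```typescript
// Hypothetical types for illustration; not the module's exported types.
type GateOutcome =
  | { status: "ok" | "skipped" }
  | { status: "failed"; reason: string };

interface OrchestratorDeps {
  verifyGpu(sandboxName: string): void; // direct GPU proof
  verifyReachability(sandboxName: string): GateOutcome; // host-network gate
  log(message: string): void;
  error(message: string): void;
  exit(code: number): void;
}

function verifyGpuSandboxAfterReadySketch(
  sandboxName: string,
  hostNetworkPatchActive: boolean,
  deps: OrchestratorDeps,
): void {
  // 1. The GPU proof always runs first.
  deps.verifyGpu(sandboxName);

  // 2. Without the host-network patch there is nothing further to gate.
  if (!hostNetworkPatchActive) return;

  // 3. Run the reachability gate and route the outcome.
  const outcome = deps.verifyReachability(sandboxName);
  if (outcome.status === "failed") {
    deps.error(`Local inference unreachable from ${sandboxName}: ${outcome.reason}`);
    deps.exit(1);
    return;
  }
  if (outcome.status === "ok") {
    deps.log(`Local inference reachable from ${sandboxName}`);
  }
}
```

Passing `exit` as a dependency instead of calling `process.exit` directly is what lets a unit test observe the failure path without terminating the test runner.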

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

nemoclaw onboarding with gpu fails because sandbox transitions to Error phase after Docker GPU patch #4316: The PR refactors GPU verification from a direct call to a centralized orchestrator and adds host-network reachability checks, directly addressing GPU-proof-related sandbox Error-phase failures.

Poem

🐰 I hopped through logs and curl's small cheer,
Containers greet hosts, no longer fear,
Proof and probe now dance in line,
Sandboxes ready, signals fine.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a reachability gate for host-network GPU local inference, which is the core objective of this PR. |
| Linked Issues check | ✅ Passed | The PR successfully addresses issue `#4509` by implementing a reachability verification gate that ensures GPU host-network sandboxes can reach local Ollama/vLLM endpoints before onboarding completes. |
| Out of Scope Changes check | ✅ Passed | All changes are focused and scoped to implementing the host-network reachability verification gate, with no unrelated modifications introduced. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard/docker-gpu-local-inference.ts`:
- Around line 293-297: The skip-path currently only calls options.log?.(...) so
the curl-missing warning is dropped when no logger is provided; update the
branch that checks containerHasCurl(containerId, dockerRunFn) to always emit the
warning (e.g., call console.warn or console.info) in addition to calling
options.log?.(...) so the operator always sees why the reachability probe was
skipped; keep the return { status: "skipped", reason: "probe-tool-unavailable" }
unchanged and make sure you reference the existing containerHasCurl,
dockerRunFn, and options.log? symbols.
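
A minimal sketch of the requested behavior, assuming a hypothetical helper name (`skipForMissingCurl`) and message wording; only the returned `status` and `reason` values come from the comment above. It uses the fallback form (the supplied logger when present, `console.warn` otherwise), which is one way to guarantee the warning is never dropped.

```typescript
type SkipResult = { status: "skipped"; reason: "probe-tool-unavailable" };

// Hypothetical helper: emits the curl-missing warning, then returns the skip result.
function skipForMissingCurl(
  containerId: string,
  log?: (message: string) => void,
): SkipResult {
  const message =
    `curl is not available in container ${containerId}; ` +
    "skipping the local inference reachability probe.";

  // Optional chaining alone (log?.(message)) would drop the warning when no
  // logger is wired, so fall back to console.warn in that case.
  if (log) {
    log(message);
  } else {
    console.warn(message);
  }

  return { status: "skipped", reason: "probe-tool-unavailable" };
}
```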

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5d0b971f-a5b6-499d-a4e9-c3792235bd2f

📥 Commits

Reviewing files that changed from the base of the PR and between df7d054 and 7f0388d.

📒 Files selected for processing (4)

src/lib/onboard.ts
src/lib/onboard/docker-gpu-local-inference.test.ts
src/lib/onboard/docker-gpu-local-inference.ts
test/e2e/test-gpu-e2e.sh

…DIA#4509) Address CodeRabbit review on PR NVIDIA#4609: the curl-missing soft-skip used options.log?.() which silently dropped the warning when no logger was wired, leaving the operator with no explanation for why the reachability proof was skipped. Fall back to console.warn so the skip is always visible. Also add concise docstrings to the new helper functions to clear the docstring-coverage warning. Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

wscurran · 2026-06-01T14:50:47Z

✨
Related open issues:

#4509 [Ubuntu 24.04][Inference] GPU host-network sandbox cannot reach local Ollama provider

## Summary

- Adds the v0.0.56 release notes section with links to the deeper docs pages for installer, status, inference, messaging, policy, and lifecycle changes.
- Updates source docs for the remaining release-prep gaps around `uv` in the PyPI preset, compact WhatsApp pairing guidance, and `nemoclaw inference set` command boundaries.
- Refreshes generated `nemoclaw-user-*` skills and removes skipped experimental command terms from generated skill surfaces.

## Source summary

- #4613 -> `docs/manage-sandboxes/lifecycle.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents that public installs and `nemoclaw update` follow the maintained `lkg` tag by default.
- #4419 -> `docs/about/release-notes.mdx`: Notes that non-interactive Linux installs can reactivate Docker group membership and continue in one installer run when `sg docker` is available.
- #4550 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures live sandbox agent-version probing for status, connect, and upgrade checks.
- #4609 -> `docs/inference/use-local-inference.mdx`, `docs/about/release-notes.mdx`: Captures the GPU Docker-driver host-network local-inference reachability gate.
- #4607 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents compact WhatsApp QR pairing guidance and gateway/session diagnostics.
- #4582 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Reflects Slack credential validation before enabling the channel.
- #4554 -> `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/troubleshooting.mdx`, `docs/about/release-notes.mdx`: Keeps Telegram allowlist alias guidance in the generated user skills and release notes.
- #4563 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Includes the new `nemoclaw <name> skill remove <skill>` command in command docs and release notes.
- #4566 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents the `nemoclaw inference set` redirect boundary when `--provider` or `--model` is missing.
- #4323 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures per-sandbox status JSON support.
- #4506 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures debug command sandbox-name validation and safer tarball writing.
- #4569 -> `docs/network-policy/integration-policy-examples.mdx`, `docs/about/release-notes.mdx`: Documents that the `pypi` preset allows `/usr/local/bin/uv`.
- #4579 -> `docs/network-policy/integration-policy-examples.mdx`, `docs/about/release-notes.mdx`: Captures observable Jira preset validation guidance.
- #4229 -> `docs/manage-sandboxes/lifecycle.mdx`, `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents user-data preservation defaults for uninstall.
- #4399 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures CPU-only sandbox intent preservation across rebuilds.
- #4058 -> `docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Captures safer snapshot restore behavior around existing destinations.
- #4155 and #4460 -> skipped by `docs/.docs-skip`: Removed skipped experimental command terms from source docs and generated skill evals instead of documenting those features.

## Verification

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `npm run docs` (passes; Fern reports the pre-existing light-mode accent contrast warning)
- `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" .agents/skills` (no matches)
- `npm run build:cli` (run to refresh local CLI artifacts for the pre-push TypeScript hook)
- Commit hooks passed, including `NEMOCLAW_* env-var documentation gate`, `Verify docs-to-skills output`, `markdownlint-cli2`, `gitleaks`, and `Test (skills YAML)`.

## Summary by CodeRabbit

* **Documentation**
  * Expanded Model Router setup with YAML examples, flow diagrams, and credential handling; strengthened agent-config immutability and integrity guidance; messaging channels updated (Telegram aliases, WhatsApp pairing/diagnostics); CLI docs revised (GPU detection, inference set behavior, uninstall/rebuild preservation); overview rebranded to NemoClaw and added v0.0.56 release notes.
* **New Features**
  * Added `nemoclaw <name> channels status` (messaging diagnostics, JSON); added `nemoclaw <name> skill remove`; Hermes no longer marked experimental; DGX Spark quickstart sandbox-name note.

## Summary

- Add the missing `v0.0.57` release-notes section with links to the detailed docs pages for command, inference, onboarding, messaging, status, installer, and policy changes.
- Remove public references to docs-skip terms from source docs and regenerate the NemoClaw user skills from the current Fern MDX docs.
- Carry forward generated references for the per-agent documentation split, including Hermes-specific reference files.

## Source summary

- #4615 and #4653 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover host-side `sessions` and `agents` commands plus `NEMOCLAW_EXTRA_AGENTS_JSON` secondary-agent baking.
- #4163, #4204, #4611, #4619, and #4676 -> `docs/about/release-notes.mdx`, `docs/inference/use-local-inference.mdx`: Release notes now cover managed vLLM progress/readiness, DGX Spark model default changes, local Ollama streaming usage, and inference route divergence warnings.
- #4267, #4601, #4609, #4642, #4645, and #4661 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover UFW auto-remediation, local-inference reachability gates, gateway reuse/binding, cancel rollback, and policy selection persistence.
- #4577, #4582, #4607, and #4660 -> `docs/about/release-notes.mdx`, `docs/manage-sandboxes/messaging-channels.mdx`: Release notes now cover Slack validation, atomic `channels add`, WhatsApp QR diagnostics, and Slack placeholder normalization.
- #4388, #4600, #4646, and #4647 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover status failure layers, paused-container hints, Docker-driver doctor behavior, and non-destructive stale-registry recovery.
- #4569, #4579, and #4678 -> `docs/about/release-notes.mdx`, `docs/manage-sandboxes/lifecycle.mdx`, `docs/network-policy/integration-policy-examples.mdx`: Release notes now cover installer tag pinning, PyPI `uv` policy access, and observable Jira validation.
- #4632 -> `.agents/skills/`: Regenerated user skills from the current per-agent docs source, including newly generated Hermes reference files.

## Verification

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" docs --glob "*.mdx"`
- `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" .agents/skills --glob "*.md"`
- `npm run docs`
- `npm run build:cli`
- Commit hooks: markdownlint, docs-to-skills verification, gitleaks, skills YAML, commitlint

## Summary by CodeRabbit

* **Documentation**
  * Restructured documentation to clearly distinguish OpenClaw and Hermes agent variants throughout user guides.
  * Enhanced security, credential storage, and deployment guidance with clearer setup flows.
  * Added Hermes plugin installation and ecosystem documentation.
  * Improved workspace, messaging, and policy management references with variant-specific command examples.
  * Refined troubleshooting and CLI reference sections for clarity.

NVIDIA#4509)

PR NVIDIA#4609 verified host-network GPU local inference with `docker exec` against the recreated `--network host` container, whose main network namespace IS the host's, so the probe passed while the OpenClaw agent, which runs in OpenShell's isolated sandbox network namespace, still got ECONNREFUSED on the direct 127.0.0.1 provider URL. The sandbox namespace cannot reach the host loopback even under `--network host` (see detectSandboxFallbackDns), so the direct-loopback wiring was unreachable.

- Never pin OpenClaw to a direct container-loopback inference URL; for local providers, downgrade an opted-in host-network GPU patch to the OpenShell bridge so inference routes through the reachable inference.local path (host networking is not needed for GPU access).
- Re-run the sandbox bridge reachability probe (with UFW auto-fix) after the downgrade, since gateway startup skipped it under host mode.
- Replace the docker-exec gate with a runtime-context probe via `openshell sandbox exec` that hits inference.local exactly as the agent does, requiring 2xx; 000/4xx/5xx fail with actionable recovery. Soft-skip only when the sandbox image genuinely lacks curl.
- Update the GPU E2E to prove inference through `openshell sandbox exec` (the real runtime), removing the docker-exec shortcut that masked the bug.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

#4509) (#5024)

## Summary

Reopened #4509: on an Ubuntu 24.04 GPU host-network setup, onboard printed "local inference reachable" yet the agent then failed with `ECONNREFUSED` / "LLM request failed: network connection error". PR #4609 proved reachability with `docker exec` against the recreated `--network host` container (whose *main* network namespace is the host's), but OpenClaw runs in OpenShell's **isolated sandbox network namespace**, which cannot reach the host loopback even under `--network host`. So the direct `127.0.0.1` provider URL was unreachable for the agent while the probe falsely passed. This fixes the URL/network mapping and verifies it from the real runtime context.

## Related Issue

Fixes #4509

## Changes

- **No direct container-loopback inference URL.** For local providers, an opted-in host-network GPU patch (`NEMOCLAW_DOCKER_GPU_PATCH_NETWORK=host`) is downgraded to the OpenShell bridge so inference routes through the reachable `inference.local` path. Host networking is unnecessary for GPU device access (that comes from the GPU mode flags). Non-local (cloud/routed/custom) GPU sandboxes are untouched.
- **Bridge reachability re-checked after the downgrade** (with UFW auto-fix), since gateway startup skipped that probe while host networking was still requested.
- **Runtime-context reachability gate.** The post-ready gate now probes `https://inference.local/v1/models` via `openshell sandbox exec` (the exact network namespace and route OpenClaw uses) instead of `docker exec`. Success requires a `2xx`; `000` (ECONNREFUSED), `4xx` (route/auth misconfig), and `5xx` (backend down) fail with actionable recovery. A genuinely missing `curl` soft-skips (OpenClaw's HTTP client does not need it); a broken sandbox exec path fails rather than masquerading as missing-curl.
- **GPU E2E** (`test/e2e/test-gpu-e2e.sh`) now proves inference through `openshell sandbox exec` (the real runtime) and asserts the new gate, removing the `docker exec` shortcut that masked the bug.
- `src/lib/onboard.ts` stays net-neutral (orchestration lives in `src/lib/onboard/`).

## Type of Change

- [x] Code change (feature, bug fix, or refactor)

## Verification

- [x] `npx prek run --files` on the changed files (TS/biome/spdx/shellcheck clean; the only failures were unrelated env-flakes, missing plugin `node_modules` and 5s CLI-spawn timeouts under a loaded host, which pass with deps installed and a normal timeout: 152/152)
- [x] `npm run build:cli`, `npm run typecheck:cli`
- [x] `npx vitest run` for the gate (21), `test/onboard.test.ts` (66), `docker-gpu-patch` (50), `inference/local` (65), `provider-inference` (13), `docker-gpu-sandbox-create` (5)
- [x] Tests added/updated for new and changed behavior (runtime-context probe, 2xx-only, local-only downgrade + bridge re-check, exec-failure vs missing-curl)
- [x] No secrets, API keys, or credentials committed

### Reporter-workflow E2E evidence

Full reporter reproduction requires Ubuntu 24.04 + NVIDIA GPU + native Docker (host-network GPU patch), which is not available on this CI-less dev host. The exact workflow is covered by the **GPU pipeline E2E** (`test/e2e/test-gpu-e2e.sh`, Brev GPU runner), which this PR extends to verify local inference **through `openshell sandbox exec`** (the agent runtime netns) and to assert the runtime-context gate, so a future regression cannot pass via the container-main-namespace shortcut.

The root-cause *mechanism* was reproduced locally and hermetically (no GPU needed), modeling the OpenShell Docker-driver topology: a `--network host` container plus an inner `unshare -n` namespace (how OpenShell runs the sandbox agent):

```
[A] container MAIN netns (== host loopback under --network host; what docker exec / PR #4609 hit):
    http_code=200
    RESULT: OK-MAIN (reaches host Ollama)
[B] INNER netns via unshare -n (== OpenShell sandbox agent runtime / openshell sandbox exec):
    http_code=000
    RESULT: FAIL-INNER (ECONNREFUSED — matches the reporter)
```

This confirms why the `docker exec` probe passed while the agent got `ECONNREFUSED`, and why routing through the OpenShell-managed `inference.local` path (on the bridge) is the reachable fix.

---

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>

## Summary by CodeRabbit

* **Bug Fixes**
  * Verify GPU local inference from inside the sandbox runtime (not via host-network probes), reducing false positives and handling curl/unreachability scenarios more robustly.
* **Refactor**
  * Default Docker GPU patching for local providers now uses the OpenShell-managed bridge instead of host networking to improve inference accessibility and consistency.
* **Tests**
  * End-to-end and unit tests updated to exercise the sandbox-side inference path and cover success, skip, retry, and failure cases.

Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
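
The 2xx-only rule from the follow-up PR can be sketched as a classifier over the status string that curl reports (curl's `%{http_code}` write-out prints `000` when no HTTP response was received). The function name, the `ProbeVerdict` type, and the cause strings are hypothetical; the mapping of `000`, `4xx`, and `5xx` to their causes follows the Changes list above.

```typescript
// Hypothetical result type for illustration.
type ProbeVerdict = { ok: true } | { ok: false; cause: string };

function classifyProbeStatus(httpCode: string): ProbeVerdict {
  const code = Number.parseInt(httpCode, 10);

  // "000" (or anything unparseable) means no HTTP response came back at all.
  if (Number.isNaN(code) || code === 0) {
    return { ok: false, cause: "no HTTP response (connection refused or timed out)" };
  }
  // Only 2xx counts as proof that the inference route works.
  if (code >= 200 && code < 300) return { ok: true };
  if (code >= 400 && code < 500) {
    return { ok: false, cause: "route or auth misconfiguration" };
  }
  if (code >= 500 && code < 600) return { ok: false, cause: "backend down" };
  // 1xx/3xx and anything else are not accepted as success.
  return { ok: false, cause: `unexpected status ${code}` };
}
```

Treating everything outside 2xx as a failure is stricter than a plain connectivity check, which matches the intent of proving the route the agent actually uses.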

coderabbitai Bot reviewed Jun 1, 2026

Comment thread src/lib/onboard/docker-gpu-local-inference.ts

yimoj added the v0.0.56 Release target label Jun 1, 2026

wscurran added the `Docker` and `provider: ollama` (Ollama local model provider behavior) labels Jun 1, 2026

cv added v0.0.57 Release target and removed v0.0.56 Release target labels Jun 1, 2026

cv approved these changes Jun 1, 2026

cv merged commit ca410b5 into NVIDIA:main Jun 1, 2026
30 checks passed

miyoungc mentioned this pull request Jun 1, 2026

docs: refresh 0.0.56 release documentation #4618

Merged

coderabbitai Bot mentioned this pull request Jun 1, 2026

fix(onboard): classify Docker GPU patch Error-phase failure (#4316) #4407

Merged

9 tasks

wscurran added the `area: packaging` (Packages, images, registries, installers, or distribution), `bug-fix` (PR fixes a bug or regression), and `platform: container` (Affects Docker, containerd, Podman, or images) labels and removed the `area: packaging` label Jun 3, 2026

coderabbitai Bot mentioned this pull request Jun 3, 2026

fix(inference): prove WSL Docker Desktop GPUs and report sandbox CUDA proof state #4599

Merged

8 tasks

wscurran removed Docker labels Jun 3, 2026

miyoungc mentioned this pull request Jun 3, 2026

docs: refresh 0.0.57 release docs #4716

Merged

yimoj mentioned this pull request Jun 9, 2026

fix(onboard): prove GPU sandbox local inference from the agent runtime (#4509) #5024

Merged

6 tasks

cv merged 2 commits into NVIDIA:main from yimoj:fix/4509-host-network-ollama-proof

3 participants
