fix(onboard): relaunch Windows Ollama after install#3330
Conversation
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR adds a PowerShell-safe quoting helper, refactors Windows Ollama relaunch to accept an installed binary path, retries launch using the installer-discovered path on readiness failure, centralizes timeout diagnostics, and updates onboarding to clear cached host state on fallback. ChangesWindows Ollama Installation & Startup Retry
Sequence Diagram(s)sequenceDiagram
participant OnboardCLI
participant WindowsInstaller
participant PowerShell
participant HostProbe as host.docker.internal:11434
OnboardCLI->>WindowsInstaller: install Ollama (returns installedPath)
WindowsInstaller-->>OnboardCLI: installedPath
OnboardCLI->>PowerShell: Start-Process (OLLAMA_HOST=0.0.0.0:11434)
PowerShell->>HostProbe: awaitWindowsOllamaReady probe (curl)
alt probe fails
OnboardCLI->>PowerShell: Start-Process with installedPath (-ArgumentList 'serve')
PowerShell->>HostProbe: awaitWindowsOllamaReady probe (retry)
alt retry fails
OnboardCLI->>OnboardCLI: printWindowsOllamaTimeoutDiagnostics()
OnboardCLI->>OnboardCLI: resetOllamaHostCache() and fallback/exit
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/lib/inference/ollama/windows.ts (1)
203-207: ⚡ Quick winHarden printed diagnostics to avoid hanging during triage.
The timeout diagnostics should print a bounded curl command so troubleshooting output remains fast and deterministic.
Suggested diff
- console.error(' powershell.exe -Command "Get-Process ollama*"'); + console.error(' powershell.exe -Command "Get-Process ollama* -ErrorAction SilentlyContinue"'); console.error( - ' powershell.exe -Command "Get-NetTCPConnection -LocalPort 11434 -State Listen"', + ' powershell.exe -Command "Get-NetTCPConnection -LocalPort 11434 -State Listen -ErrorAction SilentlyContinue"', ); - console.error(` curl http://host.docker.internal:${OLLAMA_PORT}/api/tags`); + console.error( + ` curl -sS --connect-timeout 2 --max-time 5 http://host.docker.internal:${OLLAMA_PORT}/api/tags`, + );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/inference/ollama/windows.ts` around lines 203 - 207, The printed curl diagnostic currently can hang; update the console.error that prints the curl command (the one using OLLAMA_PORT) to include bounded timeout flags (e.g. --max-time and --connect-timeout, and optionally --fail and -sS) so the troubleshooting curl is fast and deterministic; locate the console.error that outputs `curl http://host.docker.internal:${OLLAMA_PORT}/api/tags` in src/lib/inference/ollama/windows.ts and replace the unbounded example with a bounded curl invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/lib/inference/ollama/windows.ts`:
- Around line 203-207: The printed curl diagnostic currently can hang; update
the console.error that prints the curl command (the one using OLLAMA_PORT) to
include bounded timeout flags (e.g. --max-time and --connect-timeout, and
optionally --fail and -sS) so the troubleshooting curl is fast and
deterministic; locate the console.error that outputs `curl
http://host.docker.internal:${OLLAMA_PORT}/api/tags` in
src/lib/inference/ollama/windows.ts and replace the unbounded example with a
bounded curl invocation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 27daadcc-282b-4622-a932-af9621317bcd
📒 Files selected for processing (3)
src/lib/inference/ollama/windows.tssrc/lib/onboard.tstest/onboard-selection.test.ts
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lib/inference/ollama/windows.ts (1)
149-172:⚠️ Potential issue | 🟠 Major | ⚡ Quick winWatcher launch failure should fall back to
installedPathbefore failingWhen both paths exist, Line 149 gives watcher priority; if that launch fails (Lines 166-172), the function returns
falseimmediately and never attempts the verifiedinstalledPath. This can still bounce onboarding back to provider selection on stale watcher-path systems.Suggested fix
function launchAndAwaitWindowsOllama( opts: { watcherPath?: string; installedPath?: string } = {}, ): boolean { console.log(" Starting Ollama on Windows host via WSL interop..."); const watcherPath = typeof opts.watcherPath === "string" ? opts.watcherPath.trim() : ""; const installedPath = typeof opts.installedPath === "string" ? opts.installedPath.trim() : ""; - let launchScript: string; - if (watcherPath) { - launchScript = - `$env:OLLAMA_HOST='0.0.0.0:11434'; Start-Process -FilePath ${psSingleQuote(watcherPath)} ` + - "-WindowStyle Hidden"; - } else if (installedPath) { - launchScript = - `$env:OLLAMA_HOST='0.0.0.0:11434'; Start-Process -FilePath ${psSingleQuote(installedPath)} ` + - "-ArgumentList 'serve' -WindowStyle Hidden"; - } else { - launchScript = - "$env:PATH = [Environment]::GetEnvironmentVariable('PATH','Machine') + ';' + [Environment]::GetEnvironmentVariable('PATH','User'); " + - "$env:OLLAMA_HOST='0.0.0.0:11434'; Start-Process -FilePath ollama.exe -ArgumentList serve -WindowStyle Hidden"; - } - const result = run(["powershell.exe", "-Command", launchScript], { - ignoreError: true, - suppressOutput: true, - }); - if (result.status !== 0) { - const stderr = String(result.stderr || "").trim(); - console.error( - ` PowerShell launch failed (exit ${result.status})${stderr ? `: ${stderr}` : ""}`, - ); - return false; - } - return awaitWindowsOllamaReady(); + const launchScripts: string[] = []; + if (watcherPath) { + launchScripts.push( + `$env:OLLAMA_HOST='0.0.0.0:11434'; Start-Process -FilePath ${psSingleQuote(watcherPath)} -WindowStyle Hidden`, + ); + } + if (installedPath) { + launchScripts.push( + `$env:OLLAMA_HOST='0.0.0.0:11434'; Start-Process -FilePath ${psSingleQuote(installedPath)} -ArgumentList 'serve' -WindowStyle Hidden`, + ); + } + if (launchScripts.length === 0) { + launchScripts.push( + "$env:PATH = [Environment]::GetEnvironmentVariable('PATH','Machine') + ';' + [Environment]::GetEnvironmentVariable('PATH','User'); " + + "$env:OLLAMA_HOST='0.0.0.0:11434'; Start-Process -FilePath ollama.exe -ArgumentList serve -WindowStyle Hidden", + ); + } + + for (const launchScript of launchScripts) { + const result = run(["powershell.exe", "-Command", launchScript], { + ignoreError: true, + suppressOutput: true, + }); + if (result.status === 0 && awaitWindowsOllamaReady()) { + return true; + } + } + return false; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/inference/ollama/windows.ts` around lines 149 - 172, The current logic gives watcherPath priority and returns false on its launch failure, skipping a possible fallback to installedPath; change the sequence in the block that builds launchScript and calls run so that if watcherPath is present you attempt launching it first (using the existing watcherPath launchScript and run call) but if that run returns a non-zero status and installedPath is available, rebuild launchScript for installedPath and call run again; only return false if both attempts fail (preserve the same error logging using result.stderr), and keep the original behavior when only installedPath or neither path exists.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/lib/inference/ollama/windows.ts`:
- Around line 149-172: The current logic gives watcherPath priority and returns
false on its launch failure, skipping a possible fallback to installedPath;
change the sequence in the block that builds launchScript and calls run so that
if watcherPath is present you attempt launching it first (using the existing
watcherPath launchScript and run call) but if that run returns a non-zero status
and installedPath is available, rebuild launchScript for installedPath and call
run again; only return false if both attempts fail (preserve the same error
logging using result.stderr), and keep the original behavior when only
installedPath or neither path exists.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8ecb5b2f-5aa2-4b5b-9584-ca8f148efdbf
📒 Files selected for processing (1)
src/lib/inference/ollama/windows.ts
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
ericksoa
left a comment
There was a problem hiding this comment.
Reviewed the Windows-host Ollama install fallback path and the focused regression coverage. The change is scoped to WSL/Windows Ollama onboarding, preserves the existing watcher-first restart behavior, falls back to the verified installed ollama.exe path when auto-start is not reachable, and the no-commit merge with current main passed local validation.
## Summary Refreshes the release-prep docs for v0.0.39 based on changes merged since the Friday 4pm doc refresh. Updates the source docs, bumps the docs version metadata, and regenerates the NemoClaw user skills from the refreshed docs. ## Changes - #3314 -> `docs/get-started/prerequisites.md`, `docs/get-started/quickstart.md`, `docs/reference/troubleshooting.md`: Documents installer Docker setup, Docker group activation, and retry guidance. - #3317 -> `docs/get-started/quickstart.md`, `docs/reference/commands.md`: Documents the DGX Spark and DGX Station express install prompt and `NEMOCLAW_NO_EXPRESS`. - #3328 and #3329 -> `docs/security/best-practices.md`, `docs/deployment/sandbox-hardening.md`: Updates sandbox capability hardening docs for the stricter bounding-set and `setpriv` step-down behavior. - #3330, #3335, and #3346 -> `docs/inference/use-local-inference.md`: Documents Windows-host Ollama relaunch behavior, NIM key passthrough, early health-fail diagnostics, and mixed-GPU preflight detail. - #2406, #2883, #3001, #3244, #3267, #3318, #3320, and #3354 -> `docs/about/release-notes.md`: Adds the v0.0.39 release-prep section while keeping the v0.0.38 release notes intact. - Advances the release-prep docs metadata from v0.0.38 to v0.0.39. - Regenerates `.agents/skills/nemoclaw-user-*` from the updated source docs. ## Type of Change - [ ] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [x] Doc only (includes code sample changes) ## Verification - [x] `npx prek run --all-files` passes - [ ] `npm test` passes - [ ] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [x] Docs updated for user-facing behavior changes - [x] `make docs` builds without warnings (doc changes only) - [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) --- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes v0.0.39 * **New Features** * Host alias management commands for easier configuration * Sandbox GPU control options during onboarding * Update command with check and confirmation modes * **Documentation** * Enhanced Linux installer guidance with Docker and group membership handling * Expanded troubleshooting for permission and connectivity issues * Improved capability-dropping security documentation * Updated inference model switching commands * Brev environment-specific troubleshooting * **Improvements** * DGX Spark/Station express install flow * Windows Ollama relay and health-check enhancements * NVIDIA NIM preflight GPU reporting [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/3375) <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Make the WSL Windows-host Ollama install path explicitly relaunch Ollama when the installer does not leave a reachable daemon. This prevents onboarding from returning to the provider menu after a successful Windows-host install whose auto-start side effect failed.
Related Issue
Fixes #3262
Changes
ollama.exepath into the robust Windows-host restart helper after install readiness fails.OLLAMA_HOST=0.0.0.0:11434, stop stale Windows Ollama watcher/daemon processes in the existing safe order, and launch either the watcher path or verified install path.Get-Process,Get-NetTCPConnection, andhost.docker.internalcurl probes.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Additional verification performed:
npm run build:clipasses.npx vitest run test/onboard-selection.test.ts -t "restarts Windows-host Ollama"passes.npx vitest run test/onboard-selection.test.tspasses.npm run typecheck:clipasses.npm run lintpasses.npm testwas run but does not pass in this worktree due unrelated failures intest/install-preflight.test.tsandtest/fetch-guard-patch-regression.test.ts; the fetch-guard failure was reproduced independently as a non-root permission error removing/usr/local/lib/node_modules/openclaw.npx prek run --all-fileswas run after installing localhadolint; it fails only in the CLI coverage hook on the same unrelatedtest/fetch-guard-patch-regression.test.tsfailure.Signed-off-by: Yimo Jiang yimoj@nvidia.com
Summary by CodeRabbit
New Features
Bug Fixes
Tests