fix(onboard): make the Ollama model pull timeout configurable#3037
Conversation
The Ollama model pull was wrapped with a hardcoded 10-minute wall-clock timeout that fires regardless of progress, so any sustained download slower than ~8 MB/s could not complete the default 4.7 GB qwen2.5:7b pull. Each retry resumed from the last layer but still hit the same SIGTERM at the next 10-minute boundary. Replace the hardcoded value in both pullOllamaModelViaCli and pullOllamaModelViaHttp with a single getOllamaPullTimeoutMs() helper that reads NEMOCLAW_OLLAMA_PULL_TIMEOUT (in seconds) and falls back to a 30-minute default. Update the CLI and HTTP timeout error messages to mention the env var override and that already-downloaded layers are kept across retries. Add unit coverage for the helper's parsing behaviour: default fallback for unset/empty/non-numeric/non-positive inputs, and millisecond conversion for valid numeric inputs. Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis change adds environment-variable configurability for Ollama model-pull timeouts via ChangesTimeout Configurability & Error Messaging
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lib/onboard-ollama-proxy.ts (1)
354-366:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove
Math.floor()to preserve timeout precision between HTTP and CLI paths.The HTTP pull path converts timeout to seconds for curl via
Math.floor(TIMEOUT_MS / 1000), which truncates fractional values and can produce0for sub-second timeouts (disabling the timeout entirely per curl docs). This creates cross-path behavior drift: the CLI path uses the full millisecond precision fromgetOllamaPullTimeoutMs(), while the HTTP path floors it.Proposed fix
- const TIMEOUT_MS = getOllamaPullTimeoutMs(); + const TIMEOUT_MS = getOllamaPullTimeoutMs(); + const timeoutSeconds = TIMEOUT_MS / 1000; @@ - String(Math.floor(TIMEOUT_MS / 1000)), + String(timeoutSeconds),🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/onboard-ollama-proxy.ts` around lines 354 - 366, The CLI curl spawn is passing a floored seconds value to "--max-time" which truncates fractional seconds and can become "0"; change the argument construction so it preserves millisecond precision by removing Math.floor and passing the exact seconds string (e.g. String(TIMEOUT_MS / 1000)) when building the spawn args for "--max-time" (see TIMEOUT_MS, getOllamaPullTimeoutMs, and the spawn call with the "--max-time" argument).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/lib/onboard-ollama-proxy.ts`:
- Around line 354-366: The CLI curl spawn is passing a floored seconds value to
"--max-time" which truncates fractional seconds and can become "0"; change the
argument construction so it preserves millisecond precision by removing
Math.floor and passing the exact seconds string (e.g. String(TIMEOUT_MS / 1000))
when building the spawn args for "--max-time" (see TIMEOUT_MS,
getOllamaPullTimeoutMs, and the spawn call with the "--max-time" argument).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ed39b1ee-a051-48e4-a120-163c813e20e3
📒 Files selected for processing (2)
src/lib/onboard-ollama-proxy.tstest/ollama-pull-timeout.test.ts
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
ericksoa
left a comment
There was a problem hiding this comment.
Looks good. I verified this against current main, including the linked Ollama timeout issue and CodeRabbit's fractional-timeout concern.
The PR now keeps the configurable timeout shared across the CLI and HTTP pull paths, preserves sub-second precision for curl, and covers the parser plus HTTP-path wiring with focused tests. Local validation passed (
nemoclaw@0.1.0 build:cli
tsc -p tsconfig.src.json && tsc -p nemoclaw-blueprint/tsconfig.json, targeted Ollama timeout/proxy tests, and the Ollama onboarding selection slice), and the current PR head is green across the visible GitHub checks.
#3122) ## Summary #3116 reports that `NEMOCLAW_OLLAMA_PULL_TIMEOUT` was added in code by #3037 (`PULL_TIMEOUT_ENV` constant in `src/lib/onboard-ollama-proxy.ts`, gate inside `getOllamaPullTimeoutMs`) but was never added to `docs/reference/commands.md`. Users hitting the 30-minute default on slow links cannot discover how to raise it; the only mention is the in-flight error hint when the timeout actually fires. ## Problem `NEMOCLAW_OLLAMA_PULL_TIMEOUT` is read by `getOllamaPullTimeoutMs()` (defaults to 30 minutes / 1800 seconds, accepts integer or float seconds, validated for finite-positive). It controls the wall-clock cap on `ollama pull` during onboard. Without docs, users only learn the variable name from the timeout-error hint after the failure. ## Changes - Add a "Onboard timeouts" subsection to the existing Environment Variables section of `docs/reference/commands.md`. New table row documents the variable, default (`1800` / 30 minutes), accepted format (integer or float seconds), and the partial-download-resume behavior. One-line console example. - Mirror the same addition into `.agents/skills/nemoclaw-user-reference/references/commands.md` so the auto-generated skill stays aligned with the source. ## Type of Change - [x] Code change with doc updates - [x] Doc only (prose changes, no code sample modifications) ## Test plan - [x] No code change; no behavioral test added. - [x] Doc claims (default, format, behavior) match the implementation in `src/lib/onboard-ollama-proxy.ts:311-330` and the error-hint text at `pullTimeoutErrorHint`. - [x] Skill mirror addition is identical to the source addition (table row + console example + trailing paragraph). Closes #3116 --- Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Documentation** * Added an "Onboard timeouts" section explaining timeout controls for model pulls during onboarding. * Introduced NEMOCLAW_OLLAMA_PULL_TIMEOUT (default: 1800 seconds). Notes: accepts integer or float values, reported in minutes on timeout, preserves already-downloaded layers so retries resume, and includes configuration examples and guidance for increasing the limit. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com> Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com> Co-authored-by: cjagwani <cjagwani@nvidia.com> Co-authored-by: J. Yaunches <jyaunches@nvidia.com>
Summary
The Ollama model pull was wrapped with a hardcoded 10-minute wall-clock timeout in both pull paths, so sustained network throughput below ~8 MB/s could not complete the default 4.7 GB qwen2.5:7b pull (each retry resumed from the last layer but still hit SIGTERM at the next 10-minute boundary). Replace the hardcoded value with a
NEMOCLAW_OLLAMA_PULL_TIMEOUTenv-var override (seconds) defaulting to 30 minutes, and surface the override hint in the timeout error messages.Related Issue
Closes #2610
Changes
DEFAULT_OLLAMA_PULL_TIMEOUT_MS(30 minutes),getOllamaPullTimeoutMs()(parsesNEMOCLAW_OLLAMA_PULL_TIMEOUTin seconds with fallback for unset / empty / non-numeric / non-positive inputs), andpullTimeoutErrorHint()(prints elapsed minutes, the resume-on-retry note, and the env var hint) insrc/lib/onboard-ollama-proxy.ts.pullOllamaModelViaCli(replacing the hardcodedtimeout: 600_000onspawnSync) andpullOllamaModelViaHttp(replacing the hardcodedTIMEOUT_MS = 600_000on the curl--max-timedeadline). The HTTP-path non-timeout error path retains its existing wording.getOllamaPullTimeoutMsso the unit tests can exercise the parser without shelling out toollama pull.test/ollama-pull-timeout.test.tscovering: default fallback when unset, empty, or whitespace; positive integer seconds → ms conversion; truncated fractional seconds; non-numeric fallback; zero/negative fallback.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Signed-off-by: Tinson Lai tinsonl@nvidia.com
Summary by CodeRabbit
New Features
Tests