fix(onboard): make the Ollama model pull timeout configurable by laitingsheng · Pull Request #3037 · NVIDIA/NemoClaw

laitingsheng · 2026-05-05T12:33:34Z

Summary

The Ollama model pull was wrapped with a hardcoded 10-minute wall-clock timeout in both pull paths, so sustained network throughput below ~8 MB/s could not complete the default 4.7 GB qwen2.5:7b pull (each retry resumed from the last layer but still hit SIGTERM at the next 10-minute boundary). Replace the hardcoded value with a NEMOCLAW_OLLAMA_PULL_TIMEOUT env-var override (seconds) defaulting to 30 minutes, and surface the override hint in the timeout error messages.

Related Issue

Closes #2610

Changes

Add DEFAULT_OLLAMA_PULL_TIMEOUT_MS (30 minutes), getOllamaPullTimeoutMs() (parses NEMOCLAW_OLLAMA_PULL_TIMEOUT in seconds with fallback for unset / empty / non-numeric / non-positive inputs), and pullTimeoutErrorHint() (prints elapsed minutes, the resume-on-retry note, and the env var hint) in src/lib/onboard-ollama-proxy.ts.
Wire the helper into pullOllamaModelViaCli (replacing the hardcoded timeout: 600_000 on spawnSync) and pullOllamaModelViaHttp (replacing the hardcoded TIMEOUT_MS = 600_000 on the curl --max-time deadline). The HTTP-path non-timeout error path retains its existing wording.
Export getOllamaPullTimeoutMs so the unit tests can exercise the parser without shelling out to ollama pull.
Add test/ollama-pull-timeout.test.ts covering: default fallback when unset, empty, or whitespace; positive integer seconds → ms conversion; truncated fractional seconds; non-numeric fallback; zero/negative fallback.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai tinsonl@nvidia.com

Summary by CodeRabbit

New Features
- Added configurable Ollama model-pull timeout via environment variable (30-minute default) with improved timeout error messaging.
Tests
- Added test suite validating timeout configuration, conversion, and error handling behavior.

The Ollama model pull was wrapped with a hardcoded 10-minute wall-clock timeout that fires regardless of progress, so any sustained download slower than ~8 MB/s could not complete the default 4.7 GB qwen2.5:7b pull. Each retry resumed from the last layer but still hit the same SIGTERM at the next 10-minute boundary. Replace the hardcoded value in both pullOllamaModelViaCli and pullOllamaModelViaHttp with a single getOllamaPullTimeoutMs() helper that reads NEMOCLAW_OLLAMA_PULL_TIMEOUT (in seconds) and falls back to a 30-minute default. Update the CLI and HTTP timeout error messages to mention the env var override and that already-downloaded layers are kept across retries. Add unit coverage for the helper's parsing behaviour: default fallback for unset/empty/non-numeric/non-positive inputs, and millisecond conversion for valid numeric inputs. Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

copy-pr-bot · 2026-05-05T12:33:38Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-05-05T12:33:43Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: acdfc06f-16a5-46da-87cb-33e13d7a369b

📥 Commits

Reviewing files that changed from the base of the PR and between b346375 and 96f88b0.

📒 Files selected for processing (2)

src/lib/onboard-ollama-proxy.ts
test/ollama-pull-timeout.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/lib/onboard-ollama-proxy.ts

📝 Walkthrough

Walkthrough

This change adds environment-variable configurability for Ollama model-pull timeouts via NEMOCLAW_OLLAMA_PULL_TIMEOUT. It replaces a hardcoded 10-minute wall-clock timeout with a 30-minute default (adjustable via env var), applies the timeout to both CLI and HTTP pull methods, and emits standardized timeout error messages that reference the effective timeout value.

Changes

Timeout Configurability & Error Messaging

Layer / File(s)	Summary
Config Parsing & Validation `src/lib/onboard-ollama-proxy.ts` (lines 309–328)	Introduces `DEFAULT_OLLAMA_PULL_TIMEOUT_MS` (30 min) and `getOllamaPullTimeoutMs()` to parse/validate `NEMOCLAW_OLLAMA_PULL_TIMEOUT`, returning milliseconds. Adds `pullTimeoutErrorHint(timeoutMs)` helper to produce consistent timeout error messages.
CLI Pull Integration `src/lib/onboard-ollama-proxy.ts` (lines 329–341)	`pullOllamaModelViaCli` computes timeout via `getOllamaPullTimeoutMs()` and passes it to `spawnSync` timeout option; on `SIGTERM`, prints `pullTimeoutErrorHint(timeoutMs)` instead of hardcoded "10 minutes" message.
HTTP Pull Integration `src/lib/onboard-ollama-proxy.ts` (lines 354–366, 472–476)	`pullOllamaModelViaHttp` calculates `TIMEOUT_MS` using `getOllamaPullTimeoutMs()` and sets curl's `--max-time` accordingly. Curl exit code `28` (timeout) emits `pullTimeoutErrorHint(TIMEOUT_MS)`; other errors retain existing messaging.
Export `src/lib/onboard-ollama-proxy.ts` (lines 570–573)	Exports `getOllamaPullTimeoutMs` for external use.
Comprehensive Test Suite `test/ollama-pull-timeout.test.ts`	Tests default behavior (30 min when env var unset/empty/whitespace), seconds-to-milliseconds conversion, fractional-second precision preservation, fallback for non-numeric/zero/negative inputs, and integration-style verification that fractional precision is preserved in curl's `--max-time` argument. Environment variable is restored after each test.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A timeout so long—thirty minutes to roam,
Ollama pulls never far from home.
With env vars to tune, the rabbit rejoiced,
No more "Try a smaller!"—the user has choice.
Fractional seconds preserved, with care,
Downloads now finish. The network's fair! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly summarizes the main change: making the Ollama model pull timeout configurable instead of hardcoded.
Linked Issues check	✅ Passed	The PR implements all coding requirements from issue `#2610`: configurable timeout via NEMOCLAW_OLLAMA_PULL_TIMEOUT env var, 30-minute default, updated error messaging with resume hint and env-var guidance, and applies timeout to both CLI and HTTP pull paths.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with issue `#2610` requirements: timeout configurability, error messaging, and test coverage for the new functionality.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/ollama-pull-timeout

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/lib/onboard-ollama-proxy.ts (1)
354-366: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove Math.floor() to preserve timeout precision between HTTP and CLI paths.

The HTTP pull path converts timeout to seconds for curl via Math.floor(TIMEOUT_MS / 1000), which truncates fractional values and can produce 0 for sub-second timeouts (disabling the timeout entirely per curl docs). This creates cross-path behavior drift: the CLI path uses the full millisecond precision from getOllamaPullTimeoutMs(), while the HTTP path floors it.
Proposed fix
-    const TIMEOUT_MS = getOllamaPullTimeoutMs();
+    const TIMEOUT_MS = getOllamaPullTimeoutMs();
+    const timeoutSeconds = TIMEOUT_MS / 1000;
@@
-        String(Math.floor(TIMEOUT_MS / 1000)),
+        String(timeoutSeconds),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard-ollama-proxy.ts` around lines 354 - 366, The CLI curl spawn
is passing a floored seconds value to "--max-time" which truncates fractional
seconds and can become "0"; change the argument construction so it preserves
millisecond precision by removing Math.floor and passing the exact seconds
string (e.g. String(TIMEOUT_MS / 1000)) when building the spawn args for
"--max-time" (see TIMEOUT_MS, getOllamaPullTimeoutMs, and the spawn call with
the "--max-time" argument).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/lib/onboard-ollama-proxy.ts`:
- Around line 354-366: The CLI curl spawn is passing a floored seconds value to
"--max-time" which truncates fractional seconds and can become "0"; change the
argument construction so it preserves millisecond precision by removing
Math.floor and passing the exact seconds string (e.g. String(TIMEOUT_MS / 1000))
when building the spawn args for "--max-time" (see TIMEOUT_MS,
getOllamaPullTimeoutMs, and the spawn call with the "--max-time" argument).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ed39b1ee-a051-48e4-a120-163c813e20e3

📥 Commits

Reviewing files that changed from the base of the PR and between 6c07b82 and b346375.

📒 Files selected for processing (2)

src/lib/onboard-ollama-proxy.ts
test/ollama-pull-timeout.test.ts

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

ericksoa

Looks good. I verified this against current main, including the linked Ollama timeout issue and CodeRabbit's fractional-timeout concern.

The PR now keeps the configurable timeout shared across the CLI and HTTP pull paths, preserves sub-second precision for curl, and covers the parser plus HTTP-path wiring with focused tests. Local validation passed (

nemoclaw@0.1.0 build:cli
tsc -p tsconfig.src.json && tsc -p nemoclaw-blueprint/tsconfig.json, targeted Ollama timeout/proxy tests, and the Ollama onboarding selection slice), and the current PR head is green across the visible GitHub checks.

#3122) ## Summary #3116 reports that `NEMOCLAW_OLLAMA_PULL_TIMEOUT` was added in code by #3037 (`PULL_TIMEOUT_ENV` constant in `src/lib/onboard-ollama-proxy.ts`, gate inside `getOllamaPullTimeoutMs`) but was never added to `docs/reference/commands.md`. Users hitting the 30-minute default on slow links cannot discover how to raise it; the only mention is the in-flight error hint when the timeout actually fires. ## Problem `NEMOCLAW_OLLAMA_PULL_TIMEOUT` is read by `getOllamaPullTimeoutMs()` (defaults to 30 minutes / 1800 seconds, accepts integer or float seconds, validated for finite-positive). It controls the wall-clock cap on `ollama pull` during onboard. Without docs, users only learn the variable name from the timeout-error hint after the failure. ## Changes - Add a "Onboard timeouts" subsection to the existing Environment Variables section of `docs/reference/commands.md`. New table row documents the variable, default (`1800` / 30 minutes), accepted format (integer or float seconds), and the partial-download-resume behavior. One-line console example. - Mirror the same addition into `.agents/skills/nemoclaw-user-reference/references/commands.md` so the auto-generated skill stays aligned with the source. ## Type of Change - [x] Code change with doc updates - [x] Doc only (prose changes, no code sample modifications) ## Test plan - [x] No code change; no behavioral test added. - [x] Doc claims (default, format, behavior) match the implementation in `src/lib/onboard-ollama-proxy.ts:311-330` and the error-hint text at `pullTimeoutErrorHint`. - [x] Skill mirror addition is identical to the source addition (table row + console example + trailing paragraph). Closes #3116 --- Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>  ## Summary by CodeRabbit * **Documentation** * Added an "Onboard timeouts" section explaining timeout controls for model pulls during onboarding. * Introduced NEMOCLAW_OLLAMA_PULL_TIMEOUT (default: 1800 seconds). Notes: accepts integer or float values, reported in minutes on timeout, preserves already-downloaded layers so retries resume, and includes configuration examples and guidance for increasing the limit.  Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com> Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com> Co-authored-by: cjagwani <cjagwani@nvidia.com> Co-authored-by: J. Yaunches <jyaunches@nvidia.com>

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

laitingsheng marked this pull request as ready for review May 5, 2026 12:53

laitingsheng added the v0.0.35 label May 5, 2026

fix(onboard): preserve fractional Ollama HTTP timeout

96f88b0

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

ericksoa assigned ericksoa and laitingsheng and unassigned ericksoa May 5, 2026

ericksoa self-requested a review May 5, 2026 14:09

ericksoa added bug Something fails against expected or documented behavior provider: ollama Ollama local model provider behavior labels May 5, 2026

ericksoa approved these changes May 5, 2026

View reviewed changes

ericksoa merged commit eed5c46 into main May 5, 2026
19 checks passed

ericksoa deleted the fix/ollama-pull-timeout branch May 5, 2026 14:16

latenighthackathon mentioned this pull request May 6, 2026

docs(reference): document NEMOCLAW_OLLAMA_PULL_TIMEOUT env var (#3116) #3122

Merged

5 tasks

wscurran mentioned this pull request May 6, 2026

[All Platforms][Docs] NEMOCLAW_OLLAMA_PULL_TIMEOUT env var not documented in commands-reference #3116

Closed

cjagwani mentioned this pull request May 7, 2026

ci: enforce documentation for all user-facing NEMOCLAW_* env vars #3184

Closed

wscurran added bug-fix PR fixes a bug or regression and removed bug Something fails against expected or documented behavior labels Jun 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(onboard): make the Ollama model pull timeout configurable#3037

fix(onboard): make the Ollama model pull timeout configurable#3037
ericksoa merged 2 commits into
mainfrom
fix/ollama-pull-timeout

laitingsheng commented May 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

copy-pr-bot Bot commented May 5, 2026

Uh oh!

coderabbitai Bot commented May 5, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

ericksoa left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

laitingsheng commented May 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issue

Changes

Type of Change

Verification

Summary by CodeRabbit

Uh oh!

copy-pr-bot Bot commented May 5, 2026

Uh oh!

coderabbitai Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

ericksoa left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

laitingsheng commented May 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 5, 2026 •

edited

Loading