Skip to content

fix(onboard): make the Ollama model pull timeout configurable#3037

Merged
ericksoa merged 2 commits into
mainfrom
fix/ollama-pull-timeout
May 5, 2026
Merged

fix(onboard): make the Ollama model pull timeout configurable#3037
ericksoa merged 2 commits into
mainfrom
fix/ollama-pull-timeout

Conversation

@laitingsheng

@laitingsheng laitingsheng commented May 5, 2026

Copy link
Copy Markdown
Contributor

Summary

The Ollama model pull was wrapped with a hardcoded 10-minute wall-clock timeout in both pull paths, so sustained network throughput below ~8 MB/s could not complete the default 4.7 GB qwen2.5:7b pull (each retry resumed from the last layer but still hit SIGTERM at the next 10-minute boundary). Replace the hardcoded value with a NEMOCLAW_OLLAMA_PULL_TIMEOUT env-var override (seconds) defaulting to 30 minutes, and surface the override hint in the timeout error messages.

Related Issue

Closes #2610

Changes

  • Add DEFAULT_OLLAMA_PULL_TIMEOUT_MS (30 minutes), getOllamaPullTimeoutMs() (parses NEMOCLAW_OLLAMA_PULL_TIMEOUT in seconds with fallback for unset / empty / non-numeric / non-positive inputs), and pullTimeoutErrorHint() (prints elapsed minutes, the resume-on-retry note, and the env var hint) in src/lib/onboard-ollama-proxy.ts.
  • Wire the helper into pullOllamaModelViaCli (replacing the hardcoded timeout: 600_000 on spawnSync) and pullOllamaModelViaHttp (replacing the hardcoded TIMEOUT_MS = 600_000 on the curl --max-time deadline). The HTTP-path non-timeout error path retains its existing wording.
  • Export getOllamaPullTimeoutMs so the unit tests can exercise the parser without shelling out to ollama pull.
  • Add test/ollama-pull-timeout.test.ts covering: default fallback when unset, empty, or whitespace; positive integer seconds → ms conversion; truncated fractional seconds; non-numeric fallback; zero/negative fallback.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai tinsonl@nvidia.com

Summary by CodeRabbit

  • New Features

    • Added configurable Ollama model-pull timeout via environment variable (30-minute default) with improved timeout error messaging.
  • Tests

    • Added test suite validating timeout configuration, conversion, and error handling behavior.

The Ollama model pull was wrapped with a hardcoded 10-minute wall-clock timeout that fires regardless of progress, so any sustained download slower than ~8 MB/s could not complete the default 4.7 GB qwen2.5:7b pull. Each retry resumed from the last layer but still hit the same SIGTERM at the next 10-minute boundary.

Replace the hardcoded value in both pullOllamaModelViaCli and pullOllamaModelViaHttp with a single getOllamaPullTimeoutMs() helper that reads NEMOCLAW_OLLAMA_PULL_TIMEOUT (in seconds) and falls back to a 30-minute default. Update the CLI and HTTP timeout error messages to mention the env var override and that already-downloaded layers are kept across retries. Add unit coverage for the helper's parsing behaviour: default fallback for unset/empty/non-numeric/non-positive inputs, and millisecond conversion for valid numeric inputs.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented May 5, 2026

Copy link
Copy Markdown

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented May 5, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: acdfc06f-16a5-46da-87cb-33e13d7a369b

📥 Commits

Reviewing files that changed from the base of the PR and between b346375 and 96f88b0.

📒 Files selected for processing (2)
  • src/lib/onboard-ollama-proxy.ts
  • test/ollama-pull-timeout.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard-ollama-proxy.ts

📝 Walkthrough

Walkthrough

This change adds environment-variable configurability for Ollama model-pull timeouts via NEMOCLAW_OLLAMA_PULL_TIMEOUT. It replaces a hardcoded 10-minute wall-clock timeout with a 30-minute default (adjustable via env var), applies the timeout to both CLI and HTTP pull methods, and emits standardized timeout error messages that reference the effective timeout value.

Changes

Timeout Configurability & Error Messaging

Layer / File(s) Summary
Config Parsing & Validation
src/lib/onboard-ollama-proxy.ts (lines 309–328)
Introduces DEFAULT_OLLAMA_PULL_TIMEOUT_MS (30 min) and getOllamaPullTimeoutMs() to parse/validate NEMOCLAW_OLLAMA_PULL_TIMEOUT, returning milliseconds. Adds pullTimeoutErrorHint(timeoutMs) helper to produce consistent timeout error messages.
CLI Pull Integration
src/lib/onboard-ollama-proxy.ts (lines 329–341)
pullOllamaModelViaCli computes timeout via getOllamaPullTimeoutMs() and passes it to spawnSync timeout option; on SIGTERM, prints pullTimeoutErrorHint(timeoutMs) instead of hardcoded "10 minutes" message.
HTTP Pull Integration
src/lib/onboard-ollama-proxy.ts (lines 354–366, 472–476)
pullOllamaModelViaHttp calculates TIMEOUT_MS using getOllamaPullTimeoutMs() and sets curl's --max-time accordingly. Curl exit code 28 (timeout) emits pullTimeoutErrorHint(TIMEOUT_MS); other errors retain existing messaging.
Export
src/lib/onboard-ollama-proxy.ts (lines 570–573)
Exports getOllamaPullTimeoutMs for external use.
Comprehensive Test Suite
test/ollama-pull-timeout.test.ts
Tests default behavior (30 min when env var unset/empty/whitespace), seconds-to-milliseconds conversion, fractional-second precision preservation, fallback for non-numeric/zero/negative inputs, and integration-style verification that fractional precision is preserved in curl's --max-time argument. Environment variable is restored after each test.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A timeout so long—thirty minutes to roam,
Ollama pulls never far from home.
With env vars to tune, the rabbit rejoiced,
No more "Try a smaller!"—the user has choice.
Fractional seconds preserved, with care,
Downloads now finish. The network's fair! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title directly summarizes the main change: making the Ollama model pull timeout configurable instead of hardcoded.
Linked Issues check ✅ Passed The PR implements all coding requirements from issue #2610: configurable timeout via NEMOCLAW_OLLAMA_PULL_TIMEOUT env var, 30-minute default, updated error messaging with resume hint and env-var guidance, and applies timeout to both CLI and HTTP pull paths.
Out of Scope Changes check ✅ Passed All changes are directly aligned with issue #2610 requirements: timeout configurability, error messaging, and test coverage for the new functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/ollama-pull-timeout

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/onboard-ollama-proxy.ts (1)

354-366: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove Math.floor() to preserve timeout precision between HTTP and CLI paths.

The HTTP pull path converts timeout to seconds for curl via Math.floor(TIMEOUT_MS / 1000), which truncates fractional values and can produce 0 for sub-second timeouts (disabling the timeout entirely per curl docs). This creates cross-path behavior drift: the CLI path uses the full millisecond precision from getOllamaPullTimeoutMs(), while the HTTP path floors it.

Proposed fix
-    const TIMEOUT_MS = getOllamaPullTimeoutMs();
+    const TIMEOUT_MS = getOllamaPullTimeoutMs();
+    const timeoutSeconds = TIMEOUT_MS / 1000;
@@
-        String(Math.floor(TIMEOUT_MS / 1000)),
+        String(timeoutSeconds),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/onboard-ollama-proxy.ts` around lines 354 - 366, The CLI curl spawn
is passing a floored seconds value to "--max-time" which truncates fractional
seconds and can become "0"; change the argument construction so it preserves
millisecond precision by removing Math.floor and passing the exact seconds
string (e.g. String(TIMEOUT_MS / 1000)) when building the spawn args for
"--max-time" (see TIMEOUT_MS, getOllamaPullTimeoutMs, and the spawn call with
the "--max-time" argument).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/lib/onboard-ollama-proxy.ts`:
- Around line 354-366: The CLI curl spawn is passing a floored seconds value to
"--max-time" which truncates fractional seconds and can become "0"; change the
argument construction so it preserves millisecond precision by removing
Math.floor and passing the exact seconds string (e.g. String(TIMEOUT_MS / 1000))
when building the spawn args for "--max-time" (see TIMEOUT_MS,
getOllamaPullTimeoutMs, and the spawn call with the "--max-time" argument).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ed39b1ee-a051-48e4-a120-163c813e20e3

📥 Commits

Reviewing files that changed from the base of the PR and between 6c07b82 and b346375.

📒 Files selected for processing (2)
  • src/lib/onboard-ollama-proxy.ts
  • test/ollama-pull-timeout.test.ts

@laitingsheng laitingsheng marked this pull request as ready for review May 5, 2026 12:53
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa assigned ericksoa and laitingsheng and unassigned ericksoa May 5, 2026
@ericksoa ericksoa self-requested a review May 5, 2026 14:09
@ericksoa ericksoa added bug Something fails against expected or documented behavior provider: ollama Ollama local model provider behavior labels May 5, 2026

@ericksoa ericksoa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I verified this against current main, including the linked Ollama timeout issue and CodeRabbit's fractional-timeout concern.

The PR now keeps the configurable timeout shared across the CLI and HTTP pull paths, preserves sub-second precision for curl, and covers the parser plus HTTP-path wiring with focused tests. Local validation passed (

nemoclaw@0.1.0 build:cli
tsc -p tsconfig.src.json && tsc -p nemoclaw-blueprint/tsconfig.json, targeted Ollama timeout/proxy tests, and the Ollama onboarding selection slice), and the current PR head is green across the visible GitHub checks.

@ericksoa ericksoa merged commit eed5c46 into main May 5, 2026
19 checks passed
@ericksoa ericksoa deleted the fix/ollama-pull-timeout branch May 5, 2026 14:16
ericksoa pushed a commit that referenced this pull request May 8, 2026
#3122)

## Summary

#3116 reports that `NEMOCLAW_OLLAMA_PULL_TIMEOUT` was added in code by
#3037 (`PULL_TIMEOUT_ENV` constant in `src/lib/onboard-ollama-proxy.ts`,
gate inside `getOllamaPullTimeoutMs`) but was never added to
`docs/reference/commands.md`. Users hitting the 30-minute default on
slow links cannot discover how to raise it; the only mention is the
in-flight error hint when the timeout actually fires.

## Problem

`NEMOCLAW_OLLAMA_PULL_TIMEOUT` is read by `getOllamaPullTimeoutMs()`
(defaults to 30 minutes / 1800 seconds, accepts integer or float
seconds, validated for finite-positive). It controls the wall-clock cap
on `ollama pull` during onboard. Without docs, users only learn the
variable name from the timeout-error hint after the failure.

## Changes

- Add a "Onboard timeouts" subsection to the existing Environment
Variables section of `docs/reference/commands.md`. New table row
documents the variable, default (`1800` / 30 minutes), accepted format
(integer or float seconds), and the partial-download-resume behavior.
One-line console example.
- Mirror the same addition into
`.agents/skills/nemoclaw-user-reference/references/commands.md` so the
auto-generated skill stays aligned with the source.

## Type of Change

- [x] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)

## Test plan

- [x] No code change; no behavioral test added.
- [x] Doc claims (default, format, behavior) match the implementation in
`src/lib/onboard-ollama-proxy.ts:311-330` and the error-hint text at
`pullTimeoutErrorHint`.
- [x] Skill mirror addition is identical to the source addition (table
row + console example + trailing paragraph).

Closes #3116

---
Signed-off-by: latenighthackathon
<latenighthackathon@users.noreply.github.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
* Added an "Onboard timeouts" section explaining timeout controls for
model pulls during onboarding.
* Introduced NEMOCLAW_OLLAMA_PULL_TIMEOUT (default: 1800 seconds).
Notes: accepts integer or float values, reported in minutes on timeout,
preserves already-downloaded layers so retries resume, and includes
configuration examples and guidance for increasing the limit.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Co-authored-by: cjagwani <cjagwani@nvidia.com>
Co-authored-by: J. Yaunches <jyaunches@nvidia.com>
@wscurran wscurran added bug-fix PR fixes a bug or regression and removed bug Something fails against expected or documented behavior labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression provider: ollama Ollama local model provider behavior

Projects

None yet

3 participants