feat(inference): add qwen3.6:35b to Ollama starter models on large-memory hosts #3422

Merged
ericksoa merged 3 commits into main from fix/ollama-model-list
May 13, 2026

Conversation

@zyang-dev zyang-dev commented May 12, 2026

Contributor

Summary

Ollama starter-model list now includes qwen3.6:35b alongside qwen2.5:7b and nemotron-3-nano:30b.

Related Issue

Fixes #3250

Changes

  • src/lib/inference/local.ts:
    • New QWEN3_6_OLLAMA_MODEL = "qwen3.6:35b" constant.
    • getBootstrapOllamaModelOptions appends it to the options when gpu.totalMemoryMB >= LARGE_OLLAMA_MIN_MEMORY_MB, after the existing nemotron-3-nano:30b entry.
  • src/lib/inference/ollama/model-size.ts:
    • Added qwen3.6:35b to the FALLBACK_SIZE_BYTES table at 24 GB so the memory probe has an estimate without hitting the manifest.
  • src/lib/inference/local.test.ts:
    • Updated the bootstrap-options test to expect three entries on large-memory hosts.
  • docs/inference/use-local-inference.md:
    • Added one sentence noting that qwen3.6:35b appears in the starter list on hosts with ≥32 GiB of detected GPU memory.
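
The gating described in the list above can be sketched roughly as follows. This is an illustration, not the code in `src/lib/inference/local.ts`: the `GpuInfo` shape, the `NEMOTRON_OLLAMA_MODEL` constant name, the value of `SMALL_OLLAMA_MODEL`, and the 32 GiB value for `LARGE_OLLAMA_MIN_MEMORY_MB` are assumptions inferred from this description.

```typescript
// Illustrative sketch only; not the actual contents of src/lib/inference/local.ts.

// Hypothetical shape: the PR text only names the totalMemoryMB field.
interface GpuInfo {
  totalMemoryMB: number;
}

const SMALL_OLLAMA_MODEL = "qwen2.5:7b"; // assumed value for the small default
const NEMOTRON_OLLAMA_MODEL = "nemotron-3-nano:30b"; // hypothetical constant name
const QWEN3_6_OLLAMA_MODEL = "qwen3.6:35b"; // constant named in this PR
const LARGE_OLLAMA_MIN_MEMORY_MB = 32 * 1024; // assumed: 32 GiB expressed in MiB

function getBootstrapOllamaModelOptions(gpu: GpuInfo | null): string[] {
  // Every host is offered the small model.
  const options: string[] = [SMALL_OLLAMA_MODEL];
  // Large-memory hosts also get the two bigger models, with qwen3.6:35b
  // appended after the existing nemotron entry.
  if (gpu !== null && gpu.totalMemoryMB >= LARGE_OLLAMA_MIN_MEMORY_MB) {
    options.push(NEMOTRON_OLLAMA_MODEL);
    options.push(QWEN3_6_OLLAMA_MODEL);
  }
  return options;
}
```

Under these assumptions, a host at or above the threshold sees three starter entries, which matches the updated test expectation.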

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>

Summary by CodeRabbit

  • New Features
    • Added Qwen 3.6 (35B) as a starter option for local inference on systems with ≥32 GiB detected GPU memory; becomes the default when no models are discovered on large-GPU hosts.
  • Documentation
    • Noted the Ollama onboarding flow update reflecting the new starter model.
  • Tests
    • Updated test expectations to include the Qwen 3.6 starter option.


…mory hosts

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
@zyang-dev zyang-dev self-assigned this May 12, 2026

coderabbitai Bot commented May 12, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6f361d13-8e98-49b3-b0ee-867c5b1cb2dc

📥 Commits

Reviewing files that changed from the base of the PR and between 5e220f7 and 947f907.

📒 Files selected for processing (3)
  • docs/inference/use-local-inference.md
  • src/lib/inference/local.test.ts
  • src/lib/inference/local.ts
✅ Files skipped from review due to trivial changes (1)
  • docs/inference/use-local-inference.md

📝 Walkthrough

Walkthrough

This PR adds Qwen3.6 35B as a bootstrap Ollama model option for hosts with ≥32 GiB GPU memory. Changes include the model constant export, fallback size metadata, bootstrap selection and fallback logic, test updates, and user documentation.
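
The fallback size metadata mentioned above might look something like the sketch below. The `FALLBACK_SIZE_BYTES` name and the 24 GB figure come from the PR description. The `estimateModelSizeBytes` helper, its signature, and the choice of decimal gigabytes are hypothetical and used only for illustration.

```typescript
// Illustrative sketch only; not the actual contents of
// src/lib/inference/ollama/model-size.ts.

const GB = 1_000_000_000; // assumption: decimal gigabytes; the PR only says "24 GB"

// Static size estimates used when no probed size is available.
const FALLBACK_SIZE_BYTES: Record<string, number> = {
  "qwen3.6:35b": 24 * GB,
};

// Hypothetical helper: prefer a size obtained from the registry manifest,
// otherwise fall back to the static table. Returns undefined for unknown models.
function estimateModelSizeBytes(
  model: string,
  manifestSizeBytes?: number,
): number | undefined {
  if (manifestSizeBytes !== undefined) {
    return manifestSizeBytes;
  }
  return FALLBACK_SIZE_BYTES[model];
}
```

With an entry in the table, the memory probe can produce an estimate for `qwen3.6:35b` even when the manifest lookup is skipped or fails.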

Changes

Qwen3.6 Model Addition

| Layer / File(s) | Summary |
| --- | --- |
| Model constant and size metadata: src/lib/inference/local.ts, src/lib/inference/ollama/model-size.ts | New exported constant QWEN3_6_OLLAMA_MODEL = "qwen3.6:35b" and a corresponding fallback byte-size entry, used when registry probing is unavailable. |
| Bootstrap integration and fallback logic + tests: src/lib/inference/local.ts, src/lib/inference/local.test.ts | getBootstrapOllamaModelOptions pushes QWEN3_6_OLLAMA_MODEL; the getDefaultOllamaModel fallback now returns QWEN3_6_OLLAMA_MODEL for large-memory GPUs and SMALL_OLLAMA_MODEL otherwise. Tests import the new constant and update expectations for the LARGE_OLLAMA_MIN_MEMORY_MB case. |
| User-facing documentation: docs/inference/use-local-inference.md | Onboarding docs updated to state that with ≥32 GiB detected GPU memory, the starter model list includes qwen3.6:35b alongside smaller defaults. |
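
The default-model fallback summarized above could be sketched as follows. The function and constant names are those mentioned in the walkthrough. The parameter list, the "first discovered model wins" rule, the `GpuInfo` shape, and the concrete threshold and small-model values are assumptions made for the sake of a runnable example.

```typescript
// Illustrative sketch only; the real getDefaultOllamaModel may take different
// arguments and apply additional rules.

// Hypothetical shape: the PR text only names the totalMemoryMB field.
interface GpuInfo {
  totalMemoryMB: number;
}

const SMALL_OLLAMA_MODEL = "qwen2.5:7b"; // assumed value
const QWEN3_6_OLLAMA_MODEL = "qwen3.6:35b"; // constant named in this PR
const LARGE_OLLAMA_MIN_MEMORY_MB = 32 * 1024; // assumed: 32 GiB expressed in MiB

function getDefaultOllamaModel(
  discoveredModels: string[],
  gpu: GpuInfo | null,
): string {
  // Assumption: a model already present on the host takes precedence.
  if (discoveredModels.length > 0) {
    return discoveredModels[0];
  }
  // Fallback when nothing is discovered: pick by detected GPU memory.
  const isLargeMemoryHost =
    gpu !== null && gpu.totalMemoryMB >= LARGE_OLLAMA_MIN_MEMORY_MB;
  return isLargeMemoryHost ? QWEN3_6_OLLAMA_MODEL : SMALL_OLLAMA_MODEL;
}
```

In this sketch the fallback only applies when no models are discovered, mirroring the CodeRabbit summary's note that qwen3.6:35b becomes the default in that case on large-GPU hosts.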

Sequence Diagram

sequenceDiagram
  participant OnboardWizard
  participant GPUDetector
  participant LocalInference
  OnboardWizard->>GPUDetector: query totalMemoryMB
  GPUDetector-->>OnboardWizard: totalMemoryMB value
  OnboardWizard->>LocalInference: request bootstrap models
  LocalInference->>LocalInference: getBootstrapOllamaModelOptions(gpu)
  LocalInference-->>OnboardWizard: model list (includes qwen3.6:35b when memory >= threshold)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit scurries to the gate,
Qwen3.6 hops in—no need to wait,
Thirty-five billion, now in view,
For GPU hosts with memory true. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and accurately summarizes the main change: adding qwen3.6:35b to Ollama starter models specifically for large-memory hosts. |
| Linked Issues check | ✅ Passed | All coding requirements from issue #3250 are met: qwen3.6:35b is added to the starter model list, memory estimation is provided, tests are updated, and documentation is updated. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to issue #3250: adding qwen3.6:35b to the Ollama starter models on large-memory hosts. No unrelated changes detected. |


@zyang-dev zyang-dev added v0.0.40 and removed v0.0.40 labels May 12, 2026

@ericksoa ericksoa left a comment

Contributor

Approved after review and focused validation. Nightly E2E dispatched separately for post-merge signal.

@ericksoa ericksoa merged commit 800d924 into main May 13, 2026
71 of 72 checks passed
@ericksoa ericksoa deleted the fix/ollama-model-list branch May 13, 2026 01:26
@wscurran wscurran added the feature PR adds or expands user-visible functionality label Jun 8, 2026

Labels

feature PR adds or expands user-visible functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[DGX Spark][Onboard] POR default model Qwen3.6 35B missing from Ollama starter model list

3 participants