feat(inference): add qwen3.6:35b to Ollama starter models on large-memory hosts #3422
…mory hosts Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
No actionable comments were generated in the recent review.
Walkthrough
This PR adds Qwen3.6 35B as a bootstrap Ollama model option for hosts with ≥32 GiB GPU memory. Changes include the model constant export, fallback size metadata, bootstrap selection and fallback logic, test updates, and user documentation.
Changes: Qwen3.6 Model Addition
Sequence Diagram
sequenceDiagram
participant OnboardWizard
participant GPUDetector
participant LocalInference
OnboardWizard->>GPUDetector: query totalMemoryMB
GPUDetector-->>OnboardWizard: totalMemoryMB value
OnboardWizard->>LocalInference: request bootstrap models
LocalInference->>LocalInference: getBootstrapOllamaModelOptions(gpu)
LocalInference-->>OnboardWizard: model list (includes qwen3.6:35b when memory >= threshold)
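
The selection step in the walkthrough and diagram above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR. The names `QWEN3_6_OLLAMA_MODEL`, `LARGE_OLLAMA_MIN_MEMORY_MB`, `getBootstrapOllamaModelOptions`, and `FALLBACK_SIZE_BYTES` come from the PR description. Everything else is an assumption: the `GpuInfo` shape, the other two constant names, the threshold value of 32 GiB expressed in MiB, gating nemotron-3-nano:30b on the same threshold, reading "24 GB" as 24 GiB, and the `fallbackSizeBytes` helper.

```typescript
// Hypothetical sketch: names mirror the PR description, bodies are assumed.
interface GpuInfo {
  totalMemoryMB: number; // assumed shape of the detected-GPU record
}

const BASE_OLLAMA_MODEL = "qwen2.5:7b"; // hypothetical constant name
const NEMOTRON_OLLAMA_MODEL = "nemotron-3-nano:30b"; // hypothetical constant name
const QWEN3_6_OLLAMA_MODEL = "qwen3.6:35b"; // constant named in the PR

// Assumed value: the docs say ">= 32 GiB"; here that is written in MiB.
const LARGE_OLLAMA_MIN_MEMORY_MB = 32 * 1024;

// Returns the starter-model list; large models are appended only when the
// detected GPU memory meets the threshold, with qwen3.6:35b after nemotron.
function getBootstrapOllamaModelOptions(gpu: GpuInfo | null): string[] {
  const options: string[] = [BASE_OLLAMA_MODEL];
  if (gpu !== null && gpu.totalMemoryMB >= LARGE_OLLAMA_MIN_MEMORY_MB) {
    options.push(NEMOTRON_OLLAMA_MODEL, QWEN3_6_OLLAMA_MODEL);
  }
  return options;
}

// Fallback size estimates used when the model manifest is not consulted.
// Assumption: "24 GB" is taken as 24 GiB; the real table may use 24e9.
const FALLBACK_SIZE_BYTES: Record<string, number> = {
  [QWEN3_6_OLLAMA_MODEL]: 24 * 1024 * 1024 * 1024,
};

// Hypothetical helper: look up an estimate, or undefined if none is known.
function fallbackSizeBytes(model: string): number | undefined {
  return FALLBACK_SIZE_BYTES[model];
}
```

Under these assumptions, a host reporting 16384 MB would see only the base model, while a host reporting 49152 MB would see all three, with qwen3.6:35b listed last.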
Estimated code review effort: 3 (Moderate), ~20 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
ericksoa left a comment
Approved after review and focused validation. Nightly E2E dispatched separately for post-merge signal.
Summary
Ollama starter-model list now includes qwen3.6:35b alongside qwen2.5:7b and nemotron-3-nano:30b.
Related Issue
Fixes #3250
Changes
- `src/lib/inference/local.ts`: adds the `QWEN3_6_OLLAMA_MODEL = "qwen3.6:35b"` constant. `getBootstrapOllamaModelOptions` appends it to the options when `gpu.totalMemoryMB >= LARGE_OLLAMA_MIN_MEMORY_MB`, after the existing nemotron-3-nano:30b entry.
- `src/lib/inference/ollama/model-size.ts`: adds `qwen3.6:35b` to the `FALLBACK_SIZE_BYTES` table at 24 GB so the memory probe has an estimate without hitting the manifest.
- `src/lib/inference/local.test.ts`: test updates.
- `docs/inference/use-local-inference.md`: documents that `qwen3.6:35b` appears in the starter list on hosts with ≥32 GiB of detected GPU memory.
Type of Change
Verification
- `npx prek run --all-files` passes
- `npm test` passes
- `make docs` builds without warnings (doc changes only)

Signed-off-by: zyang-dev <267119621+zyang-dev@users.noreply.github.com>
Summary by CodeRabbit