Collapse the Open Source catalog to four named tiers #514
Merged
Rename the base-model lineup to a single nano/mini/base/pro ladder and drop the parallel gemma-* sub-brand plus the 4B Qwen tier, so a user sees one coherent set of four models instead of five names spread across two families. tabby-2-nano = Qwen3.5-0.8B-Base.i1-Q6_K (was tabby-2-mini); tabby-2-mini = Qwen3.5-2B-Base.i1-Q4_K_M (was tabby-2-base, still the default load); tabby-2-base = gemma-4-E2B.i1-Q6_K (was tabby-2-gemma-mini); tabby-2-pro = gemma-4-E4B.i1-Q4_K_M (was tabby-2-gemma-pro). Qwen3.5-4B-Base is retired from the catalog; an already-installed copy keeps working and falls back to its raw filename like any user-supplied GGUF. Onboarding's Powerful tier now installs gemma-4-E4B (the new pro) since it previously pointed at the dropped 4B Qwen. Display names are derived from the filename, so this is a relabel: no re-download and no change to a user's persisted model selection.
Per the requested lineup: the base tier (gemma-4-E2B) is now the default load (first in preferredModelNames) and what onboarding's Everyday/Recommended tier installs. Quick stays on nano (Qwen 0.8B) and Powerful on pro (gemma-4-E4B), so the onboarding labels read nano / base / pro.
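The tier-to-model mapping described in these two commits can be sketched as a small enum. This is an illustration only: the type name `OnboardingTier` and its property names are hypothetical, not the project's actual onboarding code, and the sizes are the approximate figures quoted in this PR.

```swift
// Hypothetical sketch of the onboarding tiers after this PR.
// The enum and property names are assumptions made for illustration.
enum OnboardingTier: CaseIterable {
    case quick, everyday, powerful

    /// GGUF file the tier installs, per the lineup in this PR.
    var modelFilename: String {
        switch self {
        case .quick: return "Qwen3.5-0.8B-Base.i1-Q6_K.gguf"   // tabby-2-nano
        case .everyday: return "gemma-4-E2B.i1-Q6_K.gguf"      // tabby-2-base
        case .powerful: return "gemma-4-E4B.i1-Q4_K_M.gguf"    // tabby-2-pro
        }
    }

    /// Approximate download size in GB, as quoted in the PR description.
    var approximateDownloadGB: Double {
        switch self {
        case .quick: return 0.8
        case .everyday: return 4.5
        case .powerful: return 5.0
        }
    }
}

for tier in OnboardingTier.allCases {
    print("\(tier): \(tier.modelFilename) (~\(tier.approximateDownloadGB) GB)")
}
```

Keeping the filename and the size on the same type makes the cost of a tier change visible next to the change itself, which is the trade-off the Everyday tier raises here.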
Comment on lines 159 to 166
  static let `default` = LlamaRuntimeConfiguration(
      runtimeDirectoryPath: nil,
      preferredModelNames: [
+         "gemma-4-E2B.i1-Q6_K.gguf",
          "Qwen3.5-2B-Base.i1-Q4_K_M.gguf",
          "Qwen3.5-0.8B-Base.i1-Q6_K.gguf",
-         "Qwen3.5-4B-Base.i1-Q4_K_M.gguf",
-         "gemma-4-E2B.i1-Q6_K.gguf",
          "gemma-4-E4B.i1-Q4_K_M.gguf"
      ],
preferredModelNames promotes Gemma E2B to first despite PR description explicitly deferring that
The PR description's "Risk / rollout notes" says "Making base the default is a one-line change but raises the default download from ~1.4 GB to ~4.5 GB, so it is intentionally left out of this PR." Yet this diff moves gemma-4-E2B.i1-Q6_K.gguf to position 0 in preferredModelNames, and the README now says "tabby-2-base (Gemma E2B) is the default." The locator picks the first GGUF that exists, so any user who has both Qwen 2B and Gemma E2B installed will silently auto-load Gemma E2B (4.5 GB) instead of Qwen 2B (1.4 GB) after this update — a direct contradiction of "no change to anyone's selected model."
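The effect described in this comment can be reproduced with a minimal first-match lookup. This is a sketch under assumptions: `firstInstalledModel` is a hypothetical stand-in for the real locator, and the old order is reconstructed from the diff above.

```swift
// Hypothetical stand-in for the locator: returns the first preferred
// filename that is present in the installed set. Not the project's code.
func firstInstalledModel(preferred: [String], installed: Set<String>) -> String? {
    return preferred.first(where: { installed.contains($0) })
}

// Preferred-load order before this PR (reconstructed from the diff).
let oldOrder = [
    "Qwen3.5-2B-Base.i1-Q4_K_M.gguf",
    "Qwen3.5-0.8B-Base.i1-Q6_K.gguf",
    "Qwen3.5-4B-Base.i1-Q4_K_M.gguf",
    "gemma-4-E2B.i1-Q6_K.gguf",
    "gemma-4-E4B.i1-Q4_K_M.gguf",
]

// Preferred-load order after this PR.
let newOrder = [
    "gemma-4-E2B.i1-Q6_K.gguf",
    "Qwen3.5-2B-Base.i1-Q4_K_M.gguf",
    "Qwen3.5-0.8B-Base.i1-Q6_K.gguf",
    "gemma-4-E4B.i1-Q4_K_M.gguf",
]

// A user who has both Qwen 2B and Gemma E2B on disk.
let installed: Set<String> = [
    "Qwen3.5-2B-Base.i1-Q4_K_M.gguf",
    "gemma-4-E2B.i1-Q6_K.gguf",
]

let before = firstInstalledModel(preferred: oldOrder, installed: installed)
let after = firstInstalledModel(preferred: newOrder, installed: installed)
print(before ?? "none") // Qwen3.5-2B-Base.i1-Q4_K_M.gguf
print(after ?? "none")  // gemma-4-E2B.i1-Q6_K.gguf
```

With only one of the two files installed, both orders return the same model, so the switch affects only users who have both.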
Summary
Collapses the Open Source model catalog to a single, coherent four-tier ladder so users stop seeing five names spread across two families (`tabby-2-mini/base/pro` plus a parallel `tabby-2-gemma-mini/-pro` sub-brand). The new lineup is `nano / mini / base / pro`:

| Tier | GGUF file | Previous name |
| --- | --- | --- |
| `tabby-2-nano` | `Qwen3.5-0.8B-Base.i1-Q6_K.gguf` | `tabby-2-mini` |
| `tabby-2-mini` | `Qwen3.5-2B-Base.i1-Q4_K_M.gguf` | `tabby-2-base` |
| `tabby-2-base` | `gemma-4-E2B.i1-Q6_K.gguf` | `tabby-2-gemma-mini` (now the default) |
| `tabby-2-pro` | `gemma-4-E4B.i1-Q4_K_M.gguf` | `tabby-2-gemma-pro` |
| (retired) | `Qwen3.5-4B-Base.i1-Q4_K_M.gguf` | `tabby-2-pro` |

Display names are derived from the GGUF filename (the persisted state is the filename), so the relabel needs no re-download and changes no one's selected model. All four download URLs were verified to resolve (HTTP 200) before committing.
Onboarding now reads `nano` (Quick) / `base` (Everyday, Recommended) / `pro` (Powerful), and `tabby-2-base` (gemma-4-E2B) is the default load.
Validation
App-hosted tests run unsigned (the default-signed bundle hits a Team ID mismatch on this machine).
Linked issues
None filed. Follow-up to the base-model migration; addresses recurring confusion from the
five-names-two-families catalog.
Risk / rollout notes
- `displayName` is derived from the filename and the persisted state is the filename, so an existing user's installed model keeps working and just shows its new label.
- `Qwen3.5-4B-Base` retired. Removed from the download list and preferred-load order. An already-installed copy still loads and falls back to its raw filename (existing retired-model behavior, covered by a test).
- `tabby-2-base` (gemma-4-E2B, ~4.5 GB). This makes the recommended onboarding download larger than before (~1.4 GB Qwen 2B previously). Intentional per the requested lineup. Quick stays nano (~0.8 GB); Powerful is pro (gemma-4-E4B, ~5.0 GB).
- `base` is both a tier name and the technical term for these non-instruct models. Kept per the requested lineup.
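The relabel and the retired-model fallback in the notes above amount to a filename-keyed lookup with a pass-through default. A minimal sketch, assuming a free function named `displayName(forModelFile:)`; the real switch lives in `LlamaRuntimeModels.swift` and may be shaped differently:

```swift
// Hypothetical sketch of the filename-to-label mapping after this PR.
// The function name and signature are assumptions for illustration.
func displayName(forModelFile filename: String) -> String {
    switch filename {
    case "Qwen3.5-0.8B-Base.i1-Q6_K.gguf": return "tabby-2-nano"
    case "Qwen3.5-2B-Base.i1-Q4_K_M.gguf": return "tabby-2-mini"
    case "gemma-4-E2B.i1-Q6_K.gguf": return "tabby-2-base"
    case "gemma-4-E4B.i1-Q4_K_M.gguf": return "tabby-2-pro"
    default:
        // Retired and user-supplied GGUFs show their raw filename.
        return filename
    }
}

print(displayName(forModelFile: "gemma-4-E2B.i1-Q6_K.gguf"))       // tabby-2-base
print(displayName(forModelFile: "Qwen3.5-4B-Base.i1-Q4_K_M.gguf")) // Qwen3.5-4B-Base.i1-Q4_K_M.gguf
```

Because the label is computed from the filename rather than stored, changing the switch relabels an installed model without touching persisted state.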
Greptile Summary
This PR collapses the five-model Open Source catalog into a clean four-tier ladder (`nano / mini / base / pro`), retiring the `tabby-2-gemma-*` sub-brand and the dropped Qwen 4B, and rewires onboarding templates to the new filenames.
- `displayName` switch in `LlamaRuntimeModels.swift` updated to the new four-tier names; Qwen 4B removed from `downloadableModels`; display-name and retired-model fallback tests extended to cover all four active tiers.
- `.everyday` now installs `gemma-4-E2B` (4.5 GB, up from 1.4 GB) and `.powerful` now installs `gemma-4-E4B` (5.0 GB, previously the broken dropped-4B path); test assertions updated accordingly.
- `gemma-4-E2B.i1-Q6_K.gguf` moved to position 0, making it the auto-loaded model for any user who has it installed, despite the PR description's explicit note that this change was intentionally deferred.
Confidence Score: 4/5
Safe to merge for new users and single-model installs; users with both Qwen 2B and Gemma E2B installed will silently switch to the larger model on next launch.
The PR description explicitly deferred promoting Gemma E2B to the default load due to the 1.4 GB to 4.5 GB size jump, yet preferredModelNames and the README both implement that promotion. Any user who had the old tabby-2-gemma-mini installed alongside the old tabby-2-base will experience a silent auto-load change.
Cotabby/Models/LlamaRuntimeModels.swift and README.md need to be reconciled — either the preferredModelNames order should revert to keep Qwen 2B first, or the PR description should be updated to acknowledge this as an intentional change.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Old["Old catalog (5 models)"]
        O1["tabby-2-mini\nQwen3.5-0.8B (~0.8 GB)"]
        O2["tabby-2-base\nQwen3.5-2B (~1.4 GB)"]
        O3["tabby-2-pro\nQwen3.5-4B (~2.6 GB) - DROPPED"]
        O4["tabby-2-gemma-mini\ngemma-4-E2B (~4.5 GB)"]
        O5["tabby-2-gemma-pro\ngemma-4-E4B (~5.0 GB)"]
    end
    subgraph New["New catalog (4 models)"]
        N1["tabby-2-nano\nQwen3.5-0.8B (~0.8 GB)"]
        N2["tabby-2-mini\nQwen3.5-2B (~1.4 GB)"]
        N3["tabby-2-base\ngemma-4-E2B (~4.5 GB)"]
        N4["tabby-2-pro\ngemma-4-E4B (~5.0 GB)"]
    end
    subgraph Onboarding["Onboarding tiers to GGUF"]
        QK["Quick"] --> N1
        EV["Everyday 1.4GB to 4.5GB"] --> N3
        PW["Powerful"] --> N4
    end
    O1 --> N1
    O2 --> N2
    O3 -. dropped .-> N3
    O4 --> N3
    O5 --> N4

Reviews (2): Last reviewed commit: "Make tabby-2-base the default and the Ev..."