Note: Please do not remove the questions. Answer beside them.
- Did you update? pip install --upgrade unsloth unsloth_zoo = yes
- Colab or Kaggle or local / cloud = local
- Number GPUs used, use nvidia-smi = 1
- Which notebook? Please link! = ?
- Which Unsloth version, TRL version, transformers version, PyTorch version? = 2026.5.2
- Which trainer? SFTTrainer, GRPOTrainer etc = NA
Summary
When loading a local GGUF model in Unsloth Studio from a directory containing multiple unrelated GGUF models, Studio may automatically attach an unrelated mmproj file from the same directory.
In my case, I selected a Qwen3.5 GGUF, but Studio launched llama-server with a Gemma mmproj:
/home/unsloth/.unsloth/llama.cpp/build/bin/llama-server \
-m /workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf \
--port 51311 \
-c 254976 \
--parallel 1 \
--flash-attn on \
--no-context-shift \
-ngl -1 \
--threads -1 \
--jinja \
--chat-template-kwargs '{"enable_thinking": true}' \
--mmproj /workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf
This causes the model load to fail because the projector does not match the selected model family.
Environment
- Running Unsloth Studio in Docker Desktop on Windows
- Image: unsloth/unsloth:latest
- Host model directory bind-mounted into container:
  volumes:
    - type: bind
      source: E:\llm_models
      target: /workspace/llm-models
- Relevant backend file in container: /opt/venv/lib/python3.12/site-packages/studio/backend/core/inference/llama_cpp.py
Local GGUF directory layout
My model folder is intentionally a flat local model library:
E:\llm_models
├── Qwen3.5-27B-IQ4_XS.gguf
├── Qwen3.5-27B-Q3_K_M.gguf
├── Qwen3.5-27B-Q4_K_M.gguf
├── Qwen3.5-27B-UD-IQ2_M.gguf
├── Qwen3.5-27B-UD-IQ3_XXS.gguf
├── Qwen3.5-35B-A3B-BF16-mmproj.gguf
├── Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
├── Qwen3.5-9B-BF16-mmproj.gguf
├── Qwen3.5-9B-Q4_K_M.gguf
└── gemma-4-26B-A4B-it.mmproj-q8_0.gguf
The selected model was:
/workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf
The correct matching projector should be:
/workspace/llm-models/Qwen3.5-9B-BF16-mmproj.gguf
But Studio selected:
/workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf
Expected behavior
When loading a local GGUF file, Studio should only attach an mmproj when it can confidently match the projector to the selected model.
For example:
Qwen3.5-9B-Q4_K_M.gguf
→ Qwen3.5-9B-BF16-mmproj.gguf
and:
Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
→ Qwen3.5-35B-A3B-BF16-mmproj.gguf
Studio should not attach:
gemma-4-26B-A4B-it.mmproj-q8_0.gguf
to any Qwen model.
If no matching projector can be found, Studio should load the GGUF without --mmproj.
Actual behavior
Studio appended the unrelated Gemma projector to the Qwen model launch command:
--mmproj /workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf
The model failed to load.
Running the same llama-server command manually without the Gemma --mmproj argument succeeds:
/home/unsloth/.unsloth/llama.cpp/build/bin/llama-server \
-m /workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf \
--host 0.0.0.0 \
--port 51311 \
-c 8192 \
--parallel 1 \
--flash-attn on \
--no-context-shift \
-ngl -1 \
--threads -1 \
--jinja \
--chat-template-kwargs '{"enable_thinking": true}'
Likely cause
The local GGUF discovery / launcher path appears to scan the selected model’s parent directory and pick any available *mmproj*.gguf without validating that it belongs to the selected base model.
The current launcher block appears to only check that mmproj_path exists:
if mmproj_path:
    if not Path(mmproj_path).is_file():
        logger.warning(f"mmproj file not found: {mmproj_path}")
    else:
        cmd.extend(["--mmproj", mmproj_path])
        logger.info(f"Using mmproj for vision: {mmproj_path}")
That allows an unrelated sibling projector in a flat local model directory to be attached to the wrong model.
Suggested fix
For local GGUF files, Studio should resolve mmproj more conservatively:
- If the selected model is a loose local .gguf file, search only for projector files that match the selected model’s base identity.
- Prefer projectors that share the same model family and size, for example:
  Qwen3.5-9B-Q4_K_M.gguf
  Qwen3.5-9B-BF16-mmproj.gguf
- Do not attach a projector if the model family conflicts:
  - block qwen model + gemma mmproj
  - block gemma model + qwen mmproj
- If no confident match is found, load the model without --mmproj.
- Ideally expose the selected mmproj in the UI and allow clearing or overriding it.
A basic family guard would already prevent the bad case:
# Compare lower-cased file names so the check is case-insensitive.
model_name = Path(str(model_path)).name.lower()
mmproj_name = Path(str(mmproj_path)).name.lower()
families = (
    "qwen",
    "gemma",
    "llama",
    "mistral",
    "glm",
    "internvl",
    "deepseek",
    "phi",
    "minicpm",
    "llava",
)
# First family keyword found in each file name, or None if none is recognised.
model_family = next((f for f in families if f in model_name), None)
mmproj_family = next((f for f in families if f in mmproj_name), None)
# Skip the projector only when both families are known and they differ.
if model_family and mmproj_family and model_family != mmproj_family:
    logger.warning(
        f"Skipping mismatched mmproj: model={model_name}, mmproj={mmproj_name}"
    )
else:
    cmd.extend(["--mmproj", mmproj_path])
    logger.info(f"Using mmproj for vision: {mmproj_path}")
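One caveat with the guard above is that plain substring checks can misfire: "phi" is a substring of a file name such as dolphin-example-Q4_K_M.gguf (a hypothetical name used only for illustration). A slightly stricter variant, sketched below, matches family keywords only at the start of a name token. The helper names detect_family and families_conflict are my own and are not part of Studio.

```python
import re
from typing import Optional

# Same family keywords as the guard above.
FAMILIES = (
    "qwen", "gemma", "llama", "mistral", "glm",
    "internvl", "deepseek", "phi", "minicpm", "llava",
)


def detect_family(filename: str) -> Optional[str]:
    """Return the first family keyword that starts a token of the file name."""
    # Split on anything that is not a letter or digit: "Qwen3.5-9B" -> qwen3, 5, 9b
    tokens = re.split(r"[^a-z0-9]+", filename.lower())
    for token in tokens:
        for family in FAMILIES:
            if token.startswith(family):
                return family
    return None


def families_conflict(model_name: str, mmproj_name: str) -> bool:
    """True only when both families are recognised and they differ."""
    model_family = detect_family(model_name)
    mmproj_family = detect_family(mmproj_name)
    return (
        model_family is not None
        and mmproj_family is not None
        and model_family != mmproj_family
    )
```

With this variant, families_conflict("Qwen3.5-9B-Q4_K_M.gguf", "gemma-4-26B-A4B-it.mmproj-q8_0.gguf") is true, while a file name that merely contains "phi" inside another word is not assigned a family. When a name contains several family keywords, the first recognised token wins, which may or may not be the right choice for merged models.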
A better fix would actively search for a matching local projector. For my directory, the intended mapping is:
Qwen3.5-9B-Q4_K_M.gguf
→ Qwen3.5-9B-BF16-mmproj.gguf
and:
Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
→ Qwen3.5-35B-A3B-BF16-mmproj.gguf
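A minimal sketch of such a search, assuming the model's identity is whatever precedes the first quantization or precision token in the file name. The function names model_identity and find_matching_mmproj are hypothetical, and the token pattern is an assumption that only covers the naming seen in my directory; it is not an exhaustive list of GGUF naming schemes and not Studio's actual logic.

```python
import re
from pathlib import Path
from typing import Optional

# Split on "-" and "_", and on "." unless it sits between two digits,
# so "qwen3.5" stays whole while "it.mmproj" splits into two tokens.
_SPLIT = re.compile(r"[-_]|(?<!\d)\.|\.(?!\d)")

# Tokens assumed to mark the end of the model identity (illustrative only).
_NON_IDENTITY = re.compile(r"ud|i?q\d\w*|bf16|f16|f32|mmproj")


def model_identity(filename: str) -> tuple[str, ...]:
    """Leading name tokens before the first quantization/precision token."""
    stem = Path(filename).stem.lower()
    identity = []
    for token in (t for t in _SPLIT.split(stem) if t):
        if _NON_IDENTITY.fullmatch(token):
            break
        identity.append(token)
    return tuple(identity)


def find_matching_mmproj(model_path: str) -> Optional[str]:
    """Return a sibling mmproj with the same identity, or None if there is none."""
    model = Path(model_path)
    wanted = model_identity(model.name)
    if not wanted:
        return None
    candidates = sorted(
        p for p in model.parent.glob("*.gguf")
        if "mmproj" in p.name.lower()
        and p != model
        and model_identity(p.name) == wanted
    )
    # No confident match means the caller should launch without --mmproj.
    return str(candidates[0]) if candidates else None
```

Against the directory layout above, this resolves Qwen3.5-9B-Q4_K_M.gguf to Qwen3.5-9B-BF16-mmproj.gguf, resolves Qwen3.5-35B-A3B-UD-Q4_K_L.gguf to Qwen3.5-35B-A3B-BF16-mmproj.gguf, and returns None for the Qwen3.5-27B files, which have no projector in the folder. Exact identity matching is deliberately strict: a false negative only drops vision support, while a false positive breaks the load.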
Workaround used locally
I patched studio/backend/core/inference/llama_cpp.py so that it:
- detects the selected model family,
- rejects an mmproj with a conflicting family,
- searches the selected model’s directory for a better matching local projector.
After that patch:
Qwen3.5-9B-Q4_K_M.gguf
loads with:
Qwen3.5-9B-BF16-mmproj.gguf
instead of the unrelated Gemma projector.
Why this matters
Many users keep GGUF files in a single local model library rather than one Hugging Face-style directory per model. In that layout, Studio should not assume that any sibling mmproj belongs to the selected GGUF. Incorrect projector selection causes confusing load failures and makes otherwise valid local GGUF models appear broken.
llama_cpp_mmproj_matcher.py