Unsloth Studio incorrectly auto-selects unrelated mmproj from flat local GGUF directory #5347

@loopy321

Note: Please do not remove the questions. Answer beside them.

  1. Did you update? pip install --upgrade unsloth unsloth_zoo = yes
  2. Colab or Kaggle or local / cloud = local
  3. Number GPUs used, use nvidia-smi = 1
  4. Which notebook? Please link! = ?
  5. Which Unsloth version, TRL version, transformers version, PyTorch version? = 2026.5.2
  6. Which trainer? SFTTrainer, GRPOTrainer etc = NA

Summary

When loading a local GGUF model in Unsloth Studio from a directory containing multiple unrelated GGUF models, Studio may automatically attach an unrelated mmproj file from the same directory.

In my case, I selected a Qwen3.5 GGUF, but Studio launched llama-server with a Gemma mmproj:

/home/unsloth/.unsloth/llama.cpp/build/bin/llama-server \
  -m /workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf \
  --port 51311 \
  -c 254976 \
  --parallel 1 \
  --flash-attn on \
  --no-context-shift \
  -ngl -1 \
  --threads -1 \
  --jinja \
  --chat-template-kwargs '{"enable_thinking": true}' \
  --mmproj /workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf

This causes the model load to fail because the projector does not match the selected model family.

Environment

  • Running Unsloth Studio in Docker Desktop on Windows
  • Image: unsloth/unsloth:latest
  • Host model directory bind-mounted into container:
volumes:
  - type: bind
    source: E:\llm_models
    target: /workspace/llm-models
  • Relevant backend file in container:
/opt/venv/lib/python3.12/site-packages/studio/backend/core/inference/llama_cpp.py

Local GGUF directory layout

My model folder is intentionally a flat local model library:

E:\llm_models
├── Qwen3.5-27B-IQ4_XS.gguf
├── Qwen3.5-27B-Q3_K_M.gguf
├── Qwen3.5-27B-Q4_K_M.gguf
├── Qwen3.5-27B-UD-IQ2_M.gguf
├── Qwen3.5-27B-UD-IQ3_XXS.gguf
├── Qwen3.5-35B-A3B-BF16-mmproj.gguf
├── Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
├── Qwen3.5-9B-BF16-mmproj.gguf
├── Qwen3.5-9B-Q4_K_M.gguf
└── gemma-4-26B-A4B-it.mmproj-q8_0.gguf

The selected model was:

/workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf

The correct matching projector should be:

/workspace/llm-models/Qwen3.5-9B-BF16-mmproj.gguf

But Studio selected:

/workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf

Expected behavior

When loading a local GGUF file, Studio should only attach an mmproj when it can confidently match the projector to the selected model.

For example:

Qwen3.5-9B-Q4_K_M.gguf
→ Qwen3.5-9B-BF16-mmproj.gguf

and:

Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
→ Qwen3.5-35B-A3B-BF16-mmproj.gguf

Studio should not attach:

gemma-4-26B-A4B-it.mmproj-q8_0.gguf

to any Qwen model.

If no matching projector can be found, Studio should load the GGUF without --mmproj.

Actual behavior

Studio appended the unrelated Gemma projector to the Qwen model launch command:

--mmproj /workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf

The model failed to load.

Running essentially the same llama-server command manually, without the Gemma --mmproj argument (and with --host 0.0.0.0 and a smaller context, -c 8192), succeeds:

/home/unsloth/.unsloth/llama.cpp/build/bin/llama-server \
  -m /workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 51311 \
  -c 8192 \
  --parallel 1 \
  --flash-attn on \
  --no-context-shift \
  -ngl -1 \
  --threads -1 \
  --jinja \
  --chat-template-kwargs '{"enable_thinking": true}'

Likely cause

The local GGUF discovery / launcher path appears to scan the selected model’s parent directory and choose an available *mmproj*.gguf without validating that it belongs to the selected base model.

The current launcher block appears to only check that mmproj_path exists:

if mmproj_path:
    if not Path(mmproj_path).is_file():
        logger.warning(f"mmproj file not found: {mmproj_path}")
    else:
        cmd.extend(["--mmproj", mmproj_path])
        logger.info(f"Using mmproj for vision: {mmproj_path}")

That allows an unrelated sibling projector in a flat local model directory to be attached to the wrong model.
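
To illustrate why an existence check alone cannot tell projectors apart, here is a minimal sketch that lists every projector sitting next to a selected model. The helper name sibling_mmproj_candidates is hypothetical and this is not Studio's actual discovery code. It only shows that in a flat directory like mine, several files qualify as "an mmproj that exists".

```python
from pathlib import Path


def sibling_mmproj_candidates(model_path):
    """Return every *.gguf file next to the model whose name contains 'mmproj'.

    Hypothetical helper for illustration only; not part of Unsloth Studio.
    """
    parent = Path(model_path).parent
    return sorted(
        p for p in parent.glob("*.gguf") if "mmproj" in p.name.lower()
    )


if __name__ == "__main__":
    # Path taken from this report; adjust to your own model library.
    model = "/workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf"
    for candidate in sibling_mmproj_candidates(model):
        # Every candidate passes Path.is_file(), including the Gemma one.
        print(candidate.name, candidate.is_file())
```

For my directory this returns both Qwen projectors and the Gemma projector, so whichever one the discovery step happens to pick will pass the is_file() check in the launcher.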

Suggested fix

For local GGUF files, Studio should resolve mmproj more conservatively:

  1. If the selected model is a loose local .gguf file, search only for projector files that match the selected model’s base identity.
  2. Prefer projectors that share the same model family and size, for example:
    • Qwen3.5-9B-Q4_K_M.gguf
    • Qwen3.5-9B-BF16-mmproj.gguf
  3. Do not attach a projector if the model family conflicts:
    • block qwen model + gemma mmproj
    • block gemma model + qwen mmproj
  4. If no confident match is found, load the model without --mmproj.
  5. Ideally expose the selected mmproj in the UI and allow clearing or overriding it.

A basic family guard would already prevent the bad case:

model_name = Path(str(model_path)).name.lower()
mmproj_name = Path(str(mmproj_path)).name.lower()

families = (
    "qwen",
    "gemma",
    "llama",
    "mistral",
    "glm",
    "internvl",
    "deepseek",
    "phi",
    "minicpm",
    "llava",
)

model_family = next((f for f in families if f in model_name), None)
mmproj_family = next((f for f in families if f in mmproj_name), None)

if model_family and mmproj_family and model_family != mmproj_family:
    logger.warning(
        f"Skipping mismatched mmproj: model={model_name}, mmproj={mmproj_name}"
    )
else:
    cmd.extend(["--mmproj", mmproj_path])
    logger.info(f"Using mmproj for vision: {mmproj_path}")

A better fix would actively search for a matching local projector. For my directory, the intended mapping is:

Qwen3.5-9B-Q4_K_M.gguf
→ Qwen3.5-9B-BF16-mmproj.gguf

and:

Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
→ Qwen3.5-35B-A3B-BF16-mmproj.gguf
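
One possible way to implement that search is sketched below. It is not the patch I applied and not Studio's code. The function names base_identity and find_matching_mmproj are hypothetical, and the list of quantization / precision tokens stripped from file names is my own assumption and certainly not exhaustive. The idea is to reduce each file name to a "base identity" (family, version, size) and attach a projector only when that identity matches the selected model exactly.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed "noise" tokens: quantization tags (q4_k_m, iq4_xs, q8_0, ...),
# precision tags, the "ud" prefix, and the mmproj marker itself.
_NOISE = re.compile(r"i?q\d+(_\w+)?|bf16|f16|f32|ud|mmproj")


def base_identity(path) -> tuple:
    """Reduce a GGUF file name to its family/version/size tokens."""
    tokens = re.split(r"[-.]", Path(path).stem.lower())
    return tuple(t for t in tokens if t and not _NOISE.fullmatch(t))


def find_matching_mmproj(model_path) -> Optional[Path]:
    """Return a sibling projector with the same base identity, else None."""
    model_path = Path(model_path)
    wanted = base_identity(model_path)
    matches = sorted(
        p
        for p in model_path.parent.glob("*.gguf")
        if "mmproj" in p.name.lower() and base_identity(p) == wanted
    )
    # No confident match: the caller should launch without --mmproj.
    return matches[0] if matches else None
```

With this sketch, Qwen3.5-9B-Q4_K_M.gguf and Qwen3.5-9B-BF16-mmproj.gguf both reduce to ("qwen3", "5", "9b"), so they pair up. The Qwen3.5-27B files have no projector with the same identity, so they would load without --mmproj. Exact matching is deliberately strict; a real implementation might also read GGUF metadata instead of relying on file names.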

Workaround used locally

I patched studio/backend/core/inference/llama_cpp.py so that it:

  1. detects the selected model family,
  2. rejects an mmproj with a conflicting family,
  3. searches the selected model’s directory for a better matching local projector.

After that patch:

Qwen3.5-9B-Q4_K_M.gguf

loads with:

Qwen3.5-9B-BF16-mmproj.gguf

instead of the unrelated Gemma projector.

Why this matters

Many users keep GGUF files in a single local model library rather than one Hugging Face-style directory per model. In that layout, Studio should not assume that any sibling mmproj belongs to the selected GGUF. Incorrect projector selection causes confusing load failures and makes otherwise valid local GGUF models appear broken.

llama_cpp_mmproj_matcher.py
