Unsloth Studio incorrectly auto-selects unrelated mmproj from flat local GGUF directory

Note: Please do not remove the questions. Answer beside them.
1. Did you update? `pip install --upgrade unsloth unsloth_zoo`  = yes
2. `Colab` or `Kaggle` or local / cloud = local
3. Number GPUs used, use `nvidia-smi` = 1
4. Which notebook? Please link! = ?
5. Which Unsloth version, TRL version, transformers version, PyTorch version? = 2026.5.2
6. Which trainer? `SFTTrainer`, `GRPOTrainer` etc = NA


## Summary

When loading a local GGUF model in Unsloth Studio from a directory containing multiple unrelated GGUF models, Studio may automatically attach an unrelated `mmproj` file from the same directory.

In my case, I selected a Qwen3.5 GGUF, but Studio launched `llama-server` with a Gemma mmproj:

```bash
/home/unsloth/.unsloth/llama.cpp/build/bin/llama-server \
  -m /workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf \
  --port 51311 \
  -c 254976 \
  --parallel 1 \
  --flash-attn on \
  --no-context-shift \
  -ngl -1 \
  --threads -1 \
  --jinja \
  --chat-template-kwargs '{"enable_thinking": true}' \
  --mmproj /workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf
```

This causes the model load to fail because the projector does not match the selected model family.

## Environment

- Running Unsloth Studio in Docker Desktop on Windows
- Image: `unsloth/unsloth:latest`
- Host model directory bind-mounted into container:

```yaml
volumes:
  - type: bind
    source: E:\llm_models
    target: /workspace/llm-models
```

- Relevant backend file in container:

```text
/opt/venv/lib/python3.12/site-packages/studio/backend/core/inference/llama_cpp.py
```

## Local GGUF directory layout

My model folder is intentionally a flat local model library:

```text
E:\llm_models
├── Qwen3.5-27B-IQ4_XS.gguf
├── Qwen3.5-27B-Q3_K_M.gguf
├── Qwen3.5-27B-Q4_K_M.gguf
├── Qwen3.5-27B-UD-IQ2_M.gguf
├── Qwen3.5-27B-UD-IQ3_XXS.gguf
├── Qwen3.5-35B-A3B-BF16-mmproj.gguf
├── Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
├── Qwen3.5-9B-BF16-mmproj.gguf
├── Qwen3.5-9B-Q4_K_M.gguf
└── gemma-4-26B-A4B-it.mmproj-q8_0.gguf
```

The selected model was:

```text
/workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf
```

The correct matching projector should be:

```text
/workspace/llm-models/Qwen3.5-9B-BF16-mmproj.gguf
```

But Studio selected:

```text
/workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf
```

## Expected behavior

When loading a local GGUF file, Studio should only attach an `mmproj` when it can confidently match the projector to the selected model.

For example:

```text
Qwen3.5-9B-Q4_K_M.gguf
→ Qwen3.5-9B-BF16-mmproj.gguf
```

and:

```text
Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
→ Qwen3.5-35B-A3B-BF16-mmproj.gguf
```

Studio should not attach:

```text
gemma-4-26B-A4B-it.mmproj-q8_0.gguf
```

to any Qwen model.

If no matching projector can be found, Studio should load the GGUF without `--mmproj`.

## Actual behavior

Studio appended the unrelated Gemma projector to the Qwen model launch command:

```bash
--mmproj /workspace/llm-models/gemma-4-26B-A4B-it.mmproj-q8_0.gguf
```

The model failed to load.

Running `llama-server` manually **without** the Gemma `--mmproj` argument succeeds (this manual run also used a smaller context, `-c 8192`, and added `--host 0.0.0.0`):

```bash
/home/unsloth/.unsloth/llama.cpp/build/bin/llama-server \
  -m /workspace/llm-models/Qwen3.5-9B-Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 51311 \
  -c 8192 \
  --parallel 1 \
  --flash-attn on \
  --no-context-shift \
  -ngl -1 \
  --threads -1 \
  --jinja \
  --chat-template-kwargs '{"enable_thinking": true}'
```

## Likely cause

The local GGUF discovery / launcher path appears to scan the selected model’s parent directory and choose an available `*mmproj*.gguf` without validating that it belongs to the selected base model.

The current launcher block appears to only check that `mmproj_path` exists:

```python
if mmproj_path:
    if not Path(mmproj_path).is_file():
        logger.warning(f"mmproj file not found: {mmproj_path}")
    else:
        cmd.extend(["--mmproj", mmproj_path])
        logger.info(f"Using mmproj for vision: {mmproj_path}")
```

That allows an unrelated sibling projector in a flat local model directory to be attached to the wrong model.

## Suggested fix

For local GGUF files, Studio should resolve `mmproj` more conservatively:

1. If the selected model is a loose local `.gguf` file, search only for projector files that match the selected model’s base identity.
2. Prefer projectors that share the same model family and size, for example:
   - `Qwen3.5-9B-Q4_K_M.gguf`
   - `Qwen3.5-9B-BF16-mmproj.gguf`
3. Do not attach a projector if the model family conflicts:
   - block `qwen` model + `gemma` mmproj
   - block `gemma` model + `qwen` mmproj
4. If no confident match is found, load the model without `--mmproj`.
5. Ideally expose the selected `mmproj` in the UI and allow clearing or overriding it.

A basic family guard would already prevent the bad case:

```python
model_name = Path(str(model_path)).name.lower()
mmproj_name = Path(str(mmproj_path)).name.lower()

families = (
    "qwen",
    "gemma",
    "llama",
    "mistral",
    "glm",
    "internvl",
    "deepseek",
    "phi",
    "minicpm",
    "llava",
)

model_family = next((f for f in families if f in model_name), None)
mmproj_family = next((f for f in families if f in mmproj_name), None)

if model_family and mmproj_family and model_family != mmproj_family:
    logger.warning(
        f"Skipping mismatched mmproj: model={model_name}, mmproj={mmproj_name}"
    )
else:
    cmd.extend(["--mmproj", mmproj_path])
    logger.info(f"Using mmproj for vision: {mmproj_path}")
```

A better fix would actively search for a matching local projector. For my directory, the intended mapping is:

```text
Qwen3.5-9B-Q4_K_M.gguf
→ Qwen3.5-9B-BF16-mmproj.gguf
```

and:

```text
Qwen3.5-35B-A3B-UD-Q4_K_L.gguf
→ Qwen3.5-35B-A3B-BF16-mmproj.gguf
```
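
As a rough illustration of that search, here is a minimal sketch that matches on file names only. It is not Studio's code and not the attached patch: the helper names `base_identity` and `pick_mmproj` and the list of quantization/precision tokens are my own assumptions, and the token list is certainly not exhaustive. The idea is to strip tokens that describe quantization or precision (`Q4_K_M`, `IQ4_XS`, `UD`, `BF16`, `mmproj`, ...) and to attach a projector only when the remaining tokens are identical.

```python
import re
from pathlib import Path
from typing import Optional

# Tokens assumed to describe quantization/precision or the projector marker,
# not the model identity. Hypothetical and non-exhaustive.
_NON_IDENTITY = re.compile(r"^(?:ud|bf16|f16|f32|mmproj|i?q\d\w*)$")


def base_identity(filename: str) -> tuple[str, ...]:
    """Return the name tokens that identify the model family and size."""
    stem = Path(filename).name.lower().removesuffix(".gguf")
    tokens = [t for t in re.split(r"[-.]", stem) if t]
    return tuple(t for t in tokens if not _NON_IDENTITY.match(t))


def pick_mmproj(model_name: str, sibling_names: list[str]) -> Optional[str]:
    """Pick a sibling mmproj whose base identity equals the model's, else None."""
    target = base_identity(model_name)
    matches = sorted(
        name
        for name in sibling_names
        if "mmproj" in name.lower() and base_identity(name) == target
    )
    # No confident match: the caller should launch without --mmproj.
    return matches[0] if matches else None


if __name__ == "__main__":
    siblings = [
        "Qwen3.5-35B-A3B-BF16-mmproj.gguf",
        "Qwen3.5-9B-BF16-mmproj.gguf",
        "gemma-4-26B-A4B-it.mmproj-q8_0.gguf",
    ]
    print(pick_mmproj("Qwen3.5-9B-Q4_K_M.gguf", siblings))
```

With my directory listing, this picks the 9B projector for the 9B model, the 35B-A3B projector for the 35B-A3B model, and nothing for the 27B quants, which have no projector in the folder. Exact token equality is deliberately strict; a real implementation could also read the GGUF metadata instead of relying on file names.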

## Workaround used locally

I patched `studio/backend/core/inference/llama_cpp.py` so that it:

1. detects the selected model family,
2. rejects an `mmproj` with a conflicting family,
3. searches the selected model’s directory for a better matching local projector.

After that patch:

```text
Qwen3.5-9B-Q4_K_M.gguf
```

loads with:

```text
Qwen3.5-9B-BF16-mmproj.gguf
```

instead of the unrelated Gemma projector.

## Why this matters

Many users keep GGUF files in a single local model library rather than one Hugging Face-style directory per model. In that layout, Studio should not assume that any sibling `mmproj` belongs to the selected GGUF. Incorrect projector selection causes confusing load failures and makes otherwise valid local GGUF models appear broken.

[llama_cpp_mmproj_matcher.py](https://github.com/user-attachments/files/27555522/llama_cpp_mmproj_matcher.py)
