Eval bug: Mistral Medium 3.5 128B - mmproj model fails to load with llama.cpp ToT release and source build. #22901

@bpawar-nvidia

Name and Version

llama.cpp version: b9038

Operating systems

Windows

GGML backends

CUDA

Hardware

RTX 5090

Models

Mistral Medium 3.5 128B Q4_K_M
mmproj.GGUF

Problem description & steps to reproduce

Use the official HF model as input: https://huggingface.co/mistralai/Mistral-Medium-3.5-128B

Create the GGUF model from the HF model:
convert_hf_to_gguf.py /path/to/hf-model-directory --outfile /path/to/model.gguf --outtype auto

Create the mmproj GGUF model with the command below:
convert_hf_to_gguf.py /path/to/hf-model-directory --outfile /path/to/mmproj.gguf --mistral-common --mmproj --outtype auto

Create the Q4_K_M model with the command below:
llama-quantize --output-tensor-type q8_0 model.gguf output-q4_k_m.gguf q4_k_m

Run inference with llama-mtmd-cli.exe:

llama-mtmd-cli.exe -m "Path/to/Mistral-Medium-3.5-128B-Q4_K_M.gguf" --mmproj "Path/to/Mistral-Medium-3.5-128B-mmproj.gguf" --image "test.jpg" -p "What is shown in this image ?" --jinja

Loading the mmproj model fails with this error:

clip_init: failed to load model 'D:\Models\LLM-VLM-models\DawnRidge\HF-Official-GGUF-model\Mistral-Medium-3.5-128B-mmproj.gguf': operator(): unable to find tensor v.token_embd.img_break

mtmd_init_from_file: error: Failed to load CLIP model from D:\Models\LLM-VLM-models\DawnRidge\HF-Official-GGUF-model\Mistral-Medium-3.5-128B-mmproj.gguf
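
To narrow this down, it may help to check whether the converted mmproj file actually contains the tensor the loader asks for (v.token_embd.img_break). The script below is a minimal sketch for that check, not part of llama.cpp: it reads only the GGUF header, skips the metadata, and lists the tensor names. It assumes a little-endian GGUF v2/v3 file laid out as in the ggml GGUF specification; the function name gguf_tensor_names is hypothetical.

```python
import struct
import sys

# GGUF metadata value type id -> struct format (fixed-size scalar types only).
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read one little-endian value described by a struct format."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8", errors="replace")


def _skip_value(f, vtype):
    """Skip over one metadata value without decoding it."""
    if vtype in _SCALARS:
        f.seek(struct.calcsize(_SCALARS[vtype]), 1)
    elif vtype == _STRING:
        f.seek(_read(f, "<Q"), 1)
    elif vtype == _ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        if elem_type in _SCALARS:
            f.seek(count * struct.calcsize(_SCALARS[elem_type]), 1)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_tensor_names(path):
    """Return the tensor names listed in a GGUF file's header (hypothetical helper)."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version = _read(f, "<I")
        if version < 2:
            raise ValueError(f"unsupported GGUF version {version}")
        n_tensors = _read(f, "<Q")
        n_kv = _read(f, "<Q")
        for _ in range(n_kv):
            _read_string(f)                   # metadata key
            _skip_value(f, _read(f, "<I"))    # metadata value
        names = []
        for _ in range(n_tensors):
            names.append(_read_string(f))
            n_dims = _read(f, "<I")
            # Skip dims (uint64 each), ggml type (uint32), data offset (uint64).
            f.seek(8 * n_dims + 4 + 8, 1)
        return names


if __name__ == "__main__" and len(sys.argv) > 1:
    wanted = "v.token_embd.img_break"
    names = gguf_tensor_names(sys.argv[1])
    print(f"{len(names)} tensors; {wanted} present: {wanted in names}")
    for name in names:
        if "token_embd" in name or "img_break" in name:
            print("  ", name)
```

Run it with the path to the mmproj file as the only argument. If the tensor is absent, the problem is more likely in the conversion step than in llama-mtmd-cli itself.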

First Bad Commit

Not known

Relevant log output

llama-mtmd-cli-verbose.txt
llama-mtmd-cli-error.txt
