convert : fix Pixtral 12B --mistral-format conversion (3 bugs) by fredzillman · Pull Request #22981 · ggml-org/llama.cpp

fredzillman · 2026-05-12T17:06:18Z

Summary

Three small fixes in convert_hf_to_gguf.py to make --mistral-format work
end-to-end on Pixtral 12B (2409) consolidated weights. Without these,
conversion crashes before writing the GGUF; with them, F16 -> Q4_K_M -> mtmd
inference produces correct image output. No inference-side changes required --
tools/mtmd/clip.cpp already reads the bias tensors when present.

Bugs

1. `LlamaModel.__init__` crashes on mistral-format input (line ~2867)

ModelBase.load_hparams(self.dir_model, is_mistral_format=False) is called
unconditionally to read architectures for origin_hf_arch (used downstream
to detect SmolVLM2 etc.). Mistral consolidated layouts have no config.json,
so this raises FileNotFoundError for any --mistral-format run.

Fix: skip the HF-only lookup when self.is_mistral_format is True and set
self.origin_hf_arch = None. The field is only consulted by HF-specific
branches, so None is safe in the mistral path.
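The guard can be sketched in isolation as follows. This is a minimal illustration rather than the actual diff: `load_hparams_hf` and `resolve_origin_hf_arch` are hypothetical stand-ins for `ModelBase.load_hparams` and the relevant part of `LlamaModel.__init__`, and the real converter keeps this logic inline.

```python
import json
from pathlib import Path
from typing import Optional


def load_hparams_hf(dir_model: Path) -> dict:
    # Hypothetical stand-in for the HF-only lookup: reads config.json and
    # raises FileNotFoundError when the file does not exist.
    with open(dir_model / "config.json", encoding="utf-8") as f:
        return json.load(f)


def resolve_origin_hf_arch(dir_model: Path, is_mistral_format: bool) -> Optional[str]:
    # Mistral consolidated layouts have no config.json, so the HF lookup
    # is skipped and the architecture is left unset.
    if is_mistral_format:
        return None
    hparams = load_hparams_hf(dir_model)
    archs = hparams.get("architectures") or [None]
    return archs[0]
```

With `is_mistral_format=True` the function never touches the filesystem, which is the property the fix relies on.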

2. `PixtralModel.set_gguf_parameters` requires `mm_projector_id` (line ~13321)

self.find_vparam(["mm_projector_id"]) is a mandatory lookup, but Pixtral 12B
2409 uses a plain linear projector and ships no mm_projector_id in params.json
-- only the newer Mistral Small 3.1 sets it to patch_merge. Result: KeyError
on every Pixtral 12B conversion.

Fix: pass optional=True. The if body runs only when the value equals
patch_merge, so a None result skips it and the existing patch-merge path
is unaffected.
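A rough sketch of why the optional lookup is safe. `find_param` is a hypothetical stand-in written for this example; it mimics the behavior described above (raise on a missing key unless the lookup is optional) and is not the converter's own `find_vparam`.

```python
from typing import Any, Iterable, Optional


def find_param(params: dict, keys: Iterable[str], optional: bool = False) -> Optional[Any]:
    # Hypothetical stand-in: return the first key that is present, and raise
    # KeyError only when the caller did not mark the lookup as optional.
    keys = list(keys)
    for key in keys:
        if key in params:
            return params[key]
    if optional:
        return None
    raise KeyError(f"could not find any of: {keys}")


def uses_patch_merge(vision_params: dict) -> bool:
    # A missing mm_projector_id yields None, which fails the equality check
    # instead of aborting the conversion.
    return find_param(vision_params, ["mm_projector_id"], optional=True) == "patch_merge"
```

An empty params dict (the Pixtral 12B 2409 case) gives `False`, while a dict that sets `mm_projector_id` to `patch_merge` still gives `True`.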

3. `PixtralModel.map_tensor_name` rejects adapter biases (line ~13330)

Only .weight is mapped for vision_language_adapter.w_in / w_out; .bias
falls through to super().map_tensor_name and raises. Pixtral 12B 2409's
consolidated.safetensors ships both biases.

Fix: add the two .bias branches mapping to mm.1.bias / mm.2.bias. The
inference side already supports this -- see tools/mtmd/clip.cpp:2143-2148,
which loads mm.1.bias / mm.2.bias with the optional flag set. The GGUF
writer was simply refusing tensors the runtime is happy to consume.
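The mapping after the fix can be pictured as a small lookup table. This is an illustrative sketch, not the converter's code: `ADAPTER_TENSOR_MAP` and `map_adapter_tensor_name` are hypothetical names, and the two weight targets are assumed by analogy with the bias names given above.

```python
from typing import Optional

# The .bias targets are the ones named in this description; the .weight
# targets are an assumption made by analogy.
ADAPTER_TENSOR_MAP = {
    "vision_language_adapter.w_in.weight": "mm.1.weight",
    "vision_language_adapter.w_in.bias": "mm.1.bias",
    "vision_language_adapter.w_out.weight": "mm.2.weight",
    "vision_language_adapter.w_out.bias": "mm.2.bias",
}


def map_adapter_tensor_name(name: str) -> Optional[str]:
    # Return the GGUF name for an adapter tensor, or None so the caller can
    # fall back to the generic mapping for every other tensor.
    return ADAPTER_TENSOR_MAP.get(name)
```

Returning `None` for unknown names keeps the fallback to the parent mapping explicit, which is where the pre-patch code raised for the two bias tensors.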

Reproduction

python convert_hf_to_gguf.py \
    /path/to/pixtral-12b-2409 \
    --mistral-format --outtype f16 \
    --outfile pixtral-12b-2409-f16.gguf

against the official Mistral release of Pixtral-12B-2409 (consolidated
safetensors layout, no config.json). Pre-patch this fails at
LlamaModel.__init__ (FileNotFoundError on config.json). After bypassing
that it fails at set_gguf_parameters (KeyError mm_projector_id). After
bypassing that it fails inside map_tensor_name on
vision_language_adapter.w_in.bias.
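Before running the command, it can help to confirm that the directory really is a consolidated Mistral layout. The check below is a hypothetical pre-flight helper, not part of convert_hf_to_gguf.py; the file names come from this description, and treating "params.json present, config.json absent" as the marker of the mistral format is an assumption.

```python
from pathlib import Path
from typing import List

# File names taken from the description of the official release layout.
MISTRAL_FILES = ("consolidated.safetensors", "params.json", "tekken.json")


def is_mistral_layout(model_dir: Path) -> bool:
    # Heuristic (assumption): params.json is present and there is no
    # HF-style config.json next to it.
    has_params = (model_dir / "params.json").is_file()
    has_hf_config = (model_dir / "config.json").is_file()
    return has_params and not has_hf_config


def missing_mistral_files(model_dir: Path) -> List[str]:
    # List the expected files that are absent from the model directory.
    return [name for name in MISTRAL_FILES if not (model_dir / name).is_file()]
```

If `is_mistral_layout` returns `True` and `missing_mistral_files` returns an empty list, the directory matches the layout this reproduction was run against.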

Verification

- F16 GGUF written successfully from consolidated.safetensors +
  params.json + tekken.json.
- Q4_K_M produced via llama-quantize, loads cleanly.
- llama-mtmd-cli smoke test on a test image returned correct,
  image-grounded output. CUDA inference at ~47 tok/s eval on RTX 3090.
- No upstream files touched besides convert_hf_to_gguf.py.

Inference side

tools/mtmd/clip.cpp:2143-2148 already calls
get_tensor(TN_MM_INP_PROJ_B, /*optional=*/true) and
get_tensor(TN_MM_OUTP_PROJ_B, /*optional=*/true) for mm.1.bias and
mm.2.bias, so the converter change makes existing runtime behavior
reachable. No C++ change is included or needed.

Ports 15 upstream commits (05e141a..5d44db6) that touched the monolithic convert_hf_to_gguf.py into the new conversion/*.py layout introduced by the refactor split.

New text/mmproj architectures registered: GraniteSpeechForConditionalGeneration, MiMoV2ForCausalLM, MiniCPMV4_6ForConditionalGeneration, Sarashina2VisionForCausalLM, SarvamMoEForCausalLM (+ modeling_sarvam_moe.SarvamMoEForCausalLM).

Notable changes:
- filter_tensors classmethod added to ModelBase/TextModel/MmprojModel and wired into index_tensors; many model classes refactored to move tensor-name skip/rename logic out of modify_tensors and into filter_tensors (upstream ggml-org#22597).
- LlamaModel._repack_nvfp4 override (Q/K RoPE permutation, ggml-org#22611).
- MistralModel yarn apply_scale support (ggml-org#22612).
- Gemma4Model._generate_nvfp4_tensors override for 26B NVFP4 (ggml-org#22804).
- LlavaVisionModel image-break token fallback for Mistral params.json -1 placeholders (ggml-org#22914).
- Pixtral 12B --mistral-format conversion fixes (ggml-org#22981).
- FP8 KV-cache scales fix (ggml-org#22818) and uint dtype byteswap disable (ggml-org#18908).

New files: conversion/sarashina2.py (Sarashina2VL text + vision)

convert : fix Pixtral 12B --mistral-format conversion (3 bugs)

23d09b2

fredzillman requested a review from CISC as a code owner May 12, 2026 17:06

CISC approved these changes May 12, 2026

CISC requested a review from ngxson May 12, 2026 17:34

ngxson approved these changes May 12, 2026

github-actions Bot added the python python script changes label May 12, 2026

ngxson merged commit cce09f0 into ggml-org:master May 12, 2026
6 checks passed

xxmustafacooTR pushed a commit to xxPlayground/llama-cpp-turboquant that referenced this pull request May 13, 2026

convert : fix Pixtral 12B --mistral-format conversion (3 bugs) (ggml-…

4b70c44

…org#22981)

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 19, 2026

convert : fix Pixtral 12B --mistral-format conversion (3 bugs) (ggml-…

fba259e

…org#22981)

baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026

convert : fix Pixtral 12B --mistral-format conversion (3 bugs) (ggml-…

259a153

…org#22981)

carlosfundora pushed a commit to carlosfundora/llama.cpp-1-bit-turbo that referenced this pull request May 24, 2026

convert : fix Pixtral 12B --mistral-format conversion (3 bugs) (ggml-…

85d9f34

…org#22981) (cherry picked from commit cce09f0)

winstonma pushed a commit to winstonma/llama.cpp that referenced this pull request May 27, 2026

convert : fix Pixtral 12B --mistral-format conversion (3 bugs) (ggml-…

b07530f

…org#22981)

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

convert : fix Pixtral 12B --mistral-format conversion (3 bugs) (ggml-…

9c275d4

…org#22981)

ngxson merged 1 commit into ggml-org:master from fredzillman:fix/pixtral-mistral-format-conversion
