
[BUGFIX] Mistral format yarn apply_scale support. #22612

Merged
CISC merged 3 commits into ggml-org:master from juliendenize:mistral_format_apply_scale
May 3, 2026

@juliendenize

Contributor

Overview

This PR fixes the Mistral-format conversion so that it properly handles apply_scale for YaRN scaling. Previously, models whose params.json set apply_scale=True still had it disabled during conversion, leading to poor performance at mid-to-long context lengths.
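
A minimal sketch of the idea, not the actual change in convert_hf_to_gguf.py. It assumes apply_scale lives under a "yarn" block in params.json and that apply_scale=True should end up as a YaRN log-multiplier of 0.0, as discussed in this thread. The function name choose_yarn_log_mul and the value_when_disabled parameter are hypothetical, and the value used for the disabled case is deliberately left to the caller.

```python
import json
from typing import Optional


def choose_yarn_log_mul(params_text: str, value_when_disabled: float) -> Optional[float]:
    """Pick a YaRN log-multiplier from the text of a params.json file.

    Returns None when the file has no "yarn" block at all.
    """
    params = json.loads(params_text)
    yarn = params.get("yarn")
    if yarn is None:
        return None  # no YaRN scaling configured

    # Assumption taken from the thread: apply_scale=True maps to 0.0 so that
    # standard YaRN scaling is applied at load time.
    if yarn["apply_scale"]:
        return 0.0

    # Hypothetical: the value for the disabled case is supplied by the caller.
    return value_when_disabled


example = '{"yarn": {"factor": 16.0, "apply_scale": true}}'
print(choose_yarn_log_mul(example, value_when_disabled=1.0))
```

The point is only that the converter has to read apply_scale and branch on it, rather than writing one fixed value for every model.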

@juliendenize juliendenize requested a review from CISC as a code owner May 2, 2026 10:05
@github-actions github-actions Bot added the python python script changes label May 2, 2026
@juliendenize juliendenize reopened this May 2, 2026
@juliendenize

juliendenize commented May 2, 2026

Contributor Author

Sorry for closing and reopening back and forth; I was just checking whether my understanding was correct.

IIUC at llama-model.cpp line 2871:

ml.get_key(LLM_KV_ROPE_SCALING_YARN_LOG_MUL,   hparams.rope_yarn_log_mul, 0.0f);

does put the value to 0 when unset. So if I just set it explicitly as I did in the PR, it should have the same effect.

My goal here is making sure standard yarn scaling is applied.

@juliendenize juliendenize reopened this May 2, 2026
@CISC

CISC commented May 2, 2026

Member
> ml.get_key(LLM_KV_ROPE_SCALING_YARN_LOG_MUL,   hparams.rope_yarn_log_mul, 0.0f);
>
> does put the value to 0 when unset. So if I just set it explicitly as I did in the PR, it should have the same effect.

I think someone misunderstood: the last parameter is a bool, with false meaning the key is not required, which leaves hparams.rope_yarn_log_mul untouched (it defaults to 0.0f) if the key doesn't exist.
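
The semantics described above can be modelled with a small Python toy. This is only a sketch: ToyLoader, HParams and the KEY string are illustrative assumptions, not llama.cpp's real loader or key name. It shows that a 0.0 passed in the last position behaves like false ("not required") and is never used as a fallback value.

```python
from dataclasses import dataclass


@dataclass
class HParams:
    rope_yarn_log_mul: float = 0.0  # default that survives when the key is absent


class ToyLoader:
    """Toy stand-in for a metadata loader; not the real llama.cpp class."""

    def __init__(self, kv: dict):
        self.kv = dict(kv)

    def get_key(self, key: str, hparams: HParams, field: str, required=True) -> bool:
        if key not in self.kv:
            if required:  # a float 0.0 passed here is falsy, just like false
                raise KeyError(f"key not found in model: {key}")
            return False  # destination field is left untouched
        setattr(hparams, field, self.kv[key])
        return True


KEY = "rope.scaling.yarn_log_mul"  # illustrative key name only

# Key missing, last argument 0.0: treated as "not required", nothing is written.
hp = HParams()
found = ToyLoader({}).get_key(KEY, hp, "rope_yarn_log_mul", 0.0)
print(found, hp.rope_yarn_log_mul)

# Key present: the stored value overwrites the default.
hp2 = HParams()
ToyLoader({KEY: 1.0}).get_key(KEY, hp2, "rope_yarn_log_mul", False)
print(hp2.rope_yarn_log_mul)
```

In this toy, the field ends up at 0.0 for a missing key only because that is its default, which is why writing 0 explicitly during conversion gives the same result.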

Comment thread convert_hf_to_gguf.py Outdated
@isaac-mcfadyen

isaac-mcfadyen commented May 2, 2026

Contributor

Note that Mistral applied a fix upstream themselves after Unsloth notified them of the issue: https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/discussions/18

Once this is merged, will the combination of proper apply_scale=True and the mscale_all_dim=0.0 cause any issues?
EDIT: Ignore me, I re-read this and realized that's exactly what the apply_scale param would do with this PR...

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@juliendenize

juliendenize commented May 2, 2026

Contributor Author

> I think someone misunderstood: the last parameter is a bool, with false meaning the key is not required, which leaves hparams.rope_yarn_log_mul untouched (it defaults to 0.0f) if the key doesn't exist.

Yes, I get it now: explicitly setting it to 0 ends up having the same behavior, so that's OK! Thanks a lot for approving!
Edit: OK, based on your last commit, I hadn't actually gotten it. Thanks for the contribution ^^'

> Note that Mistral applied a fix upstream themselves after Unsloth notified them of the issue: https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/discussions/18
> EDIT: Ignore me, I re-read this and realized that's exactly what the apply_scale param would do with this PR...

Thanks for double-checking; indeed, this is related to the same issue. The fix on the model hub is for the Transformers conversion. A while back, with the help of CISC and Son, we added an alternative Mistral format for conversion in case Transformers weights are not accessible, but before this change it had the same issue.

@CISC CISC added the merge ready A maintainer can use this label to indicate that they consider the changes final and ready to merge. label May 2, 2026
Comment thread convert_hf_to_gguf.py
@CISC CISC merged commit 048a490 into ggml-org:master May 3, 2026
1 check passed
samuraieng pushed a commit to samuraieng/llama.cpp that referenced this pull request May 6, 2026
* [BUGFIX] Mistral format apply_scale support.

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* fix misunderstood boolean parameters

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
cetarthoriphros pushed a commit to cetarthoriphros/llama.cpp that referenced this pull request May 9, 2026
meh pushed a commit to meh/llama.cpp that referenced this pull request May 10, 2026
pwilkin added a commit to pwilkin/llama.cpp that referenced this pull request May 13, 2026
Ports 15 upstream commits (05e141a..5d44db6) that touched the
monolithic convert_hf_to_gguf.py into the new conversion/*.py layout
introduced by the refactor split.

New text/mmproj architectures registered:
  GraniteSpeechForConditionalGeneration, MiMoV2ForCausalLM,
  MiniCPMV4_6ForConditionalGeneration, Sarashina2VisionForCausalLM,
  SarvamMoEForCausalLM (+ modeling_sarvam_moe.SarvamMoEForCausalLM).

Notable changes:
- filter_tensors classmethod added to ModelBase/TextModel/MmprojModel
  and wired into index_tensors; many model classes refactored to move
  tensor-name skip/rename logic out of modify_tensors and into
  filter_tensors (upstream ggml-org#22597).
- LlamaModel._repack_nvfp4 override (Q/K RoPE permutation, ggml-org#22611).
- MistralModel yarn apply_scale support (ggml-org#22612).
- Gemma4Model._generate_nvfp4_tensors override for 26B NVFP4 (ggml-org#22804).
- LlavaVisionModel image-break token fallback for Mistral params.json
  -1 placeholders (ggml-org#22914).
- Pixtral 12B --mistral-format conversion fixes (ggml-org#22981).
- FP8 KV-cache scales fix (ggml-org#22818) and uint dtype byteswap disable
  (ggml-org#18908).

New files:
  conversion/sarashina2.py (Sarashina2VL text + vision)
baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026
carlosfundora pushed a commit to carlosfundora/llama.cpp-1-bit-turbo that referenced this pull request May 24, 2026
(cherry picked from commit 048a490)
winstonma pushed a commit to winstonma/llama.cpp that referenced this pull request May 27, 2026
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026