convert : bailingmoe : set yarn metadata if present by CISC · Pull Request #13312 · ggml-org/llama.cpp

CISC · 2025-05-05T10:29:54Z

Set YaRN metadata if present since support was finally added and it looks to be bog standard. Tested with Ling-Coder-lite (manually configured).
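As a rough sketch of what "set YaRN metadata if present" amounts to, the standalone function below maps a Hugging Face-style `rope_scaling` entry to GGUF-style metadata. The function name `yarn_metadata`, the plain-dict output, and the `<arch>.rope.scaling.*` key strings are assumptions made for illustration. This is not the converter's actual code, which writes through its GGUF writer.

```python
def yarn_metadata(hparams: dict, arch: str = "bailingmoe") -> dict:
    """Return YaRN-related metadata for a config, or {} if YaRN is not configured.

    Hypothetical helper: the key names follow the GGUF `<arch>.rope.scaling.*`
    convention as an assumption, and the result is a plain dict for illustration.
    """
    rope_scaling = hparams.get("rope_scaling") or {}
    # Some configs name the field "rope_type", others "type"; accept either.
    scaling_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    if scaling_type != "yarn" or "factor" not in rope_scaling:
        return {}  # nothing to set: YaRN is absent or incompletely specified
    meta = {
        f"{arch}.rope.scaling.type": "yarn",
        f"{arch}.rope.scaling.factor": float(rope_scaling["factor"]),
    }
    if "original_max_position_embeddings" in rope_scaling:
        meta[f"{arch}.rope.scaling.original_context_length"] = int(
            rope_scaling["original_max_position_embeddings"]
        )
    return meta
```

A config without `rope_scaling` yields an empty dict, which mirrors the "if present" behaviour: models that never declare YaRN are converted exactly as before.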

NOTE: For some reason they have not updated the config on any of their models, so you have to add or enable this yourself. This can be done in one of the following ways:

On the command line (no changes to old GGUFs necessary):

./llama-cli -m Ling-Coder-lite.gguf -c 16384 --rope-scaling yarn --rope-scale 4

In config.json, multiply max_position_embeddings by 4 and set rope_scaling to the following, then reconvert:

{
    "factor": 4.0,
    "original_max_position_embeddings": 4096,
    "type": "yarn"
}

Add/change metadata to old GGUF with gguf_editor_gui.py
Download dynamically modified GGUF using gguf-editor
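For the config.json route listed above, the edit can be scripted. The following is a minimal sketch, assuming the config has a top-level `max_position_embeddings` field and that the factor of 4 from the example applies. The helper name `enable_yarn` is hypothetical and not part of any llama.cpp tooling.

```python
import json
from pathlib import Path


def enable_yarn(config_path: str, factor: float = 4.0) -> dict:
    """Scale max_position_embeddings by `factor` and add a YaRN rope_scaling entry.

    Hypothetical helper for illustration. If the config already declares YaRN,
    it is returned unchanged so that running the script twice does not scale twice.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    existing = config.get("rope_scaling") or {}
    if existing.get("rope_type", existing.get("type")) == "yarn":
        return config  # already configured; leave the file alone

    original_ctx = int(config["max_position_embeddings"])
    config["max_position_embeddings"] = int(original_ctx * factor)
    config["rope_scaling"] = {
        "factor": factor,
        "original_max_position_embeddings": original_ctx,
        "type": "yarn",
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

With an original context of 4096 and a factor of 4, this produces a `max_position_embeddings` of 16384, matching the `-c 16384` used in the command-line variant. The model still has to be reconverted afterwards for the metadata to land in the GGUF.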

* origin/master: (27 commits)
  llama : fix build_ffn without gate (ggml-org#13336)
  CUDA: fix bad asserts for partial offload (ggml-org#13337)
  convert : qwen2/3moe : set yarn metadata if present (ggml-org#13331)
  CUDA: fix --split-mode row for MMQ (ggml-org#13323)
  gguf-py : avoid requiring pyside6 for other scripts (ggml-org#13036)
  CUDA: fix logic for clearing padding with -ngl 0 (ggml-org#13320)
  sampling : Integrate Top-nσ into main sampling chain (and add it to the server) (ggml-org#13264)
  server : Webui - change setText command from parent window to also send the message. (ggml-org#13309)
  mtmd : rename llava directory to mtmd (ggml-org#13311)
  clip : fix confused naming ffn_up and ffn_down (ggml-org#13290)
  convert : bailingmoe : set yarn metadata if present (ggml-org#13312)
  SYCL: Disable mul_mat kernels for noncontiguous tensor b (ggml-org#13308)
  mtmd : add C public API (ggml-org#13184)
  rpc : use backend registry, support dl backends (ggml-org#13304)
  ggml : activate s390x simd for Q3_K (ggml-org#13301)
  llava/mtmd : fixes to fully support dl backends (ggml-org#13303)
  llama : build windows releases with dl backends (ggml-org#13220)
  CUDA: fix race condition in MMQ stream-k fixup (ggml-org#13299)
  CUDA: fix race condition in MMQ ids_dst (ggml-org#13294)
  vulkan: Additional type support for unary, binary, and copy (ggml-org#13266)
  ...

set yarn metadata if present

c2bf512

github-actions bot added the python python script changes label May 5, 2025

CISC requested a review from ngxson May 5, 2025 10:30

ngxson approved these changes May 5, 2025

CISC merged commit ae803bf into master May 5, 2025
7 checks passed

CISC deleted the cisc/convert-bailingmoe-yarn branch May 5, 2025 10:34

timwu pushed a commit to timwu/llama.cpp that referenced this pull request Dec 20, 2025

convert : bailingmoe : set yarn metadata if present (ggml-org#13312)

ad5249b

CISC merged 1 commit into master from cisc/convert-bailingmoe-yarn
