clip : fix confused naming ffn_up and ffn_down #13290
Merged
ngxson merged 6 commits into ggml-org:master on May 5, 2025
Conversation
ngxson (Collaborator, Author):
Hi @slaren @ggerganov, sorry for pinging, but I realized that some models currently do not work correctly without this fix. If you have time, could you please review this PR? Thanks!
slaren approved these changes on May 5, 2025
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request on May 6, 2025:
* origin/master: (27 commits)
  llama : fix build_ffn without gate (ggml-org#13336)
  CUDA: fix bad asserts for partial offload (ggml-org#13337)
  convert : qwen2/3moe : set yarn metadata if present (ggml-org#13331)
  CUDA: fix --split-mode row for MMQ (ggml-org#13323)
  gguf-py : avoid requiring pyside6 for other scripts (ggml-org#13036)
  CUDA: fix logic for clearing padding with -ngl 0 (ggml-org#13320)
  sampling : Integrate Top-nσ into main sampling chain (and add it to the server) (ggml-org#13264)
  server : Webui - change setText command from parent window to also send the message. (ggml-org#13309)
  mtmd : rename llava directory to mtmd (ggml-org#13311)
  clip : fix confused naming ffn_up and ffn_down (ggml-org#13290)
  convert : bailingmoe : set yarn metadata if present (ggml-org#13312)
  SYCL: Disable mul_mat kernels for noncontiguous tensor b (ggml-org#13308)
  mtmd : add C public API (ggml-org#13184)
  rpc : use backend registry, support dl backends (ggml-org#13304)
  ggml : activate s390x simd for Q3_K (ggml-org#13301)
  llava/mtmd : fixes to fully support dl backends (ggml-org#13303)
  llama : build windows releases with dl backends (ggml-org#13220)
  CUDA: fix race condition in MMQ stream-k fixup (ggml-org#13299)
  CUDA: fix race condition in MMQ ids_dst (ggml-org#13294)
  vulkan: Additional type support for unary, binary, and copy (ggml-org#13266)
  ...
timwu pushed a commit to timwu/llama.cpp that referenced this pull request on Dec 20, 2025:
* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff
The old `clip.cpp` code reversed the `ffn_up` and `ffn_down` naming by mistake, which made it extremely messy when migrating the conversion script to `convert_hf_to_gguf.py`. To save myself some headaches, I decided to fix it once and for all 😤😤
The new rule is: `ffn_up` projects from `n_embd` to `n_ff`, and `ffn_down` projects from `n_ff` back to `n_embd`.
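A minimal sketch of that convention, assuming a plain GELU FFN with no gate or bias (the helper name `build_ffn_sketch` and its signature are illustrative, not the actual `clip.cpp` code):

```cpp
#include "ggml.h"

// Illustrative only: shows the shapes the ffn_up/ffn_down names now imply.
// clip.cpp's real FFN builder also handles gates, biases, and other activations.
static ggml_tensor * build_ffn_sketch(
        ggml_context * ctx,
        ggml_tensor  * cur,       // input:  [n_embd, n_tokens]
        ggml_tensor  * ffn_up,    // weight: [n_embd, n_ff]
        ggml_tensor  * ffn_down)  // weight: [n_ff, n_embd]
{
    cur = ggml_mul_mat(ctx, ffn_up, cur);   // expand:   [n_ff, n_tokens]
    cur = ggml_gelu(ctx, cur);              // activation
    cur = ggml_mul_mat(ctx, ffn_down, cur); // contract: [n_embd, n_tokens]
    return cur;
}
```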
This PR also contains renaming to align better with `llama.cpp` style:

- `n_intermediate` --> `n_ff`
- `hidden_size` --> `n_embd`
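For illustration, the renamed vision hparams might look like this (a hypothetical struct; the struct and field layout are assumptions, not copied from `clip.cpp`):

```cpp
#include <cstdint>

// Hypothetical sketch of the renamed hparams fields.
struct clip_vision_hparams_sketch {
    int32_t n_embd = 0; // was `hidden_size`: width of each token embedding
    int32_t n_ff   = 0; // was `n_intermediate`: width of the FFN hidden layer
};
```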
Small note: GGUF files converted with the old qwen surgery script have `n_ff = 0`; hopefully this will not be messy in the future.
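If such a file ever needs the value anyway, one option (a hypothetical fallback, not what this PR does; per the commit list it simply stops checking `n_ff`) is to recover it from the `ffn_up` weight's shape. `resolve_n_ff` is an invented helper name:

```cpp
#include "ggml.h"
#include <cstdint>

// Hypothetical fallback: an ffn_up weight has ggml shape [n_embd, n_ff],
// so a zero n_ff read from GGUF metadata can be recovered from the tensor.
static int32_t resolve_n_ff(int32_t n_ff_from_gguf, const ggml_tensor * ffn_up) {
    if (n_ff_from_gguf == 0 && ffn_up != nullptr) {
        return (int32_t) ffn_up->ne[1]; // ne[1] is the output (n_ff) dimension
    }
    return n_ff_from_gguf;
}
```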
Tested by converting a fresh new GGUF and running it with `llama-mtmd-cli` locally:

Tested with existing GGUF from the internet: