
convert : store ffn_gate_inp_shexp as F32 #19606

Merged
CISC merged 1 commit into master from cisc/convert-gate-inp-shexp-f32
Feb 14, 2026
Conversation

@CISC
Member

@CISC CISC commented Feb 13, 2026

This tensor has been inadvertently stored as BF16 in several models, and since it's 1D it will never be quantized.
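The reasoning above can be sketched as a small dtype-selection helper (a hypothetical standalone function, not the actual convert_hf_to_gguf.py code): quantization skips 1D tensors, so writing them as BF16 only loses precision with no later size benefit.

```python
def choose_dtype(name: str, n_dims: int, default: str = "bf16") -> str:
    """Pick the storage type for a tensor during conversion.

    Hypothetical sketch: 1D tensors (norms, expert-router gates such as
    ffn_gate_inp_shexp) are never quantized later, so store them as F32
    instead of BF16.
    """
    if n_dims == 1:
        return "f32"
    return default

print(choose_dtype("blk.0.ffn_gate_inp_shexp.weight", 1))  # f32
print(choose_dtype("blk.0.ffn_down.weight", 2))            # bf16
```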

@CISC CISC requested a review from ggerganov February 13, 2026 21:44
@github-actions github-actions bot added the python python script changes label Feb 13, 2026
@CISC CISC merged commit 0d00ef6 into master Feb 14, 2026
9 checks passed
@ggerganov
Member

@CISC Hijacking this thread to report a minor regression that I'm observing. I need the following patch to be able to convert https://huggingface.co/Qwen/Qwen3-30B-A3B-Base:

diff --git a/convert_hf_to_gguf.py b/convert_hf_to_gguf.py
index 825080b58..1e5209cdc 100755
--- a/convert_hf_to_gguf.py
+++ b/convert_hf_to_gguf.py
@@ -4161,7 +4161,7 @@ class Qwen2MoeModel(TextModel):
             return
 
         if name.find("experts") != -1:
-            n_experts = self.hparams["num_experts"]
+            n_experts = self.find_hparam(["num_local_experts", "num_experts"])
             assert bid is not None
 
             if self._experts is None:

@CISC CISC deleted the cisc/convert-gate-inp-shexp-f32 branch February 14, 2026 07:17
@CISC
Member Author

CISC commented Feb 14, 2026

@CISC Hijacking this thread to report a minor regression that I'm observing. I need the following patch to be able to convert https://huggingface.co/Qwen/Qwen3-30B-A3B-Base:

Hmmm, ok, not sure why that would have regressed, I'll look into it.

@ggerganov
Member

Last I tested in mid-Jan, it was converting successfully with this commit: 2bbe4c2. Not sure if it's a problem in the convert script, or something in the Python dependencies has changed.

@CISC
Member Author

CISC commented Feb 14, 2026

Last I tested in mid-Jan, it was converting successfully with this commit: 2bbe4c2. Not sure if it's a problem in the convert script, or something in the Python dependencies has changed.

Are you sure? I see the non-base model uses num_experts, I think this just changed in transformers by the time they released the base model.

Edit: Ah, I see the problem: it's transformers itself that changed. The key is num_experts in config.json, but AutoConfig returns num_local_experts.
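The fix in the patch above boils down to a key-fallback lookup. A minimal sketch of a find_hparam-style helper (hypothetical standalone version; the real method lives on the model class in convert_hf_to_gguf.py):

```python
def find_hparam(hparams: dict, keys: list, optional: bool = False):
    # Return the value of the first key present in the config dict, so the
    # conversion works whether the config exposes "num_experts" (raw
    # config.json) or "num_local_experts" (transformers AutoConfig).
    for key in keys:
        if key in hparams:
            return hparams[key]
    if optional:
        return None
    raise KeyError(f"could not find any of: {keys}")

raw_cfg = {"num_experts": 128}         # key as written in config.json
auto_cfg = {"num_local_experts": 128}  # key as exposed by AutoConfig
n_experts = find_hparam(raw_cfg, ["num_local_experts", "num_experts"])
print(n_experts)  # 128
```

Probing both names makes the conversion robust to transformers renaming config attributes between releases.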

liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Mar 3, 2026