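I am fine-tuning the Qwen3-30B-A3B-bnb-4bit model with unsloth. Everything worked normally up to the last step, but exporting to GGUF fails with the error below. I also tried saving first with model.save_pretrained_merged("/root/autodl-tmp/merged_model_4bit", tokenizer, save_method = "merged_4bit_forced",) and then converting with llama.cpp, but I get the same error.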
Unsloth GGUF:hf-to-gguf:Loading model: Qwen3-30B-A3B-bnb-4bit
Unsloth GGUF:hf-to-gguf:Model architecture: Qwen3MoeForCausalLM
Unsloth GGUF:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
Unsloth GGUF:hf-to-gguf:Exporting model...
Unsloth GGUF:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
Unsloth GGUF:hf-to-gguf:gguf: loading model part 'model-00001-of-00004.safetensors'
Unsloth GGUF:hf-to-gguf:token_embd.weight, torch.bfloat16 --> Q8_0, shape = {2048, 151936}
Traceback (most recent call last):
File "/root/llama.cpp/unsloth_convert_hf_to_gguf.py", line 6512, in <module>
main()
File "/root/llama.cpp/unsloth_convert_hf_to_gguf.py", line 6506, in main
model_instance.write()
File "/root/llama.cpp/unsloth_convert_hf_to_gguf.py", line 677, in write
self.prepare_tensors()
File "/root/llama.cpp/unsloth_convert_hf_to_gguf.py", line 3178, in prepare_tensors
super().prepare_tensors()
File "/root/llama.cpp/unsloth_convert_hf_to_gguf.py", line 549, in prepare_tensors
for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/llama.cpp/unsloth_convert_hf_to_gguf.py", line 3161, in modify_tensors
datas.append(self._experts[bid][ename])
~~~~~~~~~~~~~~~~~~^^^^^^^
KeyError: 'model.layers.0.mlp.experts.2.down_proj.weight'
RuntimeError Traceback (most recent call last)
Cell In[20], line 1
----> 1 if True: model.save_pretrained_gguf("/root/autodl-tmp/Qwen3-30B-A3B-bnb-4bit", quantization_type="q8_0")
File ~/miniconda3/envs/llama-factory/lib/python3.10/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
113 @functools.wraps(func)
114 def decorate_context(*args, **kwargs):
115 with ctx_factory():
--> 116 return func(*args, **kwargs)
File ~/miniconda3/envs/llama-factory/lib/python3.10/site-packages/unsloth/save.py:2247, in save_to_gguf_generic(model, save_directory, quantization_type, repo_id, token)
2244 install_llama_cpp(just_clone_repo = True)
2245 pass
-> 2247 metadata = _convert_to_gguf(
2248 save_directory,
2249 print_output = True,
2250 quantization_type = quantization_type,
2251 )
2252 if repo_id is not None:
2253 prepare_saving(
2254 model,
2255 repo_id,
(...)
2259 token = token,
2260 )
File ~/miniconda3/envs/llama-factory/lib/python3.10/site-packages/unsloth_zoo/llama_cpp.py:692, in convert_to_gguf(input_folder, output_filename, quantization_type, max_shard_size, print_output, print_outputs)
689 pass
691 if metadata is None:
--> 692 raise RuntimeError(f"Unsloth: Failed to convert {conversion_filename} to GGUF.")
694 printed_metadata = "\n".join(metadata)
695 if print_output: print(f"Unsloth: Successfully saved GGUF to:\n{printed_metadata}")
RuntimeError: Unsloth: Failed to convert llama.cpp/unsloth_convert_hf_to_gguf.py to GGUF.
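To narrow down the KeyError, a rough diagnostic sketch like the one below could list which per-expert weight names are absent from the saved checkpoint. It only reads the weight map from 'model.safetensors.index.json' (the file named in the log above). The helper names are hypothetical, and the gate_proj/up_proj projection names are an assumption on my part; only down_proj appears in the traceback. This does not fix the export, it just shows whether names such as 'model.layers.0.mlp.experts.2.down_proj.weight' exist in the saved files at all.

```python
import json
import re
from pathlib import Path

# Assumed projection names; only "down_proj" is confirmed by the traceback.
PROJECTIONS = ("gate_proj", "up_proj", "down_proj")


def load_weight_names(model_dir):
    """Read all tensor names from the sharded checkpoint's index file."""
    index_path = Path(model_dir) / "model.safetensors.index.json"
    index = json.loads(index_path.read_text())
    return list(index["weight_map"].keys())


def missing_expert_tensors(weight_names, layer=0, projections=PROJECTIONS):
    """Return expected '<expert>.<proj>.weight' names that are absent for one layer."""
    names = set(weight_names)
    pattern = re.compile(rf"^model\.layers\.{layer}\.mlp\.experts\.(\d+)\.")

    # Collect every expert id that shows up under this layer in any form.
    expert_ids = set()
    for name in names:
        match = pattern.match(name)
        if match:
            expert_ids.add(int(match.group(1)))
    if not expert_ids:
        return []

    # Check every id from 0 up to the highest one seen, so experts that are
    # missing entirely are reported as well.
    missing = []
    for expert_id in range(max(expert_ids) + 1):
        for proj in projections:
            key = f"model.layers.{layer}.mlp.experts.{expert_id}.{proj}.weight"
            if key not in names:
                missing.append(key)
    return missing
```

For example, missing_expert_tensors(load_weight_names("/root/autodl-tmp/merged_model_4bit"), layer=0) would return the layer 0 expert weights that are missing from the merged save, if any.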