[Question] Qwen2.5-VL cannot be saved in GGUF format #2526

unsloth==2025.5.1
unsloth_zoo==2025.5.1


After I fine-tuned the **unsloth/Qwen2.5-VL-3B-Instruct** model using unsloth, it runs normally. However, when I try to **save it in GGUF format** using the following command:

model.save_pretrained_gguf("./save_dir", quantization_type="q8_0")


The following error appears:


---------------------------------------------------------------------------
RuntimeError                                Traceback (most recent call last)
Cell In[25], line 1
----> 1 model.save_pretrained_gguf("./save_dir", quantization_type="q8_0")

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    113 @functools.wraps(func)
    114 def decorate_context(*args, **kwargs):
    115     with ctx_factory():
--> 116         return func(*args, **kwargs)

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/unsloth/save.py:2247, in save_to_gguf_generic(model, save_directory, quantization_type, repo_id, token)
   2244     install_llama_cpp(just_clone_repo = True)
   2245 pass
-> 2247 metadata = _convert_to_gguf(
   2248     save_directory,
   2249     print_output = True,
   2250     quantization_type = quantization_type,
   2251 )
   2252 if repo_id is not None:
   2253     prepare_saving(
   2254         model,
   2255         repo_id,
   2256         is_gguf = True,
   2257         save_directory = save_directory,
   2258         metadata = metadata,
   2259         token = token,
   2260     )

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/unsloth_zoo/llama_cpp.py:692, in convert_to_gguf(input_folder, output_filename, quantization_type, max_shard_size, print_output, print_outputs)
    689 pass
    691 if metadata is None:
--> 692     raise RuntimeError(f"Unsloth: Failed to convert {conversion_filename} to GGUF.")
    694 printed_metadata = "\n".join(metadata)
    695 if print_output: print(f"Unsloth: Successfully saved GGUF to:\n{printed_metadata}")

RuntimeError: Unsloth: Failed to convert llama.cpp/unsloth_convert_hf_to_gguf.py to GGUF.


Uploading it to Hugging Face fails as well. I used the following command:


model.push_to_hub_gguf("haha/qwen2.5-vl-gguf-q8", tokenizer, quantization_type = "Q8_0")


The following error appears:


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[27], line 1
----> 1 model.push_to_hub_gguf("haha/qwen2.5-vl-gguf-q8", tokenizer, quantization_type = "Q8_0")

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    113 @functools.wraps(func)
    114 def decorate_context(*args, **kwargs):
    115     with ctx_factory():
--> 116         return func(*args, **kwargs)

TypeError: save_to_gguf_generic() got multiple values for argument 'quantization_type'
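
One possible reading of this `TypeError`, based only on the parameter order shown in the first traceback (`save_to_gguf_generic(model, save_directory, quantization_type, repo_id, token)`): if `push_to_hub_gguf` is bound to that function, the positional `tokenizer` would land in the `quantization_type` slot and then collide with the keyword argument of the same name. The sketch below uses a stand-in function with the same parameter names to reproduce the message in plain Python. The default values and the function body are assumptions for illustration, not unsloth's actual code.

```python
# Stand-in with the parameter order taken from the traceback above.
# The defaults and the body are hypothetical; this is NOT unsloth's implementation.
def save_to_gguf_generic(model, save_directory, quantization_type="Q8_0",
                         repo_id=None, token=None):
    return save_directory, quantization_type


def reproduce():
    """Call the stand-in the same way the failing cell calls push_to_hub_gguf."""
    model, tokenizer = object(), object()  # placeholders for the real objects
    try:
        # `model` plays the role of `self`; `tokenizer` is the third positional
        # argument, so it fills `quantization_type` before the keyword does.
        save_to_gguf_generic(model, "haha/qwen2.5-vl-gguf-q8", tokenizer,
                             quantization_type="Q8_0")
    except TypeError as exc:
        return str(exc)
    return None


message = reproduce()
print(message)
```

If this reading is right, the error comes from how the arguments are bound rather than from the chosen quantization type, which would match the fact that changing the type does not help.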


Besides `q8_0`, I have also tried `bf16` and `f16`; both errors still occur.

However, I can successfully save it in the **safetensors** format using the following command:


model.save_pretrained_merged("qwen2.5-vl-sat", tokenizer)


This command runs successfully and saves the model.

Is it possible that unsloth **does not support** saving **Qwen2.5-VL** models in **GGUF** format?

