[Question]Qwen2.5-VL cannot be saved in gguf format #2526

@AGI-is-going-to-arrive

unsloth==2025.5.1
unsloth_zoo==2025.5.1

After fine-tuning the unsloth/Qwen2.5-VL-3B-Instruct model with unsloth, the fine-tuned model runs normally. However, when I try to save it in GGUF format using the following command:

model.save_pretrained_gguf("./save_dir", quantization_type="q8_0")

The following error appears:


RuntimeError Traceback (most recent call last)
Cell In[25], line 1
----> 1 model.save_pretrained_gguf("./save_dir", quantization_type="q8_0")

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
113 @functools.wraps(func)
114 def decorate_context(*args, **kwargs):
115 with ctx_factory():
--> 116 return func(*args, **kwargs)

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/unsloth/save.py:2247, in save_to_gguf_generic(model, save_directory, quantization_type, repo_id, token)
2244 install_llama_cpp(just_clone_repo = True)
2245 pass
-> 2247 metadata = _convert_to_gguf(
2248 save_directory,
2249 print_output = True,
2250 quantization_type = quantization_type,
2251 )
2252 if repo_id is not None:
2253 prepare_saving(
2254 model,
2255 repo_id,
2256 is_gguf = True,
2257 save_directory = save_directory,
2258 metadata = metadata,
2259 token = token,
2260 )

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/unsloth_zoo/llama_cpp.py:692, in convert_to_gguf(input_folder, output_filename, quantization_type, max_shard_size, print_output, print_outputs)
689 pass
691 if metadata is None:
--> 692 raise RuntimeError(f"Unsloth: Failed to convert {conversion_filename} to GGUF.")
694 printed_metadata = "\n".join(metadata)
695 if print_output: print(f"Unsloth: Successfully saved GGUF to:\n{printed_metadata}")

RuntimeError: Unsloth: Failed to convert llama.cpp/unsloth_convert_hf_to_gguf.py to GGUF.

Likewise, when I try to upload it to Hugging Face using the following command:

model.push_to_hub_gguf("haha/qwen2.5-vl-gguf-q8", tokenizer, quantization_type = "Q8_0")

The following error appears:


TypeError Traceback (most recent call last)
Cell In[27], line 1
----> 1 model.push_to_hub_gguf("haha/qwen2.5-vl-gguf-q8", tokenizer, quantization_type = "Q8_0")

File ~/anaconda3/envs/mocr/lib/python3.10/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
113 @functools.wraps(func)
114 def decorate_context(*args, **kwargs):
115 with ctx_factory():
--> 116 return func(*args, **kwargs)

TypeError: save_to_gguf_generic() got multiple values for argument 'quantization_type'
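
One possible reading of this TypeError, sketched under assumptions: Python raises "got multiple values for argument" when a positional argument and a keyword argument both bind to the same parameter. The stand-in below copies only the parameter list that the first traceback shows for save_to_gguf_generic; its body, the default values, the FakeModel class, and the way the function is attached as a method are all hypothetical and are not taken from unsloth's source. With that parameter list, the second positional argument (the tokenizer) would land on quantization_type, which then collides with the keyword.

```python
# Stand-in with the parameter list shown in the traceback above.
# The body and default values are hypothetical, not unsloth's implementation.
def save_to_gguf_generic(model, save_directory, quantization_type="Q8_0",
                         repo_id=None, token=None):
    return save_directory, quantization_type, repo_id


class FakeModel:
    """Hypothetical model object; attaching the function here is an assumption."""
    push_to_hub_gguf = save_to_gguf_generic


def try_push():
    """Repeat the call shape from the issue and return the error text, if any."""
    model = FakeModel()
    try:
        # model         -> model (bound as the first argument)
        # repo string   -> save_directory
        # "tokenizer"   -> quantization_type (positional)
        # keyword       -> quantization_type again, hence the collision
        model.push_to_hub_gguf(
            "haha/qwen2.5-vl-gguf-q8", "tokenizer", quantization_type="Q8_0"
        )
    except TypeError as exc:
        return str(exc)
    return None


message = try_push()
print(message)
```

If this reading is right, the TypeError comes from how the arguments are bound rather than from the quantization type itself, which would be consistent with bf16 and f16 failing the same way.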

Besides using q8_0, I have also tried bf16 and f16, and the above errors still occur.

However, I can successfully save it in the safetensors format using the following command:

model.save_pretrained_merged("qwen2.5-vl-sat", tokenizer)

This command runs successfully and saves the model.

Is it possible that unsloth does not support saving Qwen2.5-VL models in GGUF format?
