Calling model.save_pretrained_gguf() runs to completion and logs a successful GGUF export, but the expected modelfile is not generated in the output directory.
if True: # Change to True to save to GGUF
    save_path = "/content/drive/MyDrive/model/gemma3/gemma-3-finetune"
    model.save_pretrained_gguf(
        save_path,
        quantization_type = "f16", # For now only Q8_0, BF16, F16 supported
    )
Unsloth: Updating system package directories
Unsloth: Install GGUF and other packages
Unsloth GGUF:hf-to-gguf:Loading model: gemma-3-finetune
Unsloth GGUF:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
Unsloth GGUF:hf-to-gguf:Exporting model...
Unsloth GGUF:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
Unsloth GGUF:hf-to-gguf:gguf: loading model part 'model-00001-of-00002.safetensors'
Unsloth GGUF:hf-to-gguf:token_embd.weight, torch.bfloat16 --> F16, shape = {2560, 262208}
Unsloth GGUF:hf-to-gguf:gguf: loading model part 'model-00002-of-00002.safetensors'
Unsloth GGUF:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {2560}
Unsloth GGUF:hf-to-gguf:Set meta model
Unsloth GGUF:hf-to-gguf:Set model parameters
Unsloth GGUF:hf-to-gguf:Set model tokenizer
Unsloth GGUF:gguf.vocab:Setting special token type bos to 2
Unsloth GGUF:gguf.vocab:Setting special token type eos to 106
Unsloth GGUF:gguf.vocab:Setting special token type unk to 3
Unsloth GGUF:gguf.vocab:Setting special token type pad to 0
Unsloth GGUF:gguf.vocab:Setting add_bos_token to True
Unsloth GGUF:gguf.vocab:Setting add_eos_token to False
Unsloth GGUF:gguf.vocab:Setting chat_template to {{ bos_token }}
{%- if messages[0]['role'] == 'system' -%}
{%- if messages[0]['content'] is string -%}
{%- set first_user_prefix = messages[0]['content'] + '
' -%}
{%- else -%}
{%- set first_user_prefix = messages[0]['content'][0]['text'] + '
..... Chat template truncated .....
Unsloth GGUF:hf-to-gguf:Set model quantization version
Unsloth GGUF:gguf.gguf_writer:Writing the following files:
Unsloth GGUF:gguf.gguf_writer:/content/drive/MyDrive/model/gemma3/gemma-3-finetune.F16.gguf: n_tensors = 444, total_size = 7.8G
Unsloth: GGUF conversion: 100%
100/100 [00:42<00:00, 2.83it/s, 7.76G/7.76G]
Unsloth GGUF:hf-to-gguf:Model successfully exported to /content/drive/MyDrive/model/gemma3/
Unsloth: Converted to /content/drive/MyDrive/model/gemma3/gemma-3-finetune.F16.gguf with size = 7.8G
Unsloth: Successfully saved GGUF to:
/content/drive/MyDrive/model/gemma3/gemma-3-finetune.F16.gguf
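
To confirm what was actually written, the output directory can be listed after the export finishes. The snippet below is a minimal sketch, not part of Unsloth: summarize_export_dir is a hypothetical helper, and it assumes the missing file would have a name starting with "Modelfile" (an assumption, since the log above never names it). The directory is taken from the log output.

```python
from pathlib import Path


def summarize_export_dir(output_dir):
    """Return (gguf_files, modelfile_candidates) found directly in output_dir.

    Hypothetical helper for this report; the "modelfile" name prefix is an
    assumption about what the missing file would be called.
    """
    root = Path(output_dir)
    if not root.is_dir():
        # Directory missing (e.g. Drive not mounted): nothing to report.
        return [], []
    entries = sorted(p for p in root.iterdir() if p.is_file())
    gguf_files = [p.name for p in entries if p.suffix.lower() == ".gguf"]
    modelfiles = [p.name for p in entries if p.name.lower().startswith("modelfile")]
    return gguf_files, modelfiles


if __name__ == "__main__":
    # Directory from the log above; adjust if the export path differs.
    gguf, modelfiles = summarize_export_dir("/content/drive/MyDrive/model/gemma3")
    print("GGUF files:", gguf)
    print("Modelfile candidates:", modelfiles)
```

If the first list contains the .F16.gguf file and the second list is empty, that matches the behavior described here: the GGUF is saved but no modelfile accompanies it.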