Unsloth: config.json does not exist inside Gemma-3 #2098

@JdoubleU2

I'm having problems saving GGUFs of Gemma 3 finetunes. I first hit this in my container environment and assumed a problem during training was causing other files to be generated, but that is not the case. My Jupyter notebook is almost identical to the one on the blog, yet I still cannot get model.save_pretrained_gguf() to run successfully with Gemma 3 models. In fact, it doesn't even work in the current version of the notebook linked from the blog!

https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B).ipynb#scrollTo=FqfebeAdT073

I have tried multiple methods of generating the config.json file:

model.save_pretrained("Careconnect-gemma-3-4b-it")
tokenizer.save_pretrained("Careconnect-gemma-3-4b-it")
model.config.save_pretrained("Careconnect-gemma-3-4b-it")  # Try to ensure config is saved

from transformers import AutoConfig
# Force auto generate because model.config.save_pretrained() isn't working
config = AutoConfig.from_pretrained("Careconnect-gemma-3-4b-it") 
config.save_pretrained("Careconnect-gemma-3-4b-it")

Doing this, I was able to generate a config.json. However, model.save_pretrained_gguf() still fails in my container. I believe this stems from the container's limited external internet access and may not be related to the config issue.
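Before converting, it may help to check what the save directory actually contains, since a config.json on its own does not guarantee that full model weights are present. The following is a minimal sketch: the file names are the usual Hugging Face and PEFT conventions and are an assumption here, not something confirmed in this thread, and inspect_saved_dir is a hypothetical helper name.

```python
from pathlib import Path

# Assumed naming conventions (Hugging Face / PEFT); adjust if your setup differs.
WEIGHT_PATTERNS = ("model*.safetensors", "pytorch_model*.bin")
ADAPTER_FILES = ("adapter_config.json", "adapter_model.safetensors", "adapter_model.bin")


def inspect_saved_dir(save_dir):
    """Summarize which config, weight, and adapter files exist in save_dir."""
    root = Path(save_dir)
    if not root.is_dir():
        return {"has_config": False, "full_weights": [], "adapter_files": [], "adapter_only": False}

    # Full-model weight shards, e.g. model.safetensors or model-00001-of-00002.safetensors.
    full_weights = sorted(p.name for pattern in WEIGHT_PATTERNS for p in root.glob(pattern))
    # Adapter files that are typically written when only a LoRA adapter is saved.
    adapter_files = [name for name in ADAPTER_FILES if (root / name).is_file()]

    return {
        "has_config": (root / "config.json").is_file(),
        "full_weights": full_weights,
        "adapter_files": adapter_files,
        # True when adapter files exist but no full-model weights were found.
        "adapter_only": bool(adapter_files) and not full_weights,
    }
```

Calling inspect_saved_dir("Careconnect-gemma-3-4b-it") and finding an empty full_weights list would suggest the converter has no tensors to read from that directory, whatever the state of config.json.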

Unsloth: Updating system package directories  
Unsloth: Install GGUF and other packages
RuntimeError: Unsloth: Could not obtain https://github.com/ggerganov/llama.cpp/raw/refs/heads/master/convert_hf_to_gguf.py. Maybe you don't have internet ocnnection?
Traceback:
File "Cell [cell17]", line 1, in <module>
    model.save_pretrained_gguf(
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/unsloth/save.py", line 2246, in save_to_gguf_generic
    metadata = _convert_to_gguf(
File "/opt/conda/lib/python3.10/site-packages/unsloth_zoo/llama_cpp.py", line 653, in convert_to_gguf
    conversion_filename, supported_types = _download_convert_hf_to_gguf()
File "/opt/conda/lib/python3.10/site-packages/unsloth_zoo/llama_cpp.py", line 353, in _download_convert_hf_to_gguf
    raise RuntimeError(

As a workaround, I ran llama.cpp's convert_hf_to_gguf.py script directly and got this output:

python convert_hf_to_gguf.py Careconnect-gemma-3-4b-it --outtype q8_0 --outfile ./gguf_model


INFO:hf-to-gguf:Loading model: Careconnect-gemma-3-4b-it  
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only  
INFO:hf-to-gguf:Has vision encoder, but it will be ignored  
INFO:hf-to-gguf:Exporting model...  
INFO:hf-to-gguf:Set meta model  
INFO:hf-to-gguf:Set model parameters  
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Setting special token type bos to 2  
INFO:gguf.vocab:Setting special token type eos to 106  
INFO:gguf.vocab:Setting special token type unk to 3  
INFO:gguf.vocab:Setting special token type pad to 0  
INFO:gguf.vocab:Setting add_bos_token to True  
INFO:gguf.vocab:Setting add_eos_token to False  
INFO:gguf.vocab:Setting chat_template to {{ bos_token }}  
{%- if messages[0]['role'] == 'system' -%}  
    {%- if messages[0]['content'] is string -%}  
        {%- set first_user_prefix = messages[0]['content'] + '  

' -%}  
    {%- else -%}  
        {%- set first_user_prefix = messages[0]['content'][0]['text'] + '  

' -%}  
    {%- endif -%}  
    {%- set loop_messages = messages[1:] -%}  
{%- else -%}  
    {%- set first_user_prefix = "" -%}  
    {%- set loop_messages = messages -%}  
{%- endif -%}  
{%- for message in loop_messages -%}  
    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}  
        {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}  
    {%- endif -%}  
    {%- if (message['role'] == 'assistant') -%}  
        {%- set role = "model" -%}  
    {%- else -%}  
        {%- set role = message['role'] -%}  
    {%- endif -%}  
    {{ '<start_of_turn>' + role + '  
' + (first_user_prefix if loop.first else "") }}  
    {%- if message['content'] is string -%}  
        {{ message['content'] | trim }}  
    {%- elif message['content'] is iterable -%}  
        {%- for item in message['content'] -%}  
            {%- if item['type'] == 'image' -%}  
                {{ '<start_of_image>' }}  
            {%- elif item['type'] == 'text' -%}  
                {{ item['text'] | trim }}  
            {%- endif -%}  
        {%- endfor -%}  
    {%- else -%}  
        {{ raise_exception("Invalid content type") }}  
    {%- endif -%}  
    {{ '<end_of_turn>  
' }}  
{%- endfor -%}  
{%- if add_generation_prompt -%}  
    {{'<start_of_turn>model  
'}}  
{%- endif -%}  

INFO:hf-to-gguf:Set model quantization version  
INFO:gguf.gguf_writer:Writing the following files:  
INFO:gguf.gguf_writer:gguf_model: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]  
INFO:hf-to-gguf:NOTE: this script only convert the language model to GGUF  
INFO:hf-to-gguf:      for the vision model, please use gemma3_convert_encoder_to_gguf.py  
INFO:hf-to-gguf:Model successfully exported to gguf_model

Writing 0.00 bytes? 0:00 seconds? This cannot be right! The generated .gguf file is also only 6.3 MB!

-rw-r--r--. 1 root root 6.3M Mar 19 01:54 /home/app/Careconnect-gemma-3-4b-it.gguf
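The n_tensors = 0 line in the log can be cross-checked against the file itself by reading the GGUF header. This is a minimal sketch, assuming the GGUF version 2 or later little-endian header layout (4-byte magic "GGUF", uint32 version, uint64 tensor count, uint64 metadata key/value count); read_gguf_header is a hypothetical helper name, not part of llama.cpp or Unsloth.

```python
import struct


def read_gguf_header(path):
    """Read the fixed-size GGUF header and return version and counts.

    Assumes a little-endian file with format version >= 2, where both
    counts are stored as unsigned 64-bit integers.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic = {magic!r})")
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            # Version 1 stored the counts as 32-bit values; not handled in this sketch.
            raise ValueError(f"unsupported GGUF version: {version}")
        tensor_count, metadata_kv_count = struct.unpack("<QQ", f.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }
```

A tensor_count of 0 with a nonzero metadata_kv_count would match the "metadata only" message: the file holds the tokenizer and chat template but no weights, which would also be consistent with the small file size.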

This could be a problem with llama.cpp's Gemma 3 GGUF conversion rather than with Unsloth. But since the issue also appears in the Colab notebooks, I believe it should be looked into by more people / devs. I see that Unsloth does have Hugging Face repositories with the GGUF files, so there should be a workaround.
