[BUG] GGUF files created from Gemma 3 models lose the 'vision' capability #2290

@Mano-Wii

  1. Bug Description:
    A GGUF file exported with the Gemma3_(4B) Colab notebook does not report the 'vision' capability when the model is loaded in ollama.

  2. Reproduction Steps:

  • Get model and tokenizer from unsloth:
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name = "unsloth/gemma-3-4b-it",
    max_seq_length = 2048, # Choose any for long context!
    load_in_4bit = True,  # 4 bit quantization to reduce memory
    load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory
    full_finetuning = False, # [NEW!] We have full finetuning now!
    # token = "hf_...", # use one if using gated models
)

  • Save the LoRA adapters (note the directory: the GGUF file must be saved to the same one)
model.save_pretrained("mano-wii/gemma-3-finetune")  # Local saving
tokenizer.save_pretrained("mano-wii/gemma-3-finetune")

  • Save to GGUF
model.save_pretrained_gguf(
    "mano-wii/gemma-3-finetune",
    quantization_type = "Q8_0", # For now only Q8_0, BF16, F16 supported
)

  • Optionally upload to Hugging Face:
from unsloth_zoo.saving_utils import prepare_saving
hf_token = "hf_..."  # placeholder: a Hugging Face token with write access
repo_id = "mano-wii/gemma-3-finetune-gguf"
prepare_saving(
    model,
    repo_id,
    push_to_hub = True,
    max_shard_size = "50GB",
    private = False,
    token = hf_token,
)

from huggingface_hub import HfApi
api = HfApi(token = hf_token)
api.upload_folder(
    folder_path = "mano-wii",
    repo_id = repo_id,
    repo_type = "model",
    allow_patterns = ["*.gguf"],
)
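
Since the upload above only matches *.gguf under the mano-wii folder, it can help to list which files that pattern would pick up before pushing. This is a minimal sketch using only the standard library. list_gguf_files is a hypothetical helper name, not part of unsloth or huggingface_hub, and the recursive search is only an approximation of how allow_patterns is applied:

```python
from pathlib import Path


def list_gguf_files(root):
    """Return (relative path, size in bytes) for every .gguf file under root."""
    root_path = Path(root)
    found = []
    # rglob walks subdirectories too, e.g. mano-wii/gemma-3-finetune/
    for path in sorted(root_path.rglob("*.gguf")):
        if path.is_file():
            relative = path.relative_to(root_path).as_posix()
            found.append((relative, path.stat().st_size))
    return found


if __name__ == "__main__":
    # "mano-wii" is the folder_path passed to upload_folder above.
    for name, size in list_gguf_files("mano-wii"):
        print(f"{size:>14,d}  {name}")
```
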
  3. Expected Behavior:
    Running ollama show hf.co/mano-wii/gemma-3-finetune-gguf should list vision under Capabilities, as it does for gemma3:
PS D:\> ollama show gemma3
  Model
    architecture        gemma3
    parameters          4.3B
    context length      8192
    embedding length    2560
    quantization        Q4_K_M

  Capabilities
    completion
    vision

  Parameters
    stop           "<end_of_turn>"
    temperature    0.1
  4. Actual Behavior:
    vision is missing from Capabilities in the output of ollama show hf.co/mano-wii/gemma-3-finetune-gguf:
PS D:\> ollama show hf.co/mano-wii/gemma-3-finetune-gguf
  Model
    architecture        gemma3
    parameters          3.9B
    context length      131072
    embedding length    2560
    quantization        unknown

  Capabilities
    completion

  Parameters
    stop           "<end_of_turn>"
    temperature    0.1
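
The two outputs can also be compared programmatically. This is a minimal sketch, assuming the ollama show layout stays as printed above: a Capabilities heading, then one entry per line, ended by a blank line. parse_capabilities is a hypothetical helper, not part of ollama:

```python
def parse_capabilities(show_output):
    """Return the entries listed under the 'Capabilities' heading."""
    capabilities = []
    in_section = False
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped == "Capabilities":
            in_section = True
            continue
        if in_section:
            if not stripped:
                break  # a blank line ends the section (assumed layout)
            capabilities.append(stripped)
    return capabilities


# Abbreviated copy of the fine-tuned model's output shown above.
finetuned_output = """\
  Model
    architecture        gemma3
    parameters          3.9B

  Capabilities
    completion

  Parameters
    stop           "<end_of_turn>"
"""

print("vision" in parse_capabilities(finetuned_output))  # prints False
```

To run it on live output, the text could be captured with subprocess.run(["ollama", "show", name], capture_output=True, text=True).stdout, assuming the ollama CLI is on PATH.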
