Bug Description
After following the Gemma3_(4B) Colab notebook, the model exported to a .gguf file does not report the 'vision' capability when it is loaded in Ollama.
Reproduction Steps:
- Get model and tokenizer from unsloth:
    from unsloth import FastModel
    model, tokenizer = FastModel.from_pretrained(
        model_name = "unsloth/gemma-3-4b-it",
        max_seq_length = 2048, # Choose any for long context!
        load_in_4bit = True, # 4 bit quantization to reduce memory
        load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory
        full_finetuning = False, # [NEW!] We have full finetuning now!
        # token = "hf_...", # use one if using gated models
    )
- Save the LoRA adapters, then export to GGUF (note the directory: the GGUF export must use the same directory as the adapters):
    model.save_pretrained("mano-wii/gemma-3-finetune") # Local saving
    tokenizer.save_pretrained("mano-wii/gemma-3-finetune")
    model.save_pretrained_gguf(
        "mano-wii/gemma-3-finetune",
        quantization_type = "Q8_0", # For now only Q8_0, BF16, F16 supported
    )
- Optionally upload to Hugging Face:
    from unsloth_zoo.saving_utils import prepare_saving
    # hf_token is assumed to hold a Hugging Face token with write access
    repo_id = "mano-wii/gemma-3-finetune-gguf"
    prepare_saving(
        model,
        repo_id,
        push_to_hub = True,
        max_shard_size = "50GB",
        private = False,
        token = hf_token,
    )
    from huggingface_hub import HfApi
    api = HfApi(token = hf_token)
    # Upload only the .gguf files found under the local "mano-wii" folder
    api.upload_folder(
        folder_path = "mano-wii",
        repo_id = repo_id,
        repo_type = "model",
        allow_patterns = ["*.gguf"],
    )
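Before uploading, it may help to check whether the exported .gguf carries any vision-related metadata at all. The sketch below is not part of the notebook: it reads only the GGUF header (assuming GGUF version 2 or later, where counts are 64-bit) and returns the metadata key names. Treating keys that contain "vision" as the signal Ollama looks for is an assumption on my side, and `gguf_metadata_keys` is a hypothetical helper name.

```python
import io
import struct

# GGUF scalar value types mapped to little-endian struct formats.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read and unpack a single little-endian value."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8", errors="replace")


def _skip_value(f, vtype):
    """Advance past one metadata value without decoding it."""
    if vtype in _SCALARS:
        f.seek(struct.calcsize(_SCALARS[vtype]), io.SEEK_CUR)
    elif vtype == _STRING:
        f.seek(_read(f, "<Q"), io.SEEK_CUR)
    elif vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        if item_type in _SCALARS:
            f.seek(count * struct.calcsize(_SCALARS[item_type]), io.SEEK_CUR)
        else:
            for _ in range(count):
                _skip_value(f, item_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_metadata_keys(f):
    """Return the metadata key names from a binary GGUF stream (v2+ assumed)."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _read(f, "<I")             # format version
    _read(f, "<Q")             # tensor count (unused here)
    kv_count = _read(f, "<Q")  # number of metadata key/value pairs
    keys = []
    for _ in range(kv_count):
        keys.append(_read_string(f))
        _skip_value(f, _read(f, "<I"))
    return keys
```

Opening the exported file in binary mode and filtering the result with `[k for k in gguf_metadata_keys(f) if "vision" in k]` should show whether any such keys were written. An empty list would at least be consistent with the missing capability reported below.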
- Expected Behavior:
Running ollama show hf.co/mano-wii/gemma-3-finetune-gguf should list vision under Capabilities, as it does for the stock gemma3 model:
    PS D:\> ollama show gemma3
      Model
        architecture        gemma3
        parameters          4.3B
        context length      8192
        embedding length    2560
        quantization        Q4_K_M
      Capabilities
        completion
        vision
      Parameters
        stop           "<end_of_turn>"
        temperature    0.1
- Actual Behavior:
For the exported model, Capabilities lists only completion, with no vision entry (the quantization is also reported as unknown):
    PS D:\> ollama show hf.co/mano-wii/gemma-3-finetune-gguf
      Model
        architecture        gemma3
        parameters          3.9B
        context length      131072
        embedding length    2560
        quantization        unknown
      Capabilities
        completion
      Parameters
        stop           "<end_of_turn>"
        temperature    0.1
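To make the comparison repeatable, the Capabilities section can be pulled out of the ollama show text with a few lines of Python. This is only a sketch: `parse_capabilities` is a hypothetical helper, and the set of section headers is assumed from the two outputs in this report (other Ollama versions may print additional sections).

```python
# Section headers seen in the outputs above; treated as an assumption.
KNOWN_HEADERS = {"Model", "Capabilities", "Parameters"}


def parse_capabilities(show_output):
    """Return the entries listed under 'Capabilities' in `ollama show` text."""
    capabilities = []
    in_section = False
    for raw in show_output.splitlines():
        line = raw.strip()
        if line == "Capabilities":
            in_section = True
            continue
        if not in_section:
            continue
        if line in KNOWN_HEADERS:
            break  # the next section has started
        if not line:
            if capabilities:
                break  # a blank line ends the section
            continue
        capabilities.append(line)
    return capabilities


# Outputs copied from this report (shortened to the relevant sections).
STOCK = """Model
architecture gemma3
Capabilities
completion
vision
Parameters
temperature 0.1
"""

EXPORTED = """Model
architecture gemma3
Capabilities
completion
Parameters
temperature 0.1
"""

print("stock:", parse_capabilities(STOCK))
print("exported:", parse_capabilities(EXPORTED))
```

With the two outputs from this report, the stock gemma3 model yields completion and vision, while the exported model yields completion only.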