Error saving GGUF of Gemma27B (but not Gemma4B) on DGX Spark

After successfully finetuning the vision model Gemma27B (4-bit), saving it as GGUF fails with the error below. The process uses only about 65 GB of the 128 GB of unified RAM available. The error does not occur when I finetune the smaller Gemma4B (4-bit) on the same vision dataset.

I am grateful for any advice.

> {'loss': 0.0248, 'grad_norm': 0.3881801664829254, 'learning_rate': 8.695652173913045e-09, 'epoch': 20.0}                                        
> {'train_runtime': 196532.9404, 'train_samples_per_second': 0.192, 'train_steps_per_second': 0.006, 'train_loss': 0.07668430322393154, 'epoch': 20.0}
> 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 1200/1200 [54:35:32<00:00, 163.78s/it]
> Unsloth: ##### The current model auto adds a BOS token.
> Unsloth: ##### Your chat template has a BOS token. We shall remove it temporarily.
> Unsloth: Merging model weights to 16-bit format...
> Detected local model directory: /workspace/AIEngine/medgemma-27b-it
> Copied tokenizer.model from local model directory
> Found HuggingFace hub cache directory: /root/.cache/huggingface/hub
> Unsloth: Preparing safetensor model files:   0%|                                                                         | 0/12 [00:00<?, ?it/s]Copied model-00003-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:   8%|█████▍                                                           | 1/12 [00:02<00:22,  2.02s/it]Copied model-00006-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  17%|██████████▊                                                      | 2/12 [00:04<00:25,  2.52s/it]Copied model-00012-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  25%|████████████████▎                                                | 3/12 [00:05<00:13,  1.45s/it]Copied model-00009-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  33%|█████████████████████▋                                           | 4/12 [00:06<00:12,  1.62s/it]Copied model-00002-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  42%|███████████████████████████                                      | 5/12 [00:08<00:12,  1.76s/it]Copied model-00007-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  50%|████████████████████████████████▌                                | 6/12 [00:10<00:10,  1.82s/it]Copied model-00010-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  58%|█████████████████████████████████████▉                           | 7/12 [00:13<00:09,  1.96s/it]Copied model-00008-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  67%|███████████████████████████████████████████▎                     | 8/12 [00:15<00:08,  2.00s/it]Copied model-00004-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  75%|████████████████████████████████████████████████▊                | 9/12 [00:17<00:06,  2.00s/it]Copied model-00001-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  83%|█████████████████████████████████████████████████████▎          | 10/12 [00:21<00:05,  2.60s/it]Copied model-00011-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files:  92%|██████████████████████████████████████████████████████████▋     | 11/12 [00:23<00:02,  2.45s/it]Copied model-00005-of-00012.safetensors from local model directory
> Unsloth: Preparing safetensor model files: 100%|████████████████████████████████████████████████████████████████| 12/12 [00:25<00:00,  2.10s/it]
> Unsloth: Merging weights into 16bit: 100%|██████████████████████████████████████████████████████████████████████| 12/12 [07:34<00:00, 37.89s/it]
> Unsloth: Merge process complete. Saved to `/home/ollam3/unsloth_finetune`
> Unsloth: Converting to GGUF format...
> ==((====))==  Unsloth: Conversion from HF to GGUF information
>    \\   /|    [0] Installing llama.cpp might take 3 minutes.
> O^O/ \_/ \    [1] Converting HF to GGUF bf16 might take 3 minutes.
> \        /    [2] Converting GGUF bf16 to ['q4_k_m'] might take 10 minutes each.
>  "-____-"     In total, you will have to wait at least 16 minutes.
> 
> Unsloth: llama.cpp found in the system. Skipping installation.
> Unsloth: Preparing converter script...
> Unsloth: [1] Converting model into bf16 GGUF format.
> This might take 3 minutes...
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.12/dist-packages/unsloth_zoo/llama_cpp.py", line 991, in convert_to_gguf
>     subprocess.run(command, shell=True, check=True, capture_output=True)
>   File "/usr/lib/python3.12/subprocess.py", line 571, in run
>     raise CalledProcessError(retcode, process.args,
> subprocess.CalledProcessError: Command 'python llama.cpp/unsloth_convert_hf_to_gguf.py --outfile medgemma-27b-it.BF16.gguf --outtype bf16 --split-max-size 50G unsloth_finetune' returned non-zero exit status 1.
> 
> During handling of the above exception, another exception occurred:
> 
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.12/dist-packages/unsloth/save.py", line 1835, in unsloth_save_pretrained_gguf
>     all_file_locations, want_full_precision, is_vlm_update = save_to_gguf(
>                                                              ^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.12/dist-packages/unsloth/save.py", line 1099, in save_to_gguf
>     initial_files, is_vlm_update = convert_to_gguf(
>                                    ^^^^^^^^^^^^^^^^
>   File "/usr/local/lib/python3.12/dist-packages/unsloth_zoo/llama_cpp.py", line 995, in convert_to_gguf
>     raise RuntimeError(f"Unsloth: Failed to convert {description} to GGUF: {e}")
> RuntimeError: Unsloth: Failed to convert text model to GGUF: Command 'python llama.cpp/unsloth_convert_hf_to_gguf.py --outfile medgemma-27b-it.BF16.gguf --outtype bf16 --split-max-size 50G unsloth_finetune' returned non-zero exit status 1.
> 
> During handling of the above exception, another exception occurred:
> 
> Traceback (most recent call last):
>   File "/home/ollam3/finetunevisionGemma3_Herz.py", line 217, in <module>
>     model.save_pretrained_gguf("unsloth_finetune", tokenizer, quantization_method = "q4_k_m")
>   File "/usr/local/lib/python3.12/dist-packages/unsloth/save.py", line 1855, in unsloth_save_pretrained_gguf
>     raise RuntimeError(f"Unsloth: GGUF conversion failed: {e}")
> RuntimeError: Unsloth: GGUF conversion failed: Unsloth: Failed to convert text model to GGUF: Command 'python llama.cpp/unsloth_convert_hf_to_gguf.py --outfile medgemma-27b-it.BF16.gguf --outtype bf16 --split-max-size 50G unsloth_finetune' returned non-zero exit status 1.
> 

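The traceback shows the converter being launched with `capture_output=True`, and its own error message does not appear in the log above. A minimal sketch for re-running the same command by hand and printing the end of its stderr, assuming it is started from the directory that contains `llama.cpp/` and `unsloth_finetune` (the helper name `run_and_show_stderr` is hypothetical and not part of Unsloth):

```python
import subprocess

# Command copied verbatim from the CalledProcessError in the log above.
CONVERT_COMMAND = (
    "python llama.cpp/unsloth_convert_hf_to_gguf.py "
    "--outfile medgemma-27b-it.BF16.gguf --outtype bf16 "
    "--split-max-size 50G unsloth_finetune"
)


def run_and_show_stderr(command: str, tail_lines: int = 40) -> tuple[int, str]:
    """Run a shell command and return its exit status and the last stderr lines.

    Hypothetical helper for diagnosis only; it does not raise on failure.
    """
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    tail = "\n".join(result.stderr.splitlines()[-tail_lines:])
    return result.returncode, tail


if __name__ == "__main__":
    status, stderr_tail = run_and_show_stderr(CONVERT_COMMAND)
    print(f"exit status: {status}")
    print(stderr_tail)
```

The last lines of stderr should contain the converter's actual exception, which the wrapped `RuntimeError` messages above only summarize as "returned non-zero exit status 1".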
Error saving GGUF of Gemma27B (but not Gemma4B) on DGX Spark #3581