I'm encountering an error while trying to save the default unsloth Llama 3.1 model to GGUF format. The issue occurs when running the code on Google Colab with a T4 GPU.
Environment:
- Google Colab
- T4 GPU
- Default unsloth Llama 3.1 code
Changes made:
- Updated API key
- Updated Hugging Face username
Code snippet:
# Save to q4_k_m GGUF
if True: model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")
if True: model.push_to_hub_gguf("chukypedro/testing", tokenizer, quantization_method = "q4_k_m", token = "api_key")
# Save to multiple GGUF options - much faster if you want multiple!
if True:
    model.push_to_hub_gguf(
        "myname/testing", # Change hf to your username!
        tokenizer,
        quantization_method = ["q4_k_m"],
        token = "", # Get a token at https://huggingface.co/settings/tokens
    )
Error traceback:
RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/torch/serialization.py in __exit__(self, *args)
    497
    498     def __exit__(self, *args) -> None:
--> 499         self.file_like.write_end_of_file()
    500         if self.file_stream is not None:
    501             self.file_stream.close()

RuntimeError: [enforce fail at inline_container.cc:603] . unexpected pos 576 vs 470
Expected behavior:
The model should save successfully to GGUF format without any errors.
Actual behavior:
The saving process fails with a RuntimeError, indicating an unexpected position in the file.
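One thing that may be worth ruling out: as far as I can tell, this kind of `inline_container.cc` "unexpected pos" error is often reported when a write is cut short, for example when the disk fills up while the checkpoint is being written. That is an assumption, not a confirmed cause of this failure. A minimal sketch for checking free space on the Colab disk before saving (the helper name `free_disk_gib` is hypothetical and not part of unsloth or PyTorch):

```python
import shutil


def free_disk_gib(path: str = ".") -> float:
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: (total, used, free) in bytes
    return usage.free / (1024 ** 3)


# Check the working directory, which is where the "model" folder is written.
print(f"Free disk space: {free_disk_gib('.'):.1f} GiB")
```

If the reported free space is small compared with the size of the merged model plus the GGUF output, that would be consistent with the disk-space explanation, but it would not prove it.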