RuntimeError: Unsloth: Quantization failed #835

@kolomichyk

Hi!
I'm not using Kaggle, so why am I getting this exception?
I'm just trying to save a GGUF model.

model.save_pretrained_gguf("model", tokenizer, quantization_method = "f16")

RuntimeError: Unsloth: Quantization failed for ./model/unsloth.F16.gguf
You are in a Kaggle environment, which might be the reason this is failing.
Kaggle only provides 20GB of disk space. Merging to 16bit for 7b models use 16GB of space.
This means using model.{save_pretrained/push_to_hub}_merged works, but
`model.{save_pretrained/push_to_hub}_gguf will use too much disk space.
I suggest you to save the 16bit model first, then use manual llama.cpp conversion.
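
For anyone trying the fallback the error message suggests (save the 16-bit model first, then convert manually with llama.cpp), here is a minimal sketch. It checks free disk space, since the message blames disk usage, and builds the conversion command without running it. Assumptions: llama.cpp is cloned into a local `llama.cpp` directory, the converter script is named `convert_hf_to_gguf.py` (older checkouts used a different file name), and the merged model is written to `model`. The helper names `free_gib` and `llama_cpp_convert_cmd` are made up for illustration and are not part of Unsloth or llama.cpp.

```python
import shutil
from pathlib import Path


def free_gib(path="."):
    """Return free disk space, in GiB, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / 2**30


def llama_cpp_convert_cmd(merged_dir, outfile, llama_cpp_dir="llama.cpp", outtype="f16"):
    """Build (but do not run) the argv for llama.cpp's HF-to-GGUF converter.

    `llama_cpp_dir` and the script name are assumptions about the local checkout.
    """
    script = Path(llama_cpp_dir) / "convert_hf_to_gguf.py"
    return ["python", str(script), str(merged_dir),
            "--outfile", str(outfile), "--outtype", outtype]


if __name__ == "__main__":
    print(f"Free disk space here: {free_gib():.1f} GiB")

    # Step 1 (requires unsloth and a loaded model, so it is left commented out):
    # model.save_pretrained_merged("model", tokenizer, save_method="merged_16bit")

    # Step 2: the manual conversion command, to be run from a shell.
    print(" ".join(llama_cpp_convert_cmd("model", "model/unsloth.F16.gguf")))
```

If the free space reported is well above the size of the merged model, disk space is probably not the cause, and running the printed command by hand should surface the underlying llama.cpp error.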
