Describe the bug
Unsloth cannot convert a fine-tuned model based on unsloth/phi-4 to GGUF, because the embedded llama.cpp does not support the LlamaModel architecture that unsloth/phi-4 introduced into phi-4 as a bug fix.
- Environment Setup:
  - OS: [e.g., Ubuntu 20.04]
  - Python Version: [e.g., 3.10]
  - Frameworks/Libraries: unsloth
  - Colab / script (was this run in Colab or as a script): tried both, with the same result: a llama.cpp error (no support for LlamaModel).
- Model Details:
  - Model ID: unsloth/phi-4
  - Model Configuration: [e.g., LoRA params, quantization, etc.]
- Training Configuration:
  - Trainer Args: Not Applicable
- Reproduction Steps:
  - Minimal script to reproduce error:
    model.save_pretrained_gguf(
        "phi-4-finetune",
        quantization_type = "Q8_0",
    )
- Expected Behavior:
- Actual Behavior:
  - The llama.cpp build used for conversion in the script fails to convert the phi-4 fine-tune, because the modified architecture (LlamaModel) is not supported by llama.cpp.
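
To help confirm the diagnosis, here is a minimal sketch for checking which architecture name the saved model declares. It assumes a Hugging Face format checkpoint with a `config.json` exists in the output directory and that the converter picks its model class from the `architectures` entry there. The helper name `read_architectures` and the directory name are illustrative, not part of Unsloth or llama.cpp.

```python
import json
from pathlib import Path


def read_architectures(model_dir):
    """Return the `architectures` list from a saved model's config.json.

    Hypothetical helper for diagnosis only; not an Unsloth or llama.cpp API.
    """
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # A missing key yields an empty list rather than raising.
    return config.get("architectures", [])


if __name__ == "__main__":
    # Assumed output directory, taken from the reproduction script.
    model_dir = Path("phi-4-finetune")
    if (model_dir / "config.json").exists():
        print(read_architectures(model_dir))
    else:
        print(f"No config.json found in {model_dir}")
```

If the printed list contains `LlamaModel` rather than a causal LM class name, that would be consistent with the conversion error described above.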