GGUF breaks - llama-3 #430

@danielhanchen

Findings from ggml-org/llama.cpp#7062 and Discord chats:
Notebook for repro: https://colab.research.google.com/drive/1djwQGbEJtUEZo_OuqzN_JF6xSOUKhm4q?usp=sharing

  1. Unsloth + float16 + QLoRA = WORKS
  2. Unsloth + bfloat16 + QLoRA = WORKS
  3. Unsloth + bfloat16 + LoRA = WORKS
  4. Unsloth + float16 + QLoRA + GGUF-f16 = FAILS
  5. Unsloth + bfloat16 + LoRA + GGUF-f16 = FAILS

Todo:

  • HF directly + float16 + QLoRA + GGUF-f16
  • HF directly + float16 + LoRA + GGUF-f16
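The findings above can be cross-checked mechanically. The sketch below is only an illustration, not part of the repro notebook: it encodes the five reported runs as sets of components and looks for components that appear in every failing run and in no working run. The names `FINDINGS` and `suspects` are hypothetical.

```python
# The five runs reported above, encoded as (components, outcome).
FINDINGS = [
    ({"Unsloth", "float16", "QLoRA"}, "WORKS"),
    ({"Unsloth", "bfloat16", "QLoRA"}, "WORKS"),
    ({"Unsloth", "bfloat16", "LoRA"}, "WORKS"),
    ({"Unsloth", "float16", "QLoRA", "GGUF-f16"}, "FAILS"),
    ({"Unsloth", "bfloat16", "LoRA", "GGUF-f16"}, "FAILS"),
]


def suspects(findings):
    """Components present in every failing run and absent from every working run."""
    failing = [parts for parts, outcome in findings if outcome == "FAILS"]
    working = [parts for parts, outcome in findings if outcome == "WORKS"]
    common_to_failures = set.intersection(*failing)
    seen_in_working = set.union(*working)
    return common_to_failures - seen_in_working


print(sorted(suspects(FINDINGS)))  # prints ['GGUF-f16']
```

On these five runs the only component unique to the failures is the GGUF-f16 export. Unsloth appears in every run, so this data alone cannot separate an Unsloth problem from a conversion problem. The "HF directly" runs in the Todo list would presumably settle that.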
