Description
When I run the following code:
if True:
    model.push_to_hub_gguf(
        "my/Repo",
        tokenizer,
        quantization_method = ["q4_k_m", "q8_0", "q5_k_m",],
        token = "hf_myTokenIputHere",
    )
- Environment Setup:
- Google Colab
- Environment setup:
%%capture
import os
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise use pip install unsloth
    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf datasets huggingface_hub hf_transfer
    !pip install --no-deps unsloth
- Dataset Details:
- Dataset Name: Sweaterdog/Andy-4-FT
- Data Preprocessing Steps: ChatML setup, with ShareGPT style dataset
# Load the first dataset and map it to the ShareGPT format
from unsloth.chat_templates import get_chat_template
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "chatml", # Supports zephyr, chatml, mistral, llama, alpaca, vicuna, vicuna_old, unsloth
    mapping = {"role" : "from", "content" : "value", "user" : "human", "assistant" : "gpt"}, # ShareGPT style
    map_eos_token = True, # Maps <|im_end|> to </s> instead
)
def formatting_prompts_func(examples):
    convos = examples["conversations"]
    texts = [tokenizer.apply_chat_template(convo, tokenize = False, add_generation_prompt = False) for convo in convos]
    return { "text" : texts, }
pass
from datasets import load_dataset
dataset = load_dataset("Sweaterdog/Andy-4-FT", split = "train")
dataset = dataset.map(formatting_prompts_func, batched = True,)
- Model Details:
- Model ID: unsloth/llava-v1.6-mistral-7b-hf-bnb-4bit
- Model Configuration: Only the text layers are fine-tuned. LoRA rank 32, LoRA alpha 128.
- Training Configuration:
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import is_bfloat16_supported
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=6, # Increase CPU processing for faster data loading/preprocessing
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=2,
        warmup_ratio=0.1,
        num_train_epochs=0.05,
        learning_rate=4e-5,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=100,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="cosine",
        seed=3407,
        output_dir="outputs",
        save_steps=5000,
    ),
)
- Reproduction Steps:
- Expected Behavior:
- The model is saved to Hugging Face instead of raising an error.
- Actual Behavior:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-bfda09c104fc> in <cell line: 0>()
      1 if True:
----> 2     model.push_to_hub_gguf(
      3         "Sweaterdog/llava-GGUF",
      4         tokenizer,
      5         quantization_method = ["q4_k_m", "q8_0", "q5_k_m",],

/usr/local/lib/python3.11/dist-packages/torch/utils/_contextlib.py in decorate_context(*args, **kwargs)
    114         def decorate_context(*args, **kwargs):
    115             with ctx_factory():
--> 116                 return func(*args, **kwargs)
    117
    118         return decorate_context

TypeError: save_to_gguf_generic() got an unexpected keyword argument 'quantization_method'
- Additional notes:
- This same code has worked intermittently: it worked in the past, then failed, then worked again for a while, and now fails again.
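
For anyone triaging this: the traceback shows the call ending up in `save_to_gguf_generic()`, which rejects `quantization_method`. A quick way to confirm which keywords a given callable accepts is to inspect its signature. The sketch below uses only the standard library; `accepts_keyword` and `save_stub` are hypothetical names made up for illustration, and `save_stub` is a stand-in, not the real Unsloth function or its signature.

```python
import inspect

def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())

# Hypothetical stand-in with no `quantization_method` parameter,
# used here only to mimic the failure mode from the traceback.
def save_stub(model, save_directory, tokenizer=None):
    pass

print(accepts_keyword(save_stub, "quantization_method"))
print(accepts_keyword(save_stub, "tokenizer"))
```

In the notebook, the same helper could be pointed at `model.push_to_hub_gguf` before pushing. Note that a wrapper declared as `(*args, **kwargs)` without `functools.wraps` would report that it accepts everything, so a `True` result there is not conclusive.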