When trying to load peft models from a specific revision, unsloth attempts to load the base model with that revision. This leads to the misleading error:
RuntimeError: Unsloth: `chreh/tmp_model` is not a full model or a PEFT model.
Code to reproduce the issue is below.
# =========
# file1.py
# this file should be run before file2.py
# =========
# run the below to initialize our repository
import unsloth
tinyllama, tinytokenizer = unsloth.FastLanguageModel.from_pretrained("unsloth/tinyllama-bnb-4bit")
# get the peft model
tinyllama = unsloth.FastLanguageModel.get_peft_model(tinyllama)
# save with a revision - important for reproducing the error
tinyllama.push_to_hub("chreh/tmp_model", revision="tinyllama")
# =========
# file2.py
# this file should be run after file1.py
# =========
import unsloth
# raises the RuntimeError shown above
expected_tinyllama, expected_tinytokenizer = unsloth.FastLanguageModel.from_pretrained("chreh/tmp_model", revision="tinyllama")
# =========
# file3.py
# this file should be run after file1.py.
# It demonstrates that unsloth uploads files that can be downloaded by PeftModel but not by unsloth's FastLanguageModel
# =========
from transformers import AutoModelForCausalLM
from peft import PeftModelForCausalLM
expected_tinyllama = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
expected_tinyllama = PeftModelForCausalLM.from_pretrained(expected_tinyllama, "chreh/tmp_model", revision="tinyllama")
error source
To get a better look at what causes the error, change file1.py to the following:
# =========
# file1.py
# this file should be run before file2.py
# =========
# run the below to initialize our repository
import unsloth
# push a llama-3 adapter to the default branch first
llama, tokenizer = unsloth.FastLanguageModel.from_pretrained("unsloth/llama-3-8b-bnb-4bit")
llama = unsloth.FastLanguageModel.get_peft_model(llama)
llama.push_to_hub("chreh/tmp_model")
tinyllama, tinytokenizer = unsloth.FastLanguageModel.from_pretrained("unsloth/tinyllama-bnb-4bit")
# get the peft model
tinyllama = unsloth.FastLanguageModel.get_peft_model(tinyllama)
# save with a revision - important for reproducing the error
tinyllama.push_to_hub("chreh/tmp_model", revision="tinyllama")
Running file2.py then produces a longer error (a mix of RevisionNotFound errors and OSErrors) ending with:
tinyllama is not a valid git identifier (branch name, tag name or commit id) that exists for this model name. Check the model page at 'https://huggingface.co/unsloth/llama-3-8b-Instruct-bnb-4bit' for available revisions.
This suggests that unsloth applies the revision parameter when loading the base model rather than when loading the PEFT weights.
It probably stems from the lines below, where neither config lookup receives the revision.
# First check if it's a normal model via AutoConfig
is_peft = False
try:
    model_config = AutoConfig.from_pretrained(model_name, token = token)
    is_peft = False
except:
    try:
        # Most likely a PEFT model
        peft_config = PeftConfig.from_pretrained(model_name, token = token)
    except:
        raise RuntimeError(f"Unsloth: `{model_name}` is not a full model or a PEFT model.")
Source: unsloth/unsloth/models/loader.py, lines 95 to 106 at commit 8a9e24e
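One possible direction for a fix, sketched here under assumptions rather than taken from unsloth itself: pass the revision to both config lookups, and when the repository turns out to hold a PEFT adapter, keep the revision for the adapter only and load the base model from its default branch. The function `resolve_revisions` and the two fake loaders are hypothetical names used for illustration. In real use the loaders would presumably be `AutoConfig.from_pretrained` and `PeftConfig.from_pretrained`, both of which I believe accept a `revision` keyword.

```python
from typing import Callable, Optional, Tuple


def resolve_revisions(
    model_name: str,
    revision: Optional[str],
    load_full_config: Callable[..., object],
    load_peft_config: Callable[..., object],
) -> Tuple[bool, Optional[str], Optional[str]]:
    """Return (is_peft, adapter_revision, base_revision).

    Hypothetical helper: the loaders are injected so the logic can be
    exercised without network access.
    """
    try:
        # Full model: the revision belongs to the model repository itself.
        load_full_config(model_name, revision=revision)
        return False, None, revision
    except Exception:
        pass

    try:
        # Most likely a PEFT model: look for the adapter config on the revision.
        load_peft_config(model_name, revision=revision)
    except Exception as err:
        raise RuntimeError(
            f"Unsloth: `{model_name}` is not a full model or a PEFT model."
        ) from err

    # The revision names a branch of the adapter repository, so the base
    # model is loaded without it.
    return True, revision, None


# Fake loaders that mimic the repository from file1.py (assumption: the
# adapter config exists only on the "tinyllama" revision).
def fake_full_config(name, revision=None):
    raise OSError("no config.json in this repository")


def fake_peft_config(name, revision=None):
    if revision != "tinyllama":
        raise OSError("no adapter_config.json on this revision")
    return {"base_model_name_or_path": "unsloth/tinyllama-bnb-4bit"}


result = resolve_revisions(
    "chreh/tmp_model", "tinyllama", fake_full_config, fake_peft_config
)
print(result)  # (True, 'tinyllama', None)
```

With these fakes, omitting the revision reproduces the misleading RuntimeError from the top of this report, while passing it identifies the repository as a PEFT adapter and leaves the base model revision unset.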