Support loading models from revisions #624

@chrehall68

When loading a PEFT model from a specific revision, unsloth attempts to load the base model with that revision. This leads to a misleading error:

RuntimeError: Unsloth: chreh/tmp_model is not a full model or a PEFT model.

Code to reproduce the issue is below:

# =========
# file1.py
# this file should be run before file2.py
# =========
# run the below to initialize our repository
import unsloth
tinyllama, tinytokenizer = unsloth.FastLanguageModel.from_pretrained("unsloth/tinyllama-bnb-4bit")

# get the peft model
tinyllama = unsloth.FastLanguageModel.get_peft_model(tinyllama)

# save with a revision - important for reproducing the error
tinyllama.push_to_hub("chreh/tmp_model", revision="tinyllama")
# =========
# file2.py
# this file should be run after file1.py
# =========
import unsloth
# errors with the above error
expected_tinyllama, expected_tinytokenizer = unsloth.FastLanguageModel.from_pretrained("chreh/tmp_model", revision="tinyllama")
# =========
# file3.py
# this file should be run after file1.py. 
# It demonstrates that unsloth uploads files that can be downloaded by PeftModel but not by unsloth's FastLanguageModel
# =========
from transformers import AutoModelForCausalLM
from peft import PeftModelForCausalLM


expected_tinyllama = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
expected_tinyllama = PeftModelForCausalLM.from_pretrained(expected_tinyllama, "chreh/tmp_model", revision="tinyllama")

Error source

For a better look at what causes the error, change file1.py to the following:

# =========
# file1.py
# this file should be run before file2.py
# =========
# run the below to initialize our repository
import unsloth
# push a Llama 3 adapter to the default branch of the same repository first
llama, tokenizer = unsloth.FastLanguageModel.from_pretrained("unsloth/llama-3-8b-bnb-4bit")
llama = unsloth.FastLanguageModel.get_peft_model(llama)
llama.push_to_hub("chreh/tmp_model")

tinyllama, tinytokenizer = unsloth.FastLanguageModel.from_pretrained("unsloth/tinyllama-bnb-4bit")

# get the peft model
tinyllama = unsloth.FastLanguageModel.get_peft_model(tinyllama)

# save with a revision - important for reproducing the error
tinyllama.push_to_hub("chreh/tmp_model", revision="tinyllama")

Then, after running file2.py, we get a longer error (a mix of RevisionNotFound errors and OSErrors) ending with:
tinyllama is not a valid git identifier (branch name, tag name or commit id) that exists for this model name. Check the model page at 'https://huggingface.co/unsloth/llama-3-8b-Instruct-bnb-4bit' for available revisions.

This suggests that unsloth is applying the revision parameter when loading the base model rather than when loading the PEFT weights.

It probably stems from the lines below, where neither config lookup receives the revision:

# First check if it's a normal model via AutoConfig
is_peft = False
try:
    model_config = AutoConfig.from_pretrained(model_name, token = token)
    is_peft = False
except:
    try:
        # Most likely a PEFT model
        peft_config = PeftConfig.from_pretrained(model_name, token = token)
    except:
        raise RuntimeError(f"Unsloth: `{model_name}` is not a full model or a PEFT model.")
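
One possible shape for a fix is sketched below. It is not unsloth's actual code: `resolve_revisions`, `load_full_config` and `load_peft_config` are hypothetical names used only for illustration. The idea is to pass the requested revision to both config lookups, and, when the repository turns out to hold a PEFT adapter, to apply the revision to the adapter only and load the base model from its default branch. The loaders are injected so the logic can be exercised without network access. In real use they would presumably be `AutoConfig.from_pretrained` and `PeftConfig.from_pretrained`; that the latter forwards `revision` to the Hub download is an assumption here.

```python
from typing import Callable, Optional, Tuple


def resolve_revisions(
    model_name: str,
    revision: Optional[str],
    load_full_config: Callable[..., object],
    load_peft_config: Callable[..., object],
) -> Tuple[bool, str, Optional[str], Optional[str]]:
    """Return (is_peft, base_model_name, base_revision, adapter_revision).

    Hypothetical helper: both loaders are called as
    loader(model_name, revision=revision) and are expected to raise
    if no matching config exists at that revision.
    """
    # First check whether the repository holds a full model at this revision.
    try:
        load_full_config(model_name, revision=revision)
        return False, model_name, revision, None
    except Exception:
        pass

    # Otherwise it is most likely a PEFT adapter: look it up at the same revision.
    try:
        peft_config = load_peft_config(model_name, revision=revision)
    except Exception as exc:
        raise RuntimeError(
            f"`{model_name}` (revision={revision!r}) is not a full model or a PEFT model."
        ) from exc

    # The revision belongs to the adapter repository, not to the base model,
    # so the base model is resolved without a revision.
    return True, peft_config.base_model_name_or_path, None, revision
```

With this split, the call in file2.py would look up the adapter config on the tinyllama branch of chreh/tmp_model and load the base model named in that config without a revision, which is what file3.py does by hand.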
