Add support for loading checkpoints with newly added tokens. #272
Closed
charlesCXK wants to merge 1 commit into unslothai:nightly from
Conversation
Contributor
Wait, would this load the lm_head and embed_tokens matrices correctly?
Contributor
Would it not cause them to be randomly initialized?
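To illustrate the concern: when a weight matrix is grown to a larger vocabulary, the appended rows keep whatever values the fresh allocation was seeded with unless something explicitly overwrites them. A minimal plain-Python sketch of that behavior (toy names and sizes, not unsloth's internals):

```python
import random

random.seed(0)

# Toy "trained" embedding table: 4 tokens, hidden dim 3, all weights 1.0.
trained = [[1.0] * 3 for _ in range(4)]

# Growing the vocabulary by 2 tokens: the new rows come from a fresh
# (here, random) initialization unless they are explicitly copied or set.
new_rows = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(2)]
grown = trained + new_rows

print(grown[:4] == trained)  # True: the old rows are preserved
print(all(v == 1.0 for row in grown[4:] for v in row))  # False: new rows are random
```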
Author
I have tested the code using such a setting:

```python
import os
import shutil

from unsloth import FastLanguageModel

########################################
# Add special tokens to the tokenizer.
########################################
if True:
    old_vocab_size = tokenizer.vocab_size
    print('old vocab size: ', old_vocab_size)
    tokenizer.add_tokens("<NEWTOKEN>", special_tokens=True)
    tokenizer.add_tokens("</NEWTOKEN>", special_tokens=True)
    # Test case
    print(tokenizer.tokenize("This is an example with <NEWTOKEN> and </NEWTOKEN> token."))
    # We resize the embeddings to avoid index errors.
    model.resize_token_embeddings(len(tokenizer))
    model.config.vocab_size = len(tokenizer)
    # Average-init the new token embeddings.
    num_new_tokens = len(tokenizer) - old_vocab_size
    print("num_new_tokens:", num_new_tokens)
    input_embeddings = model.get_input_embeddings().weight.data
    output_embeddings = model.get_output_embeddings().weight.data
    input_embeddings_avg = input_embeddings[:-num_new_tokens].mean(dim=0, keepdim=True)
    output_embeddings_avg = output_embeddings[:-num_new_tokens].mean(dim=0, keepdim=True)
    input_embeddings[-num_new_tokens:] = input_embeddings_avg
    output_embeddings[-num_new_tokens:] = output_embeddings_avg
    # Unfreeze the lm_head and input embeddings.
    model.lm_head.weight.requires_grad = True
    model.get_input_embeddings().weight.requires_grad = True

save_path = "/home/xxx"
if os.path.exists(save_path):
    shutil.rmtree(save_path)
model.save_pretrained(save_path)

print('Use saved model for inference.')
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = save_path,  # YOUR MODEL YOU USED FOR TRAINING
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    new_token_num = 0,  # argument added in this PR
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

inputs = tokenizer(
    [
        "Continue the Fibonacci sequence. 1, 1, 2, 3, 5, 8"
    ], return_tensors = "pt").to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
```
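As a sanity check, the average-initialization step above can be reproduced in isolation. This is a plain-Python sketch of the same arithmetic (toy numbers, not tied to any real checkpoint):

```python
# Toy embedding table: 4 trained token rows, hidden dim 3.
embeddings = [
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0],
]
num_new_tokens = 2

# Column-wise mean over the existing rows, mirroring
# input_embeddings[:-num_new_tokens].mean(dim=0, keepdim=True).
dim = len(embeddings[0])
avg = [sum(row[j] for row in embeddings) / len(embeddings) for j in range(dim)]

# Append the new token rows, each set to that mean, mirroring
# input_embeddings[-num_new_tokens:] = input_embeddings_avg.
embeddings.extend([list(avg) for _ in range(num_new_tokens)])

print(avg)              # [4.5, 5.5, 6.5]
print(len(embeddings))  # 6
```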
Hi @charlesCXK, when using this code, I noticed that the loaded model doesn't include the new token that I added before fine-tuning. Do you have to add the new token again for inference? For example,
Contributor
Whoopsies, sorry on the horrible delay - I'll review this PR and test it out - so sorry!
Contributor
@charlesCXK @chtmp223 Extreme apologies on the delay - I think I might have fixed it. You need to call:

```python
from unsloth import add_new_tokens
from unsloth import FastLanguageModel

add_new_tokens(model, tokenizer, ["new_token_1", "new_token_2"])
model = FastLanguageModel.get_peft_model(model, ...)
```