[docs] Model merging #1423
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi, is there an estimated timeline for when this will be merged?
cb44c70 to 9a84309
Hi, it should be ready in the next few days, and by the end of the week at the latest if there are no major issues!
9a84309 to bde2219
| adapters = ["norobots", "adcopy", "sql"] | ||
| weights = [2.0, 0.3, 0.7] | ||
| adapter_name = "merge" | ||
| density = 0.2 |
I am not sure how this density parameter works in dare_ties. I assume it is used to keep 20% of the parameters and then rescale, as in DARE. However, if TIES is applied on top of that, it is unclear whether it again keeps only 20% of the already pruned and rescaled checkpoint, leaving roughly 0.2 * 0.2 * 100 = 4% of the parameters, or whether 20% of the parameters are kept overall. This is not a comment on the documentation, but the behaviour is not very clear.
Hello, in dare_ties, random pruning based on density happens first, followed by rescaling. After that, the majority_sign_mask and disjoint_merge steps are performed as in the ties method. So the pruning is taken from dare (random, followed by rescaling), and the majority sign mask and disjoint merge come from ties; there is no second density-based pruning step.
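For intuition, here is a rough, illustrative sketch of that order of operations on a stack of task vectors. The function name, shapes, and the exact averaging in the disjoint merge are assumptions for illustration, not the PEFT implementation:

```python
import torch

def dare_ties_sketch(deltas, weights, density):
    # DARE step: randomly keep a `density` fraction of each task vector, then rescale by 1/density
    pruned = []
    for delta, weight in zip(deltas, weights):
        keep_mask = torch.bernoulli(torch.full_like(delta, density))
        pruned.append(weight * delta * keep_mask / density)
    stacked = torch.stack(pruned)

    # TIES step: compute the elementwise majority sign across adapters ...
    majority_sign = stacked.sum(dim=0).sign()
    agrees = stacked.sign() == majority_sign
    # ... then disjoint merge: average only the entries that agree with the majority sign
    counts = agrees.sum(dim=0).clamp(min=1)
    return (stacked * agrees).sum(dim=0) / counts
```

Note that only the DARE step drops parameters according to density; the TIES-style sign mask can zero out further entries where adapters disagree, but it does not apply a second density-based pruning.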
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

config = PeftConfig.from_pretrained("smangrul/tinyllama_lora_norobots")
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("smangrul/tinyllama_lora_norobots")

model = PeftModel.from_pretrained(model, "smangrul/tinyllama_lora_norobots", adapter_name="norobots")
_ = model.load_adapter("smangrul/tinyllama_lora_sql", adapter_name="sql")
_ = model.load_adapter("smangrul/tinyllama_lora_adcopy", adapter_name="adcopy")
@pacman100 @BenjaminBossan does it make sense to have copies of these checkpoints under the PEFT testing org?
Yes, let's try to put those artifacts on https://huggingface.co/peft-internal-testing.
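For context, after loading the adapters as in the snippet above, the guide merges them with add_weighted_adapter using the parameters from the first quoted snippet. A minimal sketch; the combination_type value here is just one of the supported methods, chosen for illustration:

```python
adapters = ["norobots", "adcopy", "sql"]
weights = [2.0, 0.3, 0.7]
adapter_name = "merge"
density = 0.2

# "dare_ties" is used here for illustration; the guide covers the other combination types as well
model.add_weighted_adapter(adapters, weights, adapter_name, combination_type="dare_ties", density=density)
model.set_adapter("merge")
```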
BenjaminBossan left a comment
Nicely done. On top of what has already been mentioned, I just have one comment about there actually being more than 2 methods. Otherwise, this LGTM.
* content
* code snippets
* api reference
* update
* feedback
* feedback
A guide to new model merging methods introduced in #1364.
todo:
* build_pr_documentation test (it should pass)