[docs] Model merging #1423

Merged

stevhliu merged 6 commits into huggingface:main from stevhliu:model-merging on Feb 15, 2024
Conversation

@stevhliu (Member) commented Jan 31, 2024

A guide to the new model merging methods introduced in #1364.

todo:

  • add API reference for merging utilities (once the other PR is merged, I'll rerun the build_pr_documentation test and it should pass)
  • test and run code examples

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment thread docs/source/developer_guides/model_merging.md Outdated
@pacman100 pacman100 mentioned this pull request Feb 1, 2024
3 tasks
Comment thread docs/source/developer_guides/model_merging.md
Comment thread docs/source/developer_guides/model_merging.md
@prateeky2806 (Contributor)

Hi, is there an estimated timeline for when this will be merged?

@stevhliu (Member, Author)

> Hi, is there an estimated timeline for when this will be merged?

Hi, it should be ready in the next few days, or by the end of the week at the latest if there are no major issues!

@stevhliu stevhliu marked this pull request as ready for review February 12, 2024 18:23
Comment thread docs/source/developer_guides/model_merging.md Outdated
Comment thread docs/source/developer_guides/model_merging.md Outdated
```python
adapters = ["norobots", "adcopy", "sql"]
weights = [2.0, 0.3, 0.7]
adapter_name = "merge"
density = 0.2
```
Contributor
I am not sure how this density parameter works in dare_ties. I assume it is used to keep 20% of the parameters and then rescale, as in DARE. However, if TIES is then applied on top of that, it is unclear whether it again keeps only 20% of the pruned and rescaled checkpoint, leaving 0.2 * 0.2 * 100 = 4% of the parameters, or whether 20% of the parameters are kept overall. This is not a comment on the documentation, but this behaviour is not very clear.

Contributor

Hello, in dare_ties, random pruning based on density happens first, followed by rescaling. After this, the majority_sign_mask and disjoint_merge steps are performed as in the ties method. So the pruning comes from DARE (random, then rescaled), followed by the majority sign election and disjoint merge from TIES.
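To make the order of operations concrete, here is a minimal NumPy sketch of the dare_ties combination as described above. This is an illustration only, not PEFT's actual implementation; the function name, rescaling detail, and tie-breaking behaviour are assumptions based on this thread.

```python
import numpy as np

def dare_ties_merge(task_vectors, weights, density, rng=None):
    """Illustrative sketch of the dare_ties combination described above.

    Step 1 (DARE): randomly keep a `density` fraction of each task vector
    and rescale by 1/density -- pruning happens only once.
    Step 2 (TIES): elect a majority sign per parameter, then disjoint-merge,
    i.e. average only the kept values that agree with the elected sign.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    pruned = []
    for tv, w in zip(task_vectors, weights):
        keep = rng.random(tv.shape) < density   # random pruning, not magnitude-based
        pruned.append(w * tv * keep / density)  # rescale to preserve the expectation
    stacked = np.stack(pruned)
    majority_sign = np.sign(stacked.sum(axis=0))  # per-parameter sign election
    agree = (np.sign(stacked) == majority_sign) & (stacked != 0)
    counts = agree.sum(axis=0)
    # disjoint merge: average only entries that survived pruning and match the sign
    return np.where(counts > 0, (stacked * agree).sum(axis=0) / np.maximum(counts, 1), 0.0)
```

Note that pruning is applied once per task vector before the TIES steps, so with `density = 0.2` roughly 20% of each adapter's parameters survive, not 0.2 * 0.2 = 4%.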

Comment thread docs/source/developer_guides/model_merging.md Outdated
Comment thread docs/source/developer_guides/model_merging.md Outdated
Comment thread docs/source/developer_guides/model_merging.md Outdated
Comment thread docs/source/developer_guides/model_merging.md Outdated
Comment on lines +49 to +55
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

config = PeftConfig.from_pretrained("smangrul/tinyllama_lora_norobots")
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("smangrul/tinyllama_lora_norobots")

# load the first adapter, then attach the other two under distinct names
model = PeftModel.from_pretrained(model, "smangrul/tinyllama_lora_norobots", adapter_name="norobots")
_ = model.load_adapter("smangrul/tinyllama_lora_sql", adapter_name="sql")
_ = model.load_adapter("smangrul/tinyllama_lora_adcopy", adapter_name="adcopy")
```
Member
@pacman100 @BenjaminBossan makes sense to have copies of these checkpoints under the PEFT testing org?

Member
Yes, let's try to put those artifacts on https://huggingface.co/peft-internal-testing.

Comment thread docs/source/developer_guides/model_merging.md Outdated
Comment thread docs/source/package_reference/merge_utils.md Outdated
Member

@sayakpaul sayakpaul left a comment
Looking solid!

Member

@BenjaminBossan BenjaminBossan left a comment
Nicely done. On top of what has already been mentioned, I just have one comment about there actually being more than two methods. Otherwise, this LGTM.

Comment thread docs/source/developer_guides/model_merging.md Outdated
Comment thread docs/source/developer_guides/model_merging.md Outdated
@stevhliu stevhliu merged commit cde8f1a into huggingface:main Feb 15, 2024
@stevhliu stevhliu deleted the model-merging branch February 15, 2024 16:13
BenjaminBossan pushed a commit to BenjaminBossan/peft that referenced this pull request Mar 14, 2024
* content

* code snippets

* api reference

* update

* feedback

* feedback
Guy-Bilitski pushed a commit to Guy-Bilitski/peft that referenced this pull request May 13, 2025
* content

* code snippets

* api reference

* update

* feedback

* feedback
6 participants