[core] PEFT integration#24827

Closed
younesbelkada wants to merge 23 commits into huggingface:main from younesbelkada:peft-integration

Conversation

@younesbelkada
Contributor

@younesbelkada younesbelkada commented Jul 14, 2023

What does this PR do?

This PR is an attempt to tightly integrate the PEFT library with transformers, by offering users the ability to load PEFT models out of the box from AutoModelForXxx.from_pretrained() when the local directory or the Hub model id contains adapter weights and an adapter config.

import tempfile
from transformers import AutoModelForCausalLM
from peft import AutoPeftModelForCausalLM

peft_model_id = "ybelkada/opt-350m-lora"

# New in this PR: transformers detects the adapter files and loads them directly.
model = AutoModelForCausalLM.from_pretrained(peft_model_id)

with tempfile.TemporaryDirectory() as tmpdirname:
    # Round-tripping through a local directory also works.
    peft_model = AutoPeftModelForCausalLM.from_pretrained(peft_model_id)
    peft_model.save_pretrained(tmpdirname)

    model = AutoModelForCausalLM.from_pretrained(tmpdirname)
    print(model)

Although this is similar to what has been introduced in huggingface/peft#694, this PR offers a direct integration with transformers.

TODOs:

  • handle PeftModel.from_pretrained(xxx) kwargs
  • tests
  • docs (with the help of @stevhliu)

cc @sgugger @pacman100 @BenjaminBossan

@younesbelkada
Contributor Author

younesbelkada commented Jul 14, 2023

I would like a first review of the draft if possible, to see if we are in line with the approach 🙏 @sgugger - Thanks!

Collaborator

@sgugger sgugger left a comment


Thanks for working on this, left some initial comments!


is_adapter_file = False
if is_peft_available():
    is_adapter_file = peft_adapter_model_id is not None
Collaborator


Maybe raise an error if someone wants to load a model with an adapter file and peft is not installed? I don't know how common that case is, but it seems to me that the user would want to use the adapter file if it's there, no?
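The check suggested here could be sketched as follows; all names are illustrative, not the actual implementation in the PR:

```python
def resolve_adapter_loading(peft_adapter_model_id, peft_installed):
    """Return True if adapter weights should be loaded via PEFT.

    Fails loudly when adapter files are present but peft is missing,
    as suggested in the review comment above.
    """
    if peft_adapter_model_id is None:
        # No adapter files found: fall back to a plain transformers load.
        return False
    if not peft_installed:
        raise ImportError(
            "Found adapter weights but PEFT is not installed. "
            "Install it with `pip install peft` to load this checkpoint."
        )
    return True
```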

Contributor Author


I think this is a great idea, and I am keen to add that condition as well. However, I am worried about one thing: this means we would need to look for the adapter_config.json file in the local or remote directory in any case, and I want to avoid that and restrict the check to users who have PEFT installed. What do you think? If you think that this is fine, I'm happy to add it.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Contributor

@pacman100 pacman100 left a comment


Thank you @younesbelkada for integrating PEFT deeply into transformers 🚀! A concern I have is that there are now several ways of loading PEFT models, i.e. PeftModel.from_pretrained(model, peft_model_id), AutoPeftModelForCausalLM.from_pretrained(peft_model_id), and now AutoModelForCausalLM.from_pretrained(peft_model_id); this may lead to confusion among end users. We need to clearly document the preferred/recommended way. This also needs to be reflected in the inference widgets. WDYT?

@younesbelkada
Contributor Author

Thank you for your review @pacman100!
I think the canonical way to load PEFT models for inference would still be the PEFT classes (i.e. either AutoPeftModelForCausalLM or PeftModel - by the way, we should encourage users to prefer AutoPeftModelForCausalLM over PeftModel).
This PR is intended to make things even easier for users and for further integrations with the HF ecosystem (pipeline, diffusers), and it will be clearly documented. I also think we should update the inference widgets after the PEFT release.

@younesbelkada younesbelkada requested a review from sgugger July 17, 2023 09:37
Collaborator

@sgugger sgugger left a comment


Left a couple more comments!


@require_peft
@require_torch
@slow
Collaborator


Can we also add one non-slow test with a small fake model?
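The non-slow test asked for here would follow the usual unittest shape used in the transformers test suite; the sketch below uses a stand-in model object so it runs without downloads, and everything in it is illustrative rather than the actual test added to the PR:

```python
import unittest


class FakeAdapterModel:
    """Stand-in for a tiny model that has adapter weights attached."""

    def __init__(self):
        self.active_adapters = ["default"]


class PeftIntegrationFastTest(unittest.TestCase):
    def test_adapter_is_loaded(self):
        # A fast check that adapter metadata is present after loading,
        # without touching the Hub or a real checkpoint.
        model = FakeAdapterModel()
        self.assertIn("default", model.active_adapters)


suite = unittest.TestLoader().loadTestsFromTestCase(PeftIntegrationFastTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```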

Contributor Author


Should be done, also added a new job to run peft tests thanks to @ydshieh

Collaborator

@sgugger sgugger left a comment


Mmm that's not the correct way to add a new peft job as it will always run on any PR, even if it shouldn't be run.

@ydshieh
Collaborator

ydshieh commented Jul 19, 2023

Mmm that's not the correct way to add a new peft job as it will always run on any PR, even if it shouldn't be run.

@younesbelkada

You can take a look at:

examples_tests_to_run = [f for f in test_files_to_run if f.startswith("examples")]

(check examples_test_list.txt and examples_tests_to_run in the same file, and the two CircleCI config files; in your case you would check against peft_integration)
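The filtering pattern being pointed at could look like the sketch below, mirroring the `examples` filter for the PEFT integration tests. The file names are illustrative; in the real setup the list comes from the tests_fetcher utility:

```python
# Illustrative list of collected test files (in the real CI this comes
# from the tests_fetcher utility, not a hard-coded list).
test_files_to_run = [
    "tests/models/opt/test_modeling_opt.py",
    "tests/peft_integration/test_peft_integration.py",
    "examples/pytorch/test_pytorch_examples.py",
]

# Existing filter for example tests, as referenced above.
examples_tests_to_run = [f for f in test_files_to_run if f.startswith("examples")]

# Analogous filter for the new PEFT job: check against `peft_integration`.
peft_tests_to_run = [f for f in test_files_to_run if "peft_integration" in f]
```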

@patrickvonplaten
Contributor

patrickvonplaten commented Jul 19, 2023

The design as is would make it hard for diffusers to leverage transformers to load PEFT weights for transformers models.
In diffusers we have the following workflow:

from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# now we have a loaded CLIPTextModel under `pipe.text_encoder`

pipe.load_lora(...)
# doing this means we would want to call `text_encoder.load_adapter(...)` under the hood

This means we necessarily need transformers to support loading LoRA weights into already instantiated models, just like MMS allows via load_adapter - see: https://huggingface.co/docs/transformers/v4.31.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForCTC.load_adapter.example

[Edit] We could work around it by wrapping pipe.text_encoder into a PeftModel under the hood when doing pipe.load_lora(...), turning the CLIPTextModel into a PeftModel, but that:

  • would break some internal diffusers code
  • would force us to wrap logic around transformers; e.g. we could just do pipe.text_encoder.load_adapter(...), but would have to first wrap every transformers model into a PEFT model

@patrickvonplaten
Contributor

patrickvonplaten commented Jul 19, 2023

From purely a transformers point of view, I would also struggle a bit with the following:

1.) PEFT weights seemingly should only be loaded with AutoModel, which is restrictive, as there is no need to go through the AutoModel class if one knows the model class.

It does look like the following would be possible:

from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("tloen/alpaca-lora-7b")

but then model.__class__ would be PeftModel, which would be confusing to me - I used a class method of LlamaForCausalLM.

2.) I don't like that .from_pretrained(<peft/model/id>) more or less fully dispatches to the peft library instead of staying in Transformers' land. I imagined peft being used as a utility library, not Transformers dispatching to peft.

=> Could we not create a PeftModelMixin class so that peft operates more under the hood?
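The mixin idea floated here could be sketched roughly as below: the model keeps its own class and gains a `load_adapter` method, with PEFT acting under the hood. Everything in this sketch is illustrative (a stand-in model and a no-op adapter registry), not the real API:

```python
class PeftAdapterMixin:
    """Adds adapter loading without changing the model's class."""

    def load_adapter(self, adapter_state_dict, adapter_name="default"):
        # The real integration would inject LoRA layers via peft here;
        # this sketch only records which adapters were attached.
        if not hasattr(self, "_adapters"):
            self._adapters = {}
        self._adapters[adapter_name] = adapter_state_dict
        return self


class TextEncoder(PeftAdapterMixin):
    """Stand-in for a transformers model such as CLIPTextModel."""


enc = TextEncoder()
enc.load_adapter({"lora_A": [0.1]}, adapter_name="style")
```

The key property, relative to wrapping in PeftModel, is that `enc` remains a TextEncoder instance after loading the adapter, so downstream code relying on the original class keeps working.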

@younesbelkada
Contributor Author

Closing as #25077 got merged
