Support more text models and tasks #36
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
The failure doesn't seem to be relevant.
Force-pushed addf054 to f78a118
Force-pushed 6403874 to d6cefba
Okay, all CIs are green now
```python
input_ids=torch.tensor([prompt_token], dtype=torch.long, device=self.device).unsqueeze(0),
cache_position=torch.tensor([i], dtype=torch.long, device=self.device),
```

```python
# Initialize with start token (0 for T5)
decoder_input_ids = torch.tensor([[0]], dtype=torch.long)
```
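The `input_ids`/`cache_position` lines above come from a token-by-token decode loop. A minimal pure-Python sketch of that pattern (the toy `step` function is a hypothetical stand-in for the exported decoder's forward pass; token values are made up):

```python
# Sketch of a greedy decode loop: feed one token at a time together with its
# cache position, mirroring input_ids=[[prompt_token]] and cache_position=[i].
def greedy_decode(step, start_token, eos_token, max_len=8):
    tokens = [start_token]
    for i in range(max_len):
        # `step` stands in for the exported decoder forward pass
        next_token = step(tokens[-1], i)
        tokens.append(next_token)
        if next_token == eos_token:
            break
    return tokens

# Toy "model": emits token + 1 until it reaches 3 (treated as EOS here)
assert greedy_decode(lambda tok, pos: tok + 1, 0, 3) == [0, 1, 2, 3]
```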
You can use `self.config.decoder_start_token_id`, I think, to know what the start token id is.
And if you have access to the tokenizer it might be even easier.
@michaelbenayoun I think I can follow up in a separate PR. There is an additional fix I'm working on to make it compatible with the latest version of transformers. I will test and squeeze this improvement into the upcoming PR.
Alright sounds good to me.
|
I just left one comment, otherwise everything looks great!
Onboarded more popular text models.
Added 7 new models based on the current trends on the HF Hub for different tasks:

- `encoder-only` architecture: bert-family models from Transformers and Hub
- `decoder-only` architecture: smollm from Hub
- `encoder-decoder` architecture: T5 from Transformers

Added 2 new tasks:
Improved setup:
- `pip install .[tests]` to install deps required by running models

Improved CI:
- `test_modeling_<my_model>.py`, which mirrors the models and test setup in Transformers. This will help with model code patching/rewriting and testing.

Integration points in optimum-executorch:
It's a pain that we have to wait for changes to be merged and included in a new release of `Transformers` upstream; the cycle is long. In this PR, I'm proposing to add the `Transformers` integration points to `optimum-executorch` as well. This will immediately unblock adding new models that don't require modeling code changes in `Transformers` (true for most cases). Of course, we will also continue landing the integration points in `Transformers` to keep improving the export coverage.

A typical dev workflow will look like this:

- Add the integration point (wrapper) to both `optimum-executorch` and `Transformers`
- New models in `optimum-executorch` can be enabled rapidly w/o waiting for the wrapper in `Transformers`
- Once the wrapper is available in `Transformers`, switch to use the one from `Transformers`; the same wrapper in `optimum-executorch` can then be removed
- Tests in `Transformers` will ensure no regression on exportability for supported models, reducing the breakage when bumping up `Transformers` versions in `optimum-executorch`

A specific example:
- Adding the `Seq2SeqLMExportableModule` module to `optimum-executorch` in this PR. In parallel, the same module is pending review in `transformers`: Export T5 (encoder-decoder) to ExecuTorch (transformers#36486)
- Once it's merged and released, bump up the `transformers` version in `optimum-executorch` and import the one from `transformers` as the source of truth, same as the wrapper for causal LM
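The "source of truth" switch described above can be sketched as a generic import-fallback helper. This is an illustrative pattern, not any real API; the demo uses stdlib module names so it is runnable, but in practice the first candidate would be the wrapper path in `transformers` and the second the local copy in `optimum-executorch`:

```python
import importlib

def resolve(candidates):
    """Return the first importable attribute from an ordered list of
    (module_name, attr) pairs: prefer the upstream copy, fall back to
    the local one."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            continue
    raise ImportError(f"none of {candidates!r} could be resolved")

# Runnable demo with stdlib names standing in for the real wrapper paths:
OrderedDict = resolve([
    ("nonexistent_upstream_module", "OrderedDict"),  # missing -> falls through
    ("collections", "OrderedDict"),                  # fallback succeeds
])
```

Once the upstream wrapper ships, the first candidate starts resolving and the local fallback becomes dead code that can be deleted.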