
Support more text models and tasks#36

Merged
michaelbenayoun merged 2 commits into huggingface:main from guangy10:text_models
Mar 25, 2025

Conversation

@guangy10 (Collaborator) commented Mar 19, 2025

Onboarded more popular text models.

Added 7 new models based on the current trends on HF Hub for different tasks:

  • encoder-only architecture: bert-family models from Transformers and Hub
  • decoder-only architecture: smollm from Hub
  • encoder-decoder architecture: T5 from Transformers

Added 2 new tasks:

  • seq2seq (e.g. translation, summarization, etc.) text generation
  • mask prediction
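For context, the new seq2seq text generation task boils down to a greedy encoder-decoder decode loop: initialize the decoder with the start token, feed generated tokens back one at a time, and stop at EOS. A minimal runnable sketch of that loop, where fake_logits is a self-contained stand-in for a real exported model's forward pass (not the actual T5):

```python
# Greedy seq2seq decode loop, sketched with a stubbed forward pass so the
# example runs without loading a real checkpoint.
START_TOKEN = 0   # e.g. T5's decoder start token id
EOS_TOKEN = 1

def fake_logits(encoder_ids, decoder_ids):
    """Stand-in for an encoder-decoder forward pass: predicts the encoder
    token at the current decode position, then EOS."""
    pos = len(decoder_ids) - 1
    return encoder_ids[pos] if pos < len(encoder_ids) else EOS_TOKEN

def greedy_decode(encoder_ids, max_new_tokens=10):
    decoder_ids = [START_TOKEN]          # initialize with the start token
    for _ in range(max_new_tokens):
        next_token = fake_logits(encoder_ids, decoder_ids)
        decoder_ids.append(next_token)   # feed the prediction back in
        if next_token == EOS_TOKEN:
            break
    return decoder_ids[1:]               # drop the start token

print(greedy_decode([5, 6, 7]))          # -> [5, 6, 7, 1]
```

The real test drives an exported ExecuTorch program instead of fake_logits, but the control flow is the same.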

Improved setup:

  • Updated installation to pip install .[tests] to install the deps required for running models
  • Updated supported model list

Improved CI:

  • Consolidated CI jobs: one test per model, i.e. test_modeling_<my_model>.py, mirroring the model and test setup in Transformers. This will help with model code patching/rewriting and testing
  • Added a CI matrix to test against both the released executorch from PyPI and the nightly version
  • Pinned to a specific transformers version for better stability
  • Made CI jobs easier to read and debug

Integration points in optimum-executorch:

It's a pain that we have to wait for changes to be merged upstream and included in a new Transformers release; the cycle is long.
In this PR, I'm proposing to add the Transformers integration points to optimum-executorch as well. This will immediately unblock adding new models that don't require modeling code changes in Transformers (true for most cases). Of course, we will also continue landing the integration points in Transformers to keep improving the export coverage.

A typical dev workflow will look like this:

  1. Add new export wrapper module to both optimum-executorch and Transformers
  2. New models in optimum-executorch can be enabled rapidly w/o waiting for the wrapper in Transformers
  3. Once the export wrapper module is available in Transformers, switch to using the one from Transformers; the duplicate wrapper in optimum-executorch can then be removed
  4. The export wrapper module in Transformers will ensure no regression on exportability for supported models, reducing breakage when bumping the Transformers version in optimum-executorch
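Step 3 of the workflow above is essentially an import-preference switch. A hedged sketch of what that could look like; both import paths here are illustrative assumptions, not the actual module paths in either repo:

```python
# Prefer the upstream wrapper once the pinned transformers release ships it;
# until then, fall back to the interim copy carried in optimum-executorch.
# Both paths below are illustrative, not the real ones.
try:
    from transformers.integrations.executorch import Seq2SeqLMExportableModule
    WRAPPER_SOURCE = "transformers"
except ImportError:
    # Stand-in for the local copy, e.g. something like:
    # from optimum.executorch... import Seq2SeqLMExportableModule
    Seq2SeqLMExportableModule = None
    WRAPPER_SOURCE = "optimum-executorch"

print(WRAPPER_SOURCE)
```

Because the fallback shadows the upstream import, callers never need to know which copy they got, which is what makes step 3 a no-op for downstream code.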

A specific example:

  1. I'm adding the Seq2SeqLMExportableModule module to optimum-executorch in this PR. In parallel, the same module is pending review in transformers: Export T5 (encoder-decoder) to ExecuTorch transformers#36486
  2. We can safely merge this PR with the new models onboarded
  3. Once #36486 is picked up in a new release, we can bump the transformers version in optimum-executorch and import the one from transformers as the source of truth, same as the wrapper for causal LM

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@guangy10 guangy10 marked this pull request as ready for review March 19, 2025 22:25
@guangy10 (Collaborator, Author):

/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

The failure doesn't seem to be relevant and EuroBERT can actually work with executorch==0.4.0 as shown on CI from a different run here: https://github.com/huggingface/optimum-executorch/actions/runs/13954658002/job/39062678395.

@guangy10 guangy10 force-pushed the text_models branch 3 times, most recently from addf054 to f78a118 Compare March 20, 2025 03:00
@guangy10 guangy10 self-assigned this Mar 20, 2025
@guangy10 guangy10 force-pushed the text_models branch 5 times, most recently from 6403874 to d6cefba Compare March 22, 2025 00:31
@guangy10 (Collaborator, Author):

Okay, all CIs are green now.

    input_ids=torch.tensor([prompt_token], dtype=torch.long, device=self.device).unsqueeze(0),
    cache_position=torch.tensor([i], dtype=torch.long, device=self.device),

    # Initialize with start token (0 for T5)
    decoder_input_ids = torch.tensor([[0]], dtype=torch.long)
@michaelbenayoun (Member):

You can use self.config.decoder_start_token_id, I think, to know what the start token id is.
And if you have access to the tokenizer it might be even easier.
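The suggested change would look roughly like the sketch below; SimpleNamespace stands in for the real model config so the snippet runs without loading a T5 checkpoint, and a plain list replaces the torch tensor:

```python
# Read the decoder start id from the config instead of hardcoding 0.
from types import SimpleNamespace

config = SimpleNamespace(decoder_start_token_id=0)  # T5's actual value is 0

start_id = config.decoder_start_token_id
# was: decoder_input_ids = torch.tensor([[0]], dtype=torch.long)
decoder_input_ids = [[start_id]]  # plain list here; real code would use torch
print(decoder_input_ids)  # -> [[0]]
```

Reading the id from the config keeps the loop correct for encoder-decoder models whose start token is not 0.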

@guangy10 (Collaborator, Author):

@michaelbenayoun I think I can follow up in a separate PR. There is an additional fix I'm working on to make it compatible with the latest version of transformers. I will test and squeeze this improvement into the upcoming PR.

@michaelbenayoun (Member):

Alright, sounds good to me.

@michaelbenayoun (Member):

I just left one comment, otherwise everything looks great!

@michaelbenayoun michaelbenayoun merged commit 1907349 into huggingface:main Mar 25, 2025
68 checks passed
@guangy10 guangy10 mentioned this pull request Mar 25, 2025