Add script to convert T5X T5 (v1.0 and v1.1) checkpoints to PyTorch #20801

Merged
ArthurZucker merged 4 commits into huggingface:main from bastings:t5x_to_pytorch on Dec 23, 2022

Conversation

@bastings bastings commented Dec 16, 2022

Contributor

What does this PR do?

Adds a script that can convert Google T5X (Flax) T5 and T5-v1.1 checkpoints into PyTorch checkpoints.
This allows users to convert non-standard checkpoints that have been trained with T5X and use them with the Transformers library in PyTorch.
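
One step such a conversion generally involves is changing the layout of dense-layer weights: Flax dense layers store their kernels as (in_features, out_features), whereas PyTorch's `nn.Linear` stores weights as (out_features, in_features). The sketch below illustrates only that idea on a toy matrix in plain Python. The function names are hypothetical and are not taken from the script in this PR.

```python
# Illustrative only: these helper names are hypothetical and are not the
# names used by T5X or by the conversion script in this PR.


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def convert_dense_kernel(flax_kernel):
    """Turn an (in_features, out_features) kernel into (out_features, in_features)."""
    return transpose(flax_kernel)


# A toy Flax-style kernel mapping 3 input features to 2 output features.
flax_kernel = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]

torch_weight = convert_dense_kernel(flax_kernel)

# Rows and columns after conversion: 2 output rows, 3 input columns.
print(len(torch_weight), len(torch_weight[0]))
```

A real conversion also has to rename parameters and handle layers that are not plain dense kernels, which this sketch leaves out.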

Usage:

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case. Discussed with @thomwolf.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests? The code is tested but not part of this PR, since the test requires manually downloading the T5X checkpoints from a cloud bucket.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten
@sanchit-gandhi
@ArthurZucker
@younesbelkada

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Dec 16, 2022

The documentation is not available anymore as the PR was closed or merged.

@bastings

Contributor Author

I could use some clarification on the following: I'm missing a configuration option for T5 for the 1.0/original T5 checkpoints to have an lm_head that shares parameters with the token embeddings.

Currently there is T5Model (which returns hidden states) and T5ForConditionalGeneration (which returns logits, used for T5 v1.1 models among others). The latter assumes there is an lm_head layer, but the 1.0 checkpoints have no such layer; they reuse the embedding matrix to map to the vocabulary space.
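
To make the distinction concrete, here is a minimal sketch in plain Python of the two ways a decoder hidden state can be mapped to vocabulary logits: through a separate lm_head matrix, or by reusing the token embedding matrix. All names and numbers are toy assumptions for illustration. It is not the Transformers implementation, and it leaves out everything else a real model does (for example, any rescaling of the hidden state).

```python
def matvec(matrix, vector):
    """Multiply a (rows, cols) list-of-lists matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Toy token embedding matrix: vocabulary of 3 tokens, model dimension 2.
embedding = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]

# Toy separate output projection with the same (vocab, d_model) shape.
lm_head = [
    [0.5, 0.0],
    [0.0, 0.5],
    [1.0, -1.0],
]


def logits_tied(hidden):
    """Reuse the embedding matrix as the output projection (no lm_head)."""
    return matvec(embedding, hidden)


def logits_untied(hidden, head):
    """Use a separate output projection with its own parameters."""
    return matvec(head, hidden)


hidden_state = [2.0, 3.0]
print(logits_tied(hidden_state))
print(logits_untied(hidden_state, lm_head))
```

In the tied case there is no extra parameter matrix to load from the checkpoint, which is why a model class that expects an lm_head needs to be told to share the embedding weights instead.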

@patrickvonplaten patrickvonplaten left a comment

Contributor

Thanks a lot for adding this @bastings

cc @ArthurZucker

@ArthurZucker

Collaborator

Hey @bastings, when there is no lm_head you have to set tie_word_embeddings to True
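
As a rough sketch of how that advice could be applied when preparing a configuration for conversion, the snippet below derives the tie_word_embeddings value from whether the checkpoint carries its own lm_head. Only the tie_word_embeddings key comes from this thread. The helper names and the other config values are hypothetical placeholders.

```python
def head_config_overrides(has_separate_lm_head):
    """Hypothetical helper: choose tie_word_embeddings for a checkpoint."""
    # No separate lm_head means the output projection shares the embeddings.
    return {"tie_word_embeddings": not has_separate_lm_head}


def build_config(base_config, has_separate_lm_head):
    """Return a copy of base_config with the head-related override applied."""
    config = dict(base_config)
    config.update(head_config_overrides(has_separate_lm_head))
    return config


# Placeholder values, used only to give the dictionaries some content.
base = {"d_model": 512, "vocab_size": 32128}

v1_0_config = build_config(base, has_separate_lm_head=False)
v1_1_config = build_config(base, has_separate_lm_head=True)

print(v1_0_config["tie_word_embeddings"])
print(v1_1_config["tie_word_embeddings"])
```

Copying the base dictionary keeps the two variants independent, so the same starting config can be reused for both checkpoint versions.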

@ArthurZucker ArthurZucker left a comment

Collaborator

It's very clean, thanks a lot for the addition.

Comment thread src/transformers/models/t5/convert_t5x_checkpoint_to_pytorch.py Outdated

@sanchit-gandhi sanchit-gandhi left a comment

Contributor

Very cool PR @bastings! Thanks for the addition! Do you have a set of example args I could use just to try the script out once for myself? Thanks! 🙌

Comment thread src/transformers/models/t5/convert_t5x_checkpoint_to_pytorch.py Outdated
Comment thread src/transformers/models/t5/convert_t5x_checkpoint_to_pytorch.py
@bastings bastings force-pushed the t5x_to_pytorch branch 2 times, most recently from ea37c40 to c529472 on December 21, 2022 12:28
@bastings

Contributor Author

I added the instructions to the top docstring. Maybe it's ready? :-)

Comment thread src/transformers/models/t5/convert_t5x_checkpoint_to_pytorch.py Outdated
@ArthurZucker

Collaborator

A last nit and we can merge! Thanks a lot for bearing with me 😄

@bastings

Contributor Author

Thanks! Committed your suggestion :)

Contributor

LGTM! Thanks!

@sanchit-gandhi

Contributor

Once the quality tests are green (requires make fixup) we can merge!

@bastings

Contributor Author

Oh looks like the suggestion made it fail ;)

@ArthurZucker

Collaborator

Ah, sorry then haha, I guess the make style will correct that 😅

@bastings

Contributor Author

> Ah, sorry then haha, I guess the make style will correct that 😅

Fixed! :)

@ArthurZucker ArthurZucker merged commit efed8a2 into huggingface:main Dec 23, 2022
MKhalusova pushed a commit to MKhalusova/transformers that referenced this pull request Dec 28, 2022
…uggingface#20801)

* Add script to convert T5X T5 (v1.0 and v1.1) checkpoints to PyTorch

* Remove unnecessary check and update docstring

* Format docstring

* Fix whitespace in docstring
silverriver pushed a commit to silverriver/transformers that referenced this pull request Jan 6, 2023

5 participants