
[Flax] Add other BERT classes#10977

Merged
patrickvonplaten merged 5 commits into huggingface:master from patrickvonplaten:add_other_bert_classes
Mar 31, 2021


@patrickvonplaten patrickvonplaten commented Mar 30, 2021

What does this PR do?

This PR adds the other BERT model classes for Flax.

Also the following checkpoints have been uploaded for Flax:

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.


@LysandreJik LysandreJik left a comment


Cool, LGTM! Great work implementing all of these. Will RoBERTa follow?

It is so similar to the PyTorch implementation it seems a script could take care of the implementation by copying the PyTorch one and replacing a few strings!


@sgugger sgugger left a comment


Looks great! I don't know what happened to your formatter but we can definitely have most of submodule definitions fit on one line ;-)

@patrickvonplaten patrickvonplaten merged commit e87505f into huggingface:master Mar 31, 2021

avital commented Mar 31, 2021

> It is so similar to the PyTorch implementation it seems a script could take care of the implementation by copying the PyTorch one and replacing a few strings!

@marcvanzee and I were also wondering about this in general -- is there an 80/20 solution that requires user input in some cases? It would have to not introduce silent errors (e.g. a model that seems to run the same but differs in some hard-to-find way).
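
To make the "80/20, ask the user when unsure" idea concrete, here is a rough sketch of what such a script could look like. It is not part of this PR and not an existing tool: the rewrite rules, the `Report` type, and the `convert` function are all hypothetical names made up for illustration, and the rule list is nowhere near complete. The point is only the shape: apply a small set of rewrites that are believed safe, and flag every line that still contains PyTorch-specific code instead of guessing, so nothing is converted silently.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rewrites assumed safe for illustration only; a real tool
# would need a reviewed and much longer list.
SAFE_RULES = [
    (re.compile(r"\bclass Bert(\w+)"), r"class FlaxBert\1"),  # class name prefix
    (re.compile(r"\btorch\.tanh\b"), "jnp.tanh"),             # same elementwise function
]

# Anything still matching these after the safe rewrites has no mechanical
# translation in this sketch and is handed back to the user.
NEEDS_REVIEW = [
    re.compile(r"\btorch\."),        # any remaining torch call
    re.compile(r"\bnn\.Dropout\b"),  # dropout handling differs between the frameworks
]


@dataclass
class Report:
    converted: list = field(default_factory=list)  # output lines
    flagged: list = field(default_factory=list)    # (line number, line) pairs


def convert(source: str) -> Report:
    """Apply the safe rewrites and flag every line that needs a human."""
    report = Report()
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, replacement in SAFE_RULES:
            line = pattern.sub(replacement, line)
        if any(pattern.search(line) for pattern in NEEDS_REVIEW):
            report.flagged.append((lineno, line))
            line += "  # TODO(port): needs manual review"
        report.converted.append(line)
    return report
```

A non-empty `flagged` list is where user input would be required. String rewriting alone cannot rule out the silent-error case raised above, so a ported model would presumably still need to be compared numerically against the PyTorch outputs on the same inputs.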

Iwontbecreative pushed a commit to Iwontbecreative/transformers that referenced this pull request Jul 15, 2021
* add first code structures

* add all bert models

* add to init and docs

* correct docs

* make style