ConvBERT Model #9717
Merged
LysandreJik merged 52 commits into master from Jan 27, 2021
stefan-it
reviewed
Jan 21, 2021
Contributor
Looks awesome! Great work! I've added mostly nits. Two things I'd like to change before merging though:
- In the PyTorch modeling file, we pass the `attention_mask` to mask some tokens. In the case of cross-attention, we pass both `encoder_attention_mask` and `attention_mask` and then just set `attention_mask` to `encoder_attention_mask` because we don't need `attention_mask` in this case. This means that only one attention mask is ever actually used. Therefore I think it's better to not have an `encoder_attention_mask` function argument at all and to pass `attention_mask=encoder_attention_mask` directly. It is cleaner and easier for the reader to understand.
- I don't really like the `param_mapping.py` file. We don't have this for any other model, and it's only used in one function if I see correctly. It goes a bit against our philosophy of having "as much as possible in one file", so I'd prefer to have the function of this file directly in the conversion function, even if it means we add 100 more lines.

Also, I think you forgot to add the model to the README.md (I forget it all the time as well :D)
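To illustrate the first point, here is a minimal sketch of the single-mask design in plain Python (not PyTorch, and not the code from this PR). The names `softmax` and `attend` are hypothetical and exist only for this example; the idea is that the attention function takes one `attention_mask`, and the caller decides which mask to pass.

```python
import math
from typing import List, Optional, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax; assumes at least one finite score."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    query: Sequence[float],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
    attention_mask: Optional[Sequence[int]] = None,
) -> List[float]:
    """Scaled dot-product attention for one query vector.

    Hypothetical helper: it accepts a single mask (1 = keep, 0 = ignore),
    so there is no separate `encoder_attention_mask` argument.
    Assumes at least one position is left unmasked.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    if attention_mask is not None:
        # Masked positions get -inf so they receive zero weight after softmax.
        scores = [s if keep else float("-inf") for s, keep in zip(scores, attention_mask)]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Self-attention: the keys/values are the layer's own hidden states.
hidden = [[1.0, 0.0], [0.0, 1.0]]
self_out = attend([1.0, 0.0], hidden, hidden, attention_mask=[1, 1])

# Cross-attention: the caller passes the encoder mask as `attention_mask`.
encoder_hidden = [[1.0, 2.0], [3.0, 4.0]]
encoder_attention_mask = [1, 0]
cross_out = attend(
    [1.0, 0.0],
    encoder_hidden,
    encoder_hidden,
    attention_mask=encoder_attention_mask,
)
```

With this shape, the function body never has to choose between two masks; the choice is made once, at the call site.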
Collaborator
Oh and while you are at it, a short entry in the |
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
patrickvonplaten
approved these changes
Jan 26, 2021
Contributor
patrickvonplaten
left a comment
Thanks for making the changes
LysandreJik
approved these changes
Jan 27, 2021
Member
LysandreJik
left a comment
LGTM! Thanks for your work @abhishekkrthakur
Qbiwan pushed a commit to Qbiwan/transformers that referenced this pull request Jan 31, 2021
* finalize convbert
* finalize convbert
* fix
* fix
* fix
* push
* fix
* tf image patches
* fix torch model
* tf tests
* conversion
* everything aligned
* remove print
* tf tests
* fix tf
* make tf tests pass
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* add electra test again
* fix doc
* fix doc again
* fix doc again
* Update src/transformers/modeling_tf_pytorch_utils.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update docs/source/model_doc/conv_bert.rst
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/conv_bert/configuration_conv_bert.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* conv_bert -> convbert
* more fixes from review
* add conversion script
* dont use pretrained embed
* unused config
* suggestions from julien
* some more fixes
* p -> param
* fix copyright
* fix doc
* Update src/transformers/models/convbert/configuration_convbert.py
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* comments from reviews
* fix-copies
* fix style
* revert shape_list

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>