Skip to content

ConvBERT Model#9717

Merged
LysandreJik merged 52 commits intomasterfrom
cbert
Jan 27, 2021
Merged

ConvBERT Model#9717
LysandreJik merged 52 commits intomasterfrom
cbert

Conversation

@abhishekkrthakur
Copy link
Copy Markdown
Contributor

No description provided.

@abhishekkrthakur abhishekkrthakur changed the title finalize convbert Add ConvBERT Model Jan 21, 2021
Comment thread src/transformers/models/conv_bert/tokenization_conv_bert_fast.py Outdated
@abhishekkrthakur abhishekkrthakur changed the title Add ConvBERT Model ConvBERT Model Jan 23, 2021
@abhishekkrthakur abhishekkrthakur changed the title ConvBERT Model ConvBert Model Jan 23, 2021
Comment thread src/transformers/models/convbert/modeling_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_convbert.py
Comment thread src/transformers/models/convbert/modeling_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_tf_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_tf_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_tf_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_tf_convbert.py Outdated
Comment thread src/transformers/models/convbert/modeling_tf_convbert.py
Comment thread tests/test_modeling_convbert.py
Comment thread src/transformers/models/convbert/param_mapping.py Outdated
Copy link
Copy Markdown
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome! Great work! I've added mostly nits. Two things I'd like to change before merging though:

  1. In the PyTorch modeling file, we pass the attention_mask to mask some tokens. In the case of cross-attention, we pass both encoder_attention_mask and attention_mask and then just set attention_mask to encoder_attention_mask because we don't need attention_mask in this case. This means that there is always only really one attention_mask that is used. Therefore I think, it's better to just not have a encoder_attention_mask function argument IMO and directly set attention_mask=encoder_attention_mask. It is cleaner and easier to understand for the reader.
  2. Don't really like the param_mapping.py file. We don't have this for any other model and it's only used in one function if I see correctly. It goes a bit against our philosophy to have "as much as possible in one file" => so I'd prefer to have the function of this file directly in the conversion function even if it means we add 100 more lines.

Also, I think you forgot to add the model to the README.md (I'll forget it all the time as well :D)

@sgugger
Copy link
Copy Markdown
Collaborator

sgugger commented Jan 26, 2021

Also, I think you forgot to add the model to the README.md (I'll forget it all the time as well :D)

Oh and while you are at it, a short entry in the model_summary would be great too!

Copy link
Copy Markdown
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes

Copy link
Copy Markdown
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for your work @abhishekkrthakur

@LysandreJik LysandreJik merged commit f617490 into master Jan 27, 2021
@LysandreJik LysandreJik deleted the cbert branch January 27, 2021 08:20
Qbiwan pushed a commit to Qbiwan/transformers that referenced this pull request Jan 31, 2021
* finalize convbert

* finalize convbert

* fix

* fix

* fix

* push

* fix

* tf image patches

* fix torch model

* tf tests

* conversion

* everything aligned

* remove print

* tf tests

* fix tf

* make tf tests pass

* everything works

* fix init

* fix

* special treatment for sepconv1d

* style

* 🙏🏽

* add doc and cleanup

* add electra test again

* fix doc

* fix doc again

* fix doc again

* Update src/transformers/modeling_tf_pytorch_utils.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update docs/source/model_doc/conv_bert.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/configuration_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/conv_bert/configuration_conv_bert.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* conv_bert -> convbert

* more fixes from review

* add conversion script

* dont use pretrained embed

* unused config

* suggestions from julien

* some more fixes

* p -> param

* fix copyright

* fix doc

* Update src/transformers/models/convbert/configuration_convbert.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* comments from reviews

* fix-copies

* fix style

* revert shape_list

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Core: Modeling Internals of the library; Models. PR for Model Addition

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants