
Add BioGPT #20420

Merged

sgugger merged 40 commits into huggingface:main from kamalkraj:BioGPT on Dec 5, 2022

Conversation

@kamalkraj
Contributor

@kamalkraj kamalkraj commented Nov 23, 2022

What does this PR do?

Adding BioGPT

Original Implementation and weights - https://github.com/microsoft/BioGPT

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sgugger @patrickvonplaten

@kamalkraj kamalkraj changed the title [WIP] BioGPT [WIP] Add BioGPT Nov 24, 2022
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Nov 24, 2022

The documentation is not available anymore as the PR was closed or merged.

@kamalkraj kamalkraj changed the title [WIP] Add BioGPT Add BioGPT Nov 25, 2022
Contributor

@younesbelkada younesbelkada left a comment


Thank you very much for this great contribution and for introducing one of the first bio-medical text models to transformers!
I left a couple of minor comments; overall the PR looks really good to me. The main point is the naming: xxxLMHeadModel needs to be changed to xxxForCausalLM. I also left a comment about layerdrop and output_hidden_states; let me know if this refactoring is possible!
Another point we should discuss is whether we can use a smaller model for the integration tests; let me know if you need help with that (no need to train the model).
Great effort on the integration side 💪 We should be very close to merging this!

@kamalkraj
Contributor Author

@younesbelkada
I've made the changes according to your suggestions.
Thanks for the review!

@younesbelkada
Contributor

Thanks @kamalkraj
Let me give it another round of review and I'll get back to you

Contributor

@younesbelkada younesbelkada left a comment


Thanks a lot for the great effort! I left a few minor comments that need to be double-checked; otherwise everything looks great to me!

Comment on lines +546 to +551
if output_hidden_states:
    all_hidden_states += (hidden_states,)
dropout_probability = random.uniform(0, 1)
if self.training and (dropout_probability < self.layerdrop):
    continue

Contributor


thanks for the clarification, I think that you are right here, I am fine with this change. Let's keep it like this
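For context, the LayerDrop pattern under discussion (Fan et al., 2020) can be sketched in isolation. This is an illustrative stand-in, not the PR's actual decoder loop; the layer body here is a dummy increment:

```python
import random

def run_decoder_layers(hidden_states, num_layers, layerdrop, training,
                       output_hidden_states=True):
    """Illustrative decoder loop: hidden states are recorded *before* the
    LayerDrop skip, so the collected tuple always has one entry per layer
    position, no matter which layers were dropped."""
    all_hidden_states = ()
    for _ in range(num_layers):
        if output_hidden_states:
            all_hidden_states += (hidden_states,)
        dropout_probability = random.uniform(0, 1)
        if training and dropout_probability < layerdrop:
            continue  # skip this layer entirely (LayerDrop)
        hidden_states = hidden_states + 1  # dummy stand-in for the real layer
    return hidden_states, all_hidden_states

# layerdrop=1.0 in training mode skips every layer, yet all states are recorded.
final, collected = run_decoder_layers(0, num_layers=4, layerdrop=1.0, training=True)
```

With layerdrop=1.0 the final state is untouched (every layer skipped) while `len(collected)` is still 4, which is exactly the behavior the ordering above preserves.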

def test_batch_generation(self):
    model = BioGptLMHeadModel.from_pretrained("kamalkraj/biogpt")
    model.to(torch_device)
    tokenizer = BioGptTokenizer.from_pretrained("kamalkraj/biogpt")
Contributor


But even if the model outputs gibberish, I'd expect the logits to change if someone modifies the code in the future, so we'd still be able to flag it.
I think it's fine as it is, but let's see what others think for final approval, so let's keep this unresolved for now.
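The usual way this concern is addressed in transformers integration tests is an "expected slice": a few logit values captured once from a reference run and hard-coded into the test. A self-contained sketch of the pattern (the numbers below are made up, not BioGPT's real logits):

```python
import math

def matches_expected_slice(logits_slice, expected_slice, atol=1e-4):
    """True if each logit is within atol of the hard-coded reference value.
    Any future change to the modeling code that alters the math will shift
    these numbers and fail the check, even for an untrained checkpoint."""
    return len(logits_slice) == len(expected_slice) and all(
        math.isclose(a, b, abs_tol=atol)
        for a, b in zip(logits_slice, expected_slice)
    )

# Hypothetical reference values recorded once from a trusted run:
EXPECTED_SLICE = [-0.0514, 0.3827, -1.2251]
```

A test would then assert `matches_expected_slice` on a small slice of the model's output logits, making it a cheap regression tripwire.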

Contributor

@younesbelkada younesbelkada left a comment


Thanks so much for your very clean contribution of the first (if I am not mistaken) bio-medical model!
Clean code & documentation and nice tests 💪🏻
Leaving the PR now to @sgugger for a final review.
Thanks!

Collaborator

@sgugger sgugger left a comment


Thanks for adding this model! Once all comments have been resolved, we'll need to move the checkpoints to the microsoft org and adapt all references in this PR :-)

logger = logging.get_logger(__name__)

BIOGPT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    "kamalkraj/biogpt": "https://huggingface.co/kamalkraj/biogpt/resolve/main/config.json",
}
Collaborator


Checkpoints will need to be transferred to the microsoft org before we merge.

Contributor Author


Let me know once the checkpoints are transferred, I can make the changes.
Thanks

@kamalkraj
Contributor Author

@sgugger
I've made the changes according to your suggestions.
Thanks for the review!

@younesbelkada
Contributor

Hi @kamalkraj,
The repo has been moved to microsoft: https://huggingface.co/microsoft/biogpt
Could you please update the PR accordingly? It also seems that you need to rebase onto main.
Thanks!
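Retargeting the checkpoint id is a mechanical find-and-replace across the files the PR touches. A hypothetical helper (not part of the PR) that rewrites one file in place might look like:

```python
from pathlib import Path

OLD_ID = "kamalkraj/biogpt"
NEW_ID = "microsoft/biogpt"

def retarget_checkpoint(path: Path) -> bool:
    """Replace every occurrence of the old hub id in one file;
    return True if the file actually changed."""
    text = path.read_text(encoding="utf-8")
    if OLD_ID not in text:
        return False
    path.write_text(text.replace(OLD_ID, NEW_ID), encoding="utf-8")
    return True
```

Running this over the modeling, tokenization, config, test, and doc files would cover every reference in one pass.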

@kamalkraj
Contributor Author

@younesbelkada I've made the changes.
Thanks

@younesbelkada
Contributor

Thanks @kamalkraj!
It seems that the styling tests are failing; could you please run `make fixup`?

@kamalkraj
Contributor Author

@younesbelkada fixed

@younesbelkada
Contributor

Thanks so much @kamalkraj!
Let's leave it now to @sgugger to give his review ;)
Thanks!

Collaborator

@sgugger sgugger left a comment


Thanks again for all your work on this!

@sgugger sgugger merged commit 13e7366 into huggingface:main Dec 5, 2022
mpierrau pushed a commit to mpierrau/transformers that referenced this pull request Dec 15, 2022
* biogpt initial commit

* updated init

* fix faster decoding with use_cache

* 1. fix input_ids and input_embeds with correct device
2. added _keys_to_ignore_on_load_missing
3. updated prepare_inputs_for_generation

* add activation_dropout and scale_embedding

* replace fsmt attention with bart attention

* added test

* run make fix-copies

* doc init and fix build

* updated README with proper information

* 1. added tips to docs
2. updated BioGptTokenizer func

* 1. added tokenizer test
2. refactor tokenizer

* make fixup

* add biogpt fairseq to hf converter

* updated layer names to be
more similar to the original checkpoints

* config update doc string and set defaults

* added "# Copied from" comments referencing the bart model and
updated doc strings

* enable model_input_names in tokenizer

* 1. positional embedding depending on attention_mask
2. added attention mask to prepare for generation

* added test to verify past and generation

* BioGptLMHeadModel -> BioGptForCausalLM

* fix typo

* tokenization and test
Copyright and updated assertion

* updated Copyright and
one func at time in line

* Copyright updates and
minor doc fix

* replace assertion with ValueError

* rm extra space

* added code syntax

* revert cmnt position change

* add tokenizer to auto

* updated doc string

* tokenizer doc string update

* biogpt hub model update to microsoft/biogpt

* make fixup

* rm cmnt to fix flake8 5.0.4 vs 6 error
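One commit above derives positional embeddings from the attention mask, fairseq-style: position ids are the running count of non-padding tokens, offset by the padding index so every pad position shares one reserved id. A pure-Python sketch of that indexing (illustrative; the real implementation works on tensors):

```python
def positions_from_attention_mask(attention_mask, padding_idx=1):
    """Position ids from a 0/1 attention mask: pads map to padding_idx,
    real tokens count upward from padding_idx + 1 (fairseq convention)."""
    position_ids = []
    for row in attention_mask:
        running = 0
        row_ids = []
        for keep in row:
            if keep:
                running += 1
                row_ids.append(padding_idx + running)
            else:
                row_ids.append(padding_idx)  # all pads share one position id
        position_ids.append(row_ids)
    return position_ids

# Left-padded batch: pads get padding_idx, real tokens start at padding_idx + 1.
positions_from_attention_mask([[0, 0, 1, 1], [1, 1, 1, 1]])
# → [[1, 1, 2, 3], [2, 3, 4, 5]]
```

This is what makes batched generation with left padding produce the same positions as an unpadded sequence.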


4 participants