
Fix Dropout Implementation in Graphormer#24817

Merged
amyeroberts merged 1 commit into huggingface:main from alexanderkrauck:fixing_dropout
Sep 8, 2023

Conversation

@alexanderkrauck
Contributor

@alexanderkrauck alexanderkrauck commented Jul 14, 2023

What does this PR do?

This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation (https://github.com/microsoft/Graphormer) and improving performance. Specifically:

  1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`.
  2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers and to the GraphormerConfig, including documentation.

These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.
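The second fix can be sketched as follows. This is a minimal, illustrative feed-forward block, not the actual transformers code: the layer names, dimensions, and the plain-NumPy dropout helper are assumptions made for the example. It shows the distinction the PR draws, with `activation_dropout` applied to the activations inside the feed-forward sublayer and the regular `dropout` applied to the sublayer's output.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p, training=True):
    # Inverted dropout: zero each element with probability p and rescale
    # the survivors by 1/(1-p), so the expected value is unchanged.
    if not training or p == 0.0:
        return x
    mask = (rng.random(x.shape) >= p).astype(x.dtype)
    return x * mask / (1.0 - p)

def ffn_forward(x, w1, w2, dropout_p=0.1, activation_dropout_p=0.1, training=True):
    # Hypothetical feed-forward sublayer mirroring the fix described above.
    h = np.maximum(x @ w1, 0.0)                     # ReLU activation
    h = dropout(h, activation_dropout_p, training)  # the fix: activation_dropout here
    h = h @ w2
    return dropout(h, dropout_p, training)          # regular dropout on the output

x = rng.standard_normal((4, 8))
w1 = rng.standard_normal((8, 16))
w2 = rng.standard_normal((16, 8))
out = ffn_forward(x, w1, w2)
assert out.shape == (4, 8)
```

Before the fix, the regular `dropout` rate was applied in both places; keeping the two rates separate lets them be tuned independently, as in the original Graphormer.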

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@alexanderkrauck
Contributor Author

Hi @clefourrier and others,
this is my first time contributing to Hugging Face. In the course of my thesis I found several improvements/fixes to your current Graphormer implementation, including this one, which is quite simple but has a big impact and should be easy to review. I plan to open more pull requests with performance-related changes to speed Graphormer up, and possibly to add the 3D version, in the following days/weeks. Feel free to reach out.
Best wishes, Alexander

@github-actions
Contributor

github-actions bot commented Sep 3, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@alexanderkrauck
Contributor Author

It still needs to be addressed! @clefourrier, or whoever is responsible for it: am I doing something wrong with my pull request, or why is it taking so long for anyone to answer? How am I supposed to contribute if I am being ignored?

@clefourrier
Member

Hi @alexanderkrauck !
I have been very busy taking care of the Open LLM Leaderboard, and I put the graph transformer issues on the backburner for the summer. I was hoping to come back to this quite a bit earlier than now; I'm very sorry about that.
I'll do my best to come back to these before the end of September.

Member

@clefourrier clefourrier left a comment


LGTM, thank you for this fix

@clefourrier
Member

@amyeroberts or @ArthurZucker ?

Contributor

@amyeroberts amyeroberts left a comment


Thanks for fixing!

@amyeroberts amyeroberts merged commit 0c67a72 into huggingface:main Sep 8, 2023
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023