Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) by IrisRainbowNeko · Pull Request #23422 · huggingface/transformers

IrisRainbowNeko · 2023-05-17T03:49:06Z

What does this PR do?

Same as the PR I opened in diffusers.

Calling torch.utils.checkpoint.checkpoint directly prevents the parameters inside the checkpointed section from being learned when part of the model's parameters are frozen. As these discussions state:

https://discuss.pytorch.org/t/use-of-torch-utils-checkpoint-checkpoint-causes-simple-model-to-diverge/116271

https://discuss.pytorch.org/t/checkpoint-with-no-grad-requiring-inputs-problem/19117/19

In PyTorch versions later than 1.11.0, passing use_reentrant=False to the checkpoint call fixes this bug.
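A minimal sketch of the behavior described above, not taken from this PR's diff. The two-layer toy model and the helper name `trainable_layer_gets_grad` are hypothetical and exist only for illustration. It assumes a PyTorch version that accepts the `use_reentrant` argument. A frozen layer feeds a trainable layer that is wrapped in a checkpoint, so none of the checkpoint's inputs require gradients.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


def trainable_layer_gets_grad(use_reentrant: bool) -> bool:
    """Return True if the checkpointed trainable layer receives a gradient.

    Hypothetical helper for illustration; not part of transformers.
    """
    torch.manual_seed(0)

    # Frozen part of the model (requires_grad=False on all parameters).
    frozen = nn.Linear(4, 4)
    for p in frozen.parameters():
        p.requires_grad_(False)

    # Trainable part that we wrap in a checkpoint.
    trainable = nn.Linear(4, 4)

    # The input carries no gradient, and the frozen layer adds none,
    # so `hidden` does not require grad.
    hidden = frozen(torch.randn(2, 4))

    out = checkpoint(trainable, hidden, use_reentrant=use_reentrant)

    # Only call backward if the output is attached to the autograd graph;
    # otherwise backward() would raise an error.
    if out.requires_grad:
        out.sum().backward()

    return trainable.weight.grad is not None


if __name__ == "__main__":
    print("reentrant:", trainable_layer_gets_grad(use_reentrant=True))
    print("non-reentrant:", trainable_layer_gets_grad(use_reentrant=False))
```

In this sketch, the reentrant variant is expected to leave `trainable.weight.grad` unset (PyTorch also warns that none of the inputs require gradients), while the non-reentrant variant should populate it.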

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

HuggingFaceDocBuilderDev · 2023-05-17T04:09:32Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

github-actions · 2023-06-16T15:02:09Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

gradient checkpointing bug fix

983a77e

IrisRainbowNeko mentioned this pull request May 17, 2023

Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) huggingface/diffusers#3404

Merged

gradient checkpointing bug fix

20b2746

github-actions Bot closed this Jun 25, 2023

IrisRainbowNeko wants to merge 2 commits into huggingface:main from IrisRainbowNeko:grad_ckpt_part
