Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) #3404

Merged
patrickvonplaten merged 5 commits into huggingface:main from IrisRainbowNeko:grad_ckpt_part
May 17, 2023



@IrisRainbowNeko
Contributor

Calling `torch.utils.checkpoint.checkpoint` directly prevents the parameters inside the checkpointed section from being learned when part of the model's parameters are frozen, as these discussions describe:

https://discuss.pytorch.org/t/use-of-torch-utils-checkpoint-checkpoint-causes-simple-model-to-diverge/116271

https://discuss.pytorch.org/t/checkpoint-with-no-grad-requiring-inputs-problem/19117/19

In PyTorch 1.11.0 and later, passing `use_reentrant=False` to the checkpoint call fixes this bug.
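A minimal sketch of the failure mode and the workaround, assuming a toy setup that is not taken from this PR's diff: the `frozen` and `trainable` layers and all variable names are hypothetical. When the only input to the checkpointed layer comes from frozen parameters, it does not require grad, so the reentrant variant returns an output detached from the graph, while `use_reentrant=False` keeps the trainable parameters connected.

```python
import warnings

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Hypothetical toy model: a frozen layer followed by a trainable one.
frozen = nn.Linear(4, 4)
for p in frozen.parameters():
    p.requires_grad_(False)
trainable = nn.Linear(4, 4)

x = torch.randn(2, 4)
hidden = frozen(x)  # does not require grad: input and weights are both frozen

# Reentrant checkpointing: no input requires grad, so the output is cut off
# from autograd and `trainable` would never receive gradients.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # PyTorch warns about exactly this case
    reentrant_out = checkpoint(trainable, hidden, use_reentrant=True)

# Non-reentrant checkpointing: the output stays attached to the graph.
out = checkpoint(trainable, hidden, use_reentrant=False)
out.sum().backward()

print("reentrant output requires grad:", reentrant_out.requires_grad)
print("non-reentrant weight grad set:", trainable.weight.grad is not None)
```

On PyTorch versions older than 1.11.0 the `use_reentrant` keyword is not available, so code that must support them would need to guard the argument behind a version check.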


4 participants