Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) by IrisRainbowNeko · Pull Request #3404 · huggingface/diffusers

IrisRainbowNeko · 2023-05-12T02:06:34Z

Calling `torch.utils.checkpoint.checkpoint` directly prevents the parameters inside the checkpointed section from being learned when part of the model's parameters are frozen (`requires_grad=False`). These discussions describe the problem:

https://discuss.pytorch.org/t/use-of-torch-utils-checkpoint-checkpoint-causes-simple-model-to-diverge/116271

https://discuss.pytorch.org/t/checkpoint-with-no-grad-requiring-inputs-problem/19117/19

In PyTorch 1.11.0 and later, passing `use_reentrant=False` to `checkpoint` fixes this bug.
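The following is a minimal sketch of the situation, not the code changed in this PR. The two `nn.Linear` layers and the names `frozen` and `trainable` are hypothetical stand-ins for a partly frozen model: the checkpointed block receives an input that does not require grad, and `use_reentrant=False` still lets gradients reach the block's own parameters.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Hypothetical frozen part of the model (requires_grad=False).
frozen = nn.Linear(4, 4)
for p in frozen.parameters():
    p.requires_grad_(False)

# Hypothetical trainable block that we want to checkpoint.
trainable = nn.Linear(4, 4)

x = torch.randn(2, 4)   # plain input, does not require grad
h = frozen(x)           # output of the frozen part, does not require grad either

# Non-reentrant checkpointing tracks the block's parameters even though
# none of the tensor inputs require grad.
out = checkpoint(trainable, h, use_reentrant=False)
out.sum().backward()

grad_is_set = trainable.weight.grad is not None
print("trainable.weight received a gradient:", grad_is_set)
```

With the reentrant variant, the checkpoint output only requires grad if one of its tensor inputs does, which is why the trainable parameters in a block fed by frozen layers can end up without gradients.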

patrickvonplaten merged 5 commits into huggingface:main from IrisRainbowNeko:grad_ckpt_part

4 participants
