[core] Fix use_reentrant issues #1036
Conversation
Is this ready for review?

Almost, I'll update the PR in a bit with more description
With huggingface/transformers#27020 merged, this PR is ready for review!
BenjaminBossan
left a comment
Thanks for making the big fix in transformers and updating PEFT to enable it. I have only a few comments, please check them out.
src/peft/utils/other.py
Outdated
    If True, use gradient checkpointing to save memory at the expense of slower backward pass.
gradient_checkpointing_kwargs (`dict`, *optional*, defaults to `None`):
    Keyword arguments to pass to the gradient checkpointing function, e.g. `use_reentrant=True`. Note this is
    only available in the latest transformers versions.
It would be better to specify the exact transformers version, because "latest" is relative.
src/peft/utils/other.py
Outdated
use_gradient_checkpointing (`bool`, *optional*, defaults to `True`):
    If True, use gradient checkpointing to save memory at the expense of slower backward pass.
gradient_checkpointing_kwargs (`dict`, *optional*, defaults to `None`):
    Keyword arguments to pass to the gradient checkpointing function, e.g. `use_reentrant=True`. Note this is
Could you explain what use_reentrant does?
I referred users to the PyTorch documentation; let me know if you want me to go into more detail.
BenjaminBossan
left a comment
Thanks for adjusting the code. It now looks good to me.
Thanks, I'll merge after finishing some experiments with huggingface/transformers#27068 and huggingface/trl#912
Hello Younes, thank you for fixing the gradient checkpointing related issues as per our discussions. Just a nit, here you meant

Ah yes, correct @pacman100!
* fix use_reentrant issues * fix * fixup * address comments. * add warnings * oops * fix * quality
Partially fixes: huggingface/trl#835
This PR depends on huggingface/transformers#27020 - with that PR, we introduce a new argument in the `gradient_checkpointing_enable()` API so that users can pass `gradient_checkpointing_kwargs`. To fix some issues with DDP and gradient checkpointing, it is recommended to use `use_reentrant=True`, which is the fix for huggingface/trl#835. Therefore I propose to expose an optional argument `gradient_checkpointing_kwargs` in `prepare_model_for_kbit_training`.

cc @BenjaminBossan @pacman100
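To illustrate the shape of the proposed API, here is a minimal, self-contained sketch of the forwarding pattern: an optional `gradient_checkpointing_kwargs` dict on `prepare_model_for_kbit_training` is passed through to the model's `gradient_checkpointing_enable()`. The `StubModel` class is a hypothetical stand-in for a transformers model, and the function body is a simplified mock, not the real PEFT implementation.

```python
class StubModel:
    """Hypothetical stand-in for a transformers model (not the real class)."""

    def __init__(self):
        self.gc_kwargs = None

    def gradient_checkpointing_enable(self, gradient_checkpointing_kwargs=None):
        # In recent transformers versions (with PR #27020), this method
        # accepts a kwargs dict that is forwarded to torch's checkpoint
        # function; here we just record it.
        self.gc_kwargs = gradient_checkpointing_kwargs or {}


def prepare_model_for_kbit_training(
    model, use_gradient_checkpointing=True, gradient_checkpointing_kwargs=None
):
    # Forward the optional kwargs only when checkpointing is requested.
    if use_gradient_checkpointing:
        model.gradient_checkpointing_enable(
            gradient_checkpointing_kwargs=gradient_checkpointing_kwargs
        )
    return model


model = prepare_model_for_kbit_training(
    StubModel(), gradient_checkpointing_kwargs={"use_reentrant": True}
)
print(model.gc_kwargs)  # {'use_reentrant': True}
```

With the real libraries, the equivalent call would target an actual quantized model; the point of the sketch is only that the kwargs dict flows unchanged from the PEFT helper into `gradient_checkpointing_enable()`.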