As brought up in #9982 (comment) by @bghira, I think we could support this directly in the `enable_gradient_checkpointing()` method we expose for the models.
Users could specify the block interval at which they want gradient checkpointing to be applied, and we take care of the rest. The code for this is simple and doesn't require any hacks; a rough sketch of the idea is included below.
Gradient checkpointing is a crucial component for training/fine-tuning larger models, and making the interval configurable would give users finer control over the speed/memory trade-off instead of always checkpointing every block.
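A minimal sketch of what this could look like, for illustration only. The `interval` parameter and the `SimpleTransformer` class here are hypothetical and not existing diffusers API; the point is just that the forward loop only needs an index check to decide which blocks get checkpointed:

```python
# Hypothetical sketch of interval-based gradient checkpointing.
# `SimpleTransformer` and the `interval` argument are illustrative only,
# not diffusers API.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class SimpleTransformer(nn.Module):
    def __init__(self, num_layers: int = 12, dim: int = 64):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            for _ in range(num_layers)
        )
        self.checkpoint_interval = None  # checkpointing disabled by default

    def enable_gradient_checkpointing(self, interval: int = 1):
        # interval=1 checkpoints every block (the current behavior);
        # interval=2 checkpoints every other block, and so on.
        self.checkpoint_interval = interval

    def forward(self, hidden_states):
        for i, block in enumerate(self.blocks):
            if (
                self.training
                and self.checkpoint_interval is not None
                and i % self.checkpoint_interval == 0
            ):
                # Recompute this block's activations during the backward pass
                # instead of storing them, trading compute for memory.
                hidden_states = checkpoint(block, hidden_states, use_reentrant=False)
            else:
                hidden_states = block(hidden_states)
        return hidden_states
```

With something like `model.enable_gradient_checkpointing(interval=2)`, only half the blocks are checkpointed, so recomputation cost drops while memory usage sits between full checkpointing and none.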
Cc: @a-r-r-o-w @hlky