RwkvForCausalLM does not support gradient checkpointing. #24831

@jonataslaw

System Info

Is there a reason RwkvForCausalLM does not support gradient checkpointing, given that RWKV-LM supports it?

@ArthurZucker and @younesbelkada

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

model.gradient_checkpointing_enable()

ValueError(f"{self.__class__.__name__} does not support gradient checkpointing.")
ValueError: RwkvForCausalLM does not support gradient checkpointing.
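
For context, the error message suggests the base model class checks an opt-in flag before enabling checkpointing. Below is a minimal, self-contained sketch of that kind of guard. It is a simplified stand-in and not the library's actual code: `BaseModel`, `RwkvLikeModel`, `OptedInModel`, and `try_enable` are hypothetical names used only for illustration, and the flag name `supports_gradient_checkpointing` is assumed to mirror the library's class attribute.

```python
class BaseModel:
    # Stand-in for the library's base class (assumption: models opt in
    # by overriding this class-level flag).
    supports_gradient_checkpointing = False

    def gradient_checkpointing_enable(self):
        # Refuse to enable checkpointing unless the subclass opted in.
        if not self.supports_gradient_checkpointing:
            raise ValueError(
                f"{self.__class__.__name__} does not support gradient checkpointing."
            )
        self.gradient_checkpointing = True


class RwkvLikeModel(BaseModel):
    """Hypothetical model that never sets the flag, like the one in this report."""


class OptedInModel(BaseModel):
    """Hypothetical model that opts in."""
    supports_gradient_checkpointing = True


def try_enable(model):
    # Return "enabled" on success, or the error text on failure.
    try:
        model.gradient_checkpointing_enable()
        return "enabled"
    except ValueError as err:
        return str(err)


print(try_enable(RwkvLikeModel()))
print(try_enable(OptedInModel()))
```

If the real check works this way, supporting checkpointing would take more than flipping the flag: the model's forward pass would also need to route its blocks through a checkpointing call.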

Expected behavior

No errors, since RWKV-LM supports it.
