Code for the paper "Task-agnostic Distillation of Encoder-Decoder Language Models".
The code largely follows the MiniMA pipeline, with additional modeling files.
If you are interested, please raise an issue; otherwise, you can learn from the distillation pipeline of MiniMA.