task-aware-distillation

Request for Additional Code and Hyperparameters

Open jian53286 opened this issue 2 years ago • 4 comments

Dear Prof. Chen Liang,

We're working on a project involving language model compression and your work with TED has been insightful. We are now trying to compare TED with KD and LWD techniques, and evaluate TED on GLUE tasks.

We are having difficulty reproducing the GLUE benchmark results as reported in your paper. If possible, could you share the baseline KD and LWD frameworks code and the code for TED evaluation on GLUE?

If sharing the code is not feasible, could you please provide the hyperparameters used in your experiments? This would greatly assist our research.

Best, Chengfei Liu

jian53286 Jul 29 '23 02:07

Hi Chengfei, thanks for your interest in TED. We will be adding the GLUE code and hyperparameters soon. Please stay tuned.

cliang1453 Jul 31 '23 17:07

Thanks for your insightful work! Could you share your code for GPT-2? It would be super helpful for my research. Thanks a lot!

aaronma2020 Aug 14 '23 11:08

Hi @jian53286, the code for GLUE has been released. Hi @aaronma2020, we will work on adding GPT-2 soon.

cliang1453 Aug 28 '23 04:08