Request for Additional Code and Hyperparameters
Dear Prof. Chen Liang,
We're working on a project involving language model compression, and your work on TED has been insightful. We are now trying to compare TED with KD and LWD techniques and to evaluate TED on GLUE tasks.
We are having difficulty reproducing the GLUE benchmark results reported in your paper. If possible, could you share the code for the baseline KD and LWD frameworks and the code for evaluating TED on GLUE?
If sharing the code is not feasible, could you please provide the hyperparameters used in your experiments? This would greatly assist our research.
Best, Chengfei Liu
Hi Chengfei, thanks for your interest in TED. We will be working on adding the GLUE code and hyperparameters soon. Please stay tuned.
Thanks for your insightful work! Could you share your code for GPT-2? It would be super helpful for my research. Thanks a lot!
Hi @jian53286, the code for GLUE has been released. Hi @aaronma2020, we will work on adding GPT-2 soon.