Excited to present our work on multi-decoder summarization models "HydraSum" later this week at #EMNLP! work w/ @nazneenrajani @owenhaoliu & @iam_wkr during my Salesforce internship! arxiv.org/abs/2110.04400 I will present this in person on 9th Dec, 11am in the summ session 🧵 1/
Figure showing training and inference for HydraSum. During training, the single decoder is replaced by 2 decoders combined in a mixture-of-experts.

During inference, the figure shows that you can sample from individual decoders or their mixture (can even specify the gate manually)
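
A rough sketch of the inference options described above, not the paper's actual code. It assumes the mixture is a gate-weighted average of each decoder's next-token probabilities; `softmax`, `mix_decoders`, and the toy logits are hypothetical names and values made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_decoders(logits_per_decoder, gate):
    """Gate-weighted mixture of per-decoder next-token distributions.

    Assumption: mixing happens in probability space, one gate weight per
    decoder, weights non-negative and summing to 1.
    """
    if len(logits_per_decoder) != len(gate):
        raise ValueError("need exactly one gate weight per decoder")
    if any(g < 0 for g in gate) or abs(sum(gate) - 1.0) > 1e-9:
        raise ValueError("gate weights must be non-negative and sum to 1")
    probs = [softmax(logits) for logits in logits_per_decoder]
    vocab_size = len(probs[0])
    return [
        sum(g * p[v] for g, p in zip(gate, probs))
        for v in range(vocab_size)
    ]


# Toy next-token logits from two decoders over a 3-token vocabulary.
decoder_0 = [2.0, 0.0, 0.0]
decoder_1 = [0.0, 0.0, 2.0]

only_first = mix_decoders([decoder_0, decoder_1], [1.0, 0.0])  # single decoder
manual_mix = mix_decoders([decoder_0, decoder_1], [0.3, 0.7])  # hand-set gate
```

Setting the gate to `[1.0, 0.0]` or `[0.0, 1.0]` recovers an individual decoder, and anything in between blends the two.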