Feature request
Hi,
I've been looking through the BART docs. The provided examples all seem to cover fine-tuning BART on seq2seq summarization tasks. I'm wondering whether there is any example of pretraining BART's "language model" itself, using the pre-training objectives (text infilling, token masking, etc.) described in the original paper. I looked into this a couple of months ago and found this thread: #4151, and a related issue in fairseq: facebookresearch/fairseq#1899. I decided to ask directly here to see whether there has been any update since.
Thanks, @patrickvonplaten @patil-suraj
Motivation
Making BART usable for further pre-training (as a language model), not only for fine-tuning.
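To make the request more concrete, here is a rough sketch of the kind of input corruption I mean. It is plain Python over a list of tokens, not the paper's or the library's implementation. The function names `mask_tokens` and `infill_span` are hypothetical, and I'm assuming `<mask>` as the mask token string. In the paper the infilling spans are sampled (lengths drawn from a Poisson distribution); here the span is passed in explicitly to keep the sketch short and deterministic.

```python
import random

MASK = "<mask>"  # assumed mask token string


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Token masking: replace each token with MASK independently
    with probability mask_prob. The sequence length is unchanged."""
    rng = random.Random(seed)
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def infill_span(tokens, start, length):
    """Text infilling for a single span: replace tokens[start:start+length]
    with ONE MASK token, so the model must also infer how many tokens
    are missing. A length of 0 simply inserts a MASK at `start`."""
    return tokens[:start] + [MASK] + tokens[start + length:]


tokens = "the quick brown fox jumps".split()
corrupted = infill_span(tokens, start=1, length=2)
# The denoising objective would train the model to reconstruct
# `tokens` (decoder target) from `corrupted` (encoder input).
```

An example script that applies this kind of corruption on the fly and trains with the reconstruction loss is what I'm hoping exists or could be added.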