Hi @lvwerra, thanks for the code.
In commit #caed471, the changes to `05-gpt2-sentiment-control.ipynb` and `04-gpt2-sentiment-ppo-training.ipynb`, specifically the switch from BERT to DistilBERT in the config, cause errors during training.
The error is "index out of range", raised for the `attention_masks` generated by `build_bert_batch_from_txt`.
My understanding is that `sentiment_inputs` contains tokens that are not in DistilBERT's vocabulary and therefore have no corresponding value in the attention masks either.
The config in those two notebooks should therefore be changed to `"cls_model_name": "lvwerra/bert-imdb"`.
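
In case it helps with reproducing this, here is a minimal sketch of a check that could be run on a batch before it reaches the classifier. It is not code from the repo: `find_out_of_range_ids` is a hypothetical helper, and the IDs and vocabulary size below are toy values, not the real ones for either model. It lists every ID outside the valid range, which is the kind of input that would produce an "index out of range" lookup if my reading of the error is right.

```python
from typing import List, Tuple


def find_out_of_range_ids(
    input_ids: List[List[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (row, column, token_id) for every ID outside [0, vocab_size)."""
    bad = []
    for row_idx, row in enumerate(input_ids):
        for col_idx, token_id in enumerate(row):
            if not 0 <= token_id < vocab_size:
                bad.append((row_idx, col_idx, token_id))
    return bad


# Toy batch: the second sequence contains an ID (12) that a
# 10-entry vocabulary cannot index.
toy_batch = [[1, 2, 3], [4, 12, 5]]
print(find_out_of_range_ids(toy_batch, vocab_size=10))
```

With the real notebooks, the nested lists would be replaced by the IDs that the tokenizer produces, and `vocab_size` by the vocabulary size of the classifier that is actually loaded.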
Thanks!