BERT vs DistilBERT - mismatch of Attention Masks #31

@dhruv2601

Hi @lvwerra, thanks for the code.

In commit #caed471, the changes to 05-gpt2-sentiment-control.ipynb and 04-gpt2-sentiment-ppo-training.ipynb, specifically the switch from BERT to DistilBERT in the config, lead to errors during training.

The error is "index out of range", raised for the attention_masks generated by build_bert_batch_from_txt.
My understanding of this error is that the sentiment_inputs contain token IDs that are not in DistilBERT's vocabulary and therefore have no corresponding entry in the attention masks either.

Therefore, the config of those files should be changed to "cls_model_name": "lvwerra/bert-imdb".
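
One way to check the hypothesis above before switching models is to look for token IDs that fall outside the classifier's vocabulary, and to confirm that each attention mask has the same shape as its input row. The following is only a minimal sketch under stated assumptions: the helper names `find_out_of_range_ids` and `masks_match_inputs` are hypothetical (they are not part of this repository), the batch is toy data, and the vocabulary size of 10 is made up for illustration.

```python
from typing import List, Tuple


def find_out_of_range_ids(
    input_ids: List[List[int]], vocab_size: int
) -> List[Tuple[int, int, int]]:
    """Return (row, column, token_id) for every ID outside [0, vocab_size)."""
    return [
        (row, col, token_id)
        for row, sequence in enumerate(input_ids)
        for col, token_id in enumerate(sequence)
        if not 0 <= token_id < vocab_size
    ]


def masks_match_inputs(
    input_ids: List[List[int]], attention_masks: List[List[int]]
) -> bool:
    """True when there is one mask per input row and each pair has equal length."""
    return len(input_ids) == len(attention_masks) and all(
        len(ids) == len(mask) for ids, mask in zip(input_ids, attention_masks)
    )


# Toy batch (hypothetical data); a vocabulary size of 10 is assumed for illustration.
batch_ids = [[1, 4, 9, 0], [2, 12, 3, 0]]
batch_masks = [[1, 1, 1, 0], [1, 1, 1, 0]]

bad = find_out_of_range_ids(batch_ids, vocab_size=10)
print(bad)  # [(1, 1, 12)]: row 1, column 1 holds ID 12, which is >= 10
print(masks_match_inputs(batch_ids, batch_masks))  # True
```

With real tensors, the IDs and masks could be converted with `.tolist()` and the vocabulary size read from the loaded model's `config.vocab_size`. If no ID is out of range, the error probably has a different cause than the one I suggested.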

Thanks!
