Remove token_type_ids from default TF GPT-2 signature #26962

Merged
Rocketknight1 merged 1 commit into main from tfgpt2_input_signature_update on Oct 23, 2023

Rocketknight1 (Member) commented Oct 20, 2023

Although GPT-2 supports token_type_ids, the implementation is very weird (token embeddings are also used as token type embeddings!) and in practice token_type_ids=None is used almost exclusively.

For most models, you can mimic the effect of token_type_ids=None by passing an all-zeros array. This does not work for GPT-2, because it skips the token type embeddings entirely when token_type_ids=None, whereas an all-zeros array adds the embedding for ID 0 at every position. As a result, a model exported with a token_type_ids input cannot be made to reproduce the token_type_ids=None behaviour.
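To make the difference concrete, here is a minimal pure-Python sketch. It assumes only what is described above: token type embeddings reuse the token embedding table, and they are skipped when token_type_ids=None. The names embed_inputs, wte and wpe and the toy sizes are illustrative and are not the library's code.

```python
import random

DIM = 4      # toy hidden size (hypothetical)
VOCAB = 8    # toy vocabulary size (hypothetical)
MAX_POS = 8  # toy maximum sequence length (hypothetical)

# Deterministic toy embedding tables standing in for the real weights.
rng = random.Random(0)
wte = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(VOCAB)]
wpe = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(MAX_POS)]


def embed_inputs(input_ids, token_type_ids=None):
    """Toy GPT-2-style input embedding for a single sequence."""
    hidden = []
    for position, token_id in enumerate(input_ids):
        # Token embedding plus position embedding.
        row = [t + p for t, p in zip(wte[token_id], wpe[position])]
        if token_type_ids is not None:
            # Token type embeddings reuse the token embedding table (wte).
            type_row = wte[token_type_ids[position]]
            row = [r + e for r, e in zip(row, type_row)]
        # With token_type_ids=None, nothing is added for the token type.
        hidden.append(row)
    return hidden


input_ids = [3, 1, 4]
hidden_none = embed_inputs(input_ids)
hidden_zeros = embed_inputs(input_ids, token_type_ids=[0, 0, 0])

# Per-position difference between the all-zeros call and the None call.
offsets = [
    [z - n for z, n in zip(zeros_row, none_row)]
    for zeros_row, none_row in zip(hidden_zeros, hidden_none)
]
```

In this sketch every row of offsets equals wte[0], so the all-zeros input shifts each position by the embedding of token 0 instead of reproducing the None behaviour.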

To stop this tripping up other users, we remove token_type_ids from the default TF GPT-2 input signature. This issue is specific to GPT-2: most other models handle token_type_ids more reasonably and shouldn't be affected.

Fixes #26783

Rocketknight1 requested a review from gante October 20, 2023 13:30

HuggingFaceDocBuilderDev commented Oct 20, 2023

The documentation is not available anymore as the PR was closed or merged.

gante (Contributor) left a comment

LGTM 👍

Rocketknight1 merged commit f7354a3 into main Oct 23, 2023
Rocketknight1 deleted the tfgpt2_input_signature_update branch October 23, 2023 15:18
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023: Remove token_type_ids from default GPT-2 signature

Development

Successfully merging this pull request may close these issues.

Inconsistent input signatures for gpt2-medium tensorflow
