T5Tokenizer load fails when added special tokens and saved with save_pretrained #26951

@minolee

System Info

  • transformers version: 4.34.1
  • Platform: Linux-5.4.0-149-generic-x86_64-with-glibc2.31
  • Python version: 3.10.11
  • Huggingface_hub version: 0.17.3
  • Safetensors version: 0.4.0
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.0.1 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help?

@ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

This can be reproduced in a Python console:

>>> import transformers
>>> tokenizer = transformers.AutoTokenizer.from_pretrained("google/flan-t5-base")
>>> tokenizer.add_special_tokens({"additional_special_tokens": ["<1>", "<2>"]})
2
>>> tokenizer.save_pretrained("/tmp/tokenizer")
('/tmp/tokenizer/tokenizer_config.json', '/tmp/tokenizer/special_tokens_map.json', '/tmp/tokenizer/tokenizer.json')
>>> new_tokenizer = transformers.AutoTokenizer.from_pretrained("/tmp/tokenizer")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 751, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2017, in from_pretrained
    return cls._from_pretrained(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2249, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/t5/tokenization_t5_fast.py", line 127, in __init__
    raise ValueError(
ValueError: Both extra_ids (100) and additional_special_tokens (['<1>', '<2>']) are provided to T5Tokenizer. In this case the additional_special_tokens must include the extra_ids tokens
>>> transformers.__version__
'4.34.1'
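For context, here is a minimal sketch of the consistency check that appears to fire, reconstructed from the error message in the traceback above. This is an illustration only, not the actual implementation in `tokenization_t5_fast.py`: when `extra_ids > 0` and `additional_special_tokens` is also provided, the T5 tokenizer seems to require that all `<extra_id_*>` sentinel tokens be included in `additional_special_tokens` — which is not the case for a saved tokenizer that only lists the user-added tokens.

```python
def check_extra_ids(extra_ids, additional_special_tokens):
    """Illustrative sketch (not the real implementation) of the T5
    tokenizer's validation inferred from the traceback: if both
    extra_ids and additional_special_tokens are given, the latter
    must contain every <extra_id_*> sentinel token."""
    if extra_ids > 0 and additional_special_tokens is not None:
        sentinels = {f"<extra_id_{i}>" for i in range(extra_ids)}
        if not sentinels.issubset(set(additional_special_tokens)):
            raise ValueError(
                f"Both extra_ids ({extra_ids}) and additional_special_tokens "
                f"({additional_special_tokens}) are provided to T5Tokenizer. "
                "In this case the additional_special_tokens must include the "
                "extra_ids tokens"
            )


# Mirrors the failing reload: 100 extra_ids, but only the two custom tokens.
# check_extra_ids(100, ["<1>", "<2>"])  # raises ValueError
```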

Expected behavior

This works correctly on version 4.33.x:

>>> import transformers
>>> tokenizer = transformers.AutoTokenizer.from_pretrained("google/flan-t5-base")
>>> tokenizer.add_special_tokens({"additional_special_tokens": ["<1>", "<2>"]})
2
>>> tokenizer.save_pretrained("/tmp/tokenizer")
('/tmp/tokenizer/tokenizer_config.json', '/tmp/tokenizer/special_tokens_map.json', '/tmp/tokenizer/tokenizer.json')
>>> new_tokenizer = transformers.AutoTokenizer.from_pretrained("/tmp/tokenizer")
>>> transformers.__version__
'4.33.3'

I found a related issue: #26536.

This may be a T5-specific issue, since the same steps work fine with other models such as bert-base-cased or gpt2.

Thank you in advance.
