[Bug] add_new_tokens -> Embedding matrix size did not get resized properly #3502

@alsoalter85

Running in a Jupyter notebook on an NVIDIA H100.

ERROR

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[5], line 7
      4 new_special_token = "<|ar_porteno|>"
      6 # Try with interpolation method
----> 7 add_new_tokens(
      8   model,
      9   tokenizer,
     10   new_tokens=["<|ar_porteno|>"],
     11   method="mean",
     12   #interpolation=0.5  # 50/50 blend
     13 )
     15 print(f"New tokenizer size: {len(tokenizer)}")
     16 print(f"Token ID: {tokenizer.convert_tokens_to_ids(new_special_token)}")

File /venv/main/lib/python3.12/site-packages/unsloth_zoo/tokenizer_utils.py:131, in add_new_tokens(model, tokenizer, new_tokens, method, interpolation)
    129 # Confirm sizes are correct
    130 if embedding_matrix.shape[0] != (old_input_length  + len(new_tokens)):
--> 131     raise RuntimeError(
    132         "Unsloth: Embedding matrix size did not get resized properly. Please file a bug report!"
    133     )
    134 if lm_head_matrix.shape[0]   != (old_output_length + len(new_tokens)):
    135     raise RuntimeError(
    136         "Unsloth: LM Head matrix size did not get resized properly. Please file a bug report!"
    137     )

RuntimeError: Unsloth: Embedding matrix size did not get resized properly. Please file a bug report!

CODE TO REPRODUCE ERROR

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen3-4B-Instruct-2507",
    dtype = None, # None for auto detection
    max_seq_length = 2048, # Choose any for long context!
    load_in_4bit = True,  # 4 bit quantization to reduce memory
    load_in_8bit = False,
    full_finetuning = False
)

# Add the special token BEFORE applying LoRA (following Unsloth best practices)
from unsloth import add_new_tokens

new_token = "<|special|>"

# Try with interpolation method
add_new_tokens(
  model,
  tokenizer,
  new_tokens=[new_token],
)

print(f"New tokenizer size: {len(tokenizer)}")
print(f"Token ID: {tokenizer.convert_tokens_to_ids(new_token)}")

DETAILS

(main) root@C.27204190:/workspace$ nvidia-smi
Fri Oct 24 01:17:14 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08              Driver Version: 575.57.08      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H100 80GB HBM3          On  |   00000000:E4:00.0 Off |                    0 |
| N/A   33C    P0            114W /  700W |    6085MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            5139      C   /venv/main/bin/python                  6074MiB |
+-----------------------------------------------------------------------------------------+
(main) root@C.27204190:/workspace$ pip show unsloth transformers torch trl
Name: unsloth
Version: 2025.10.9
Summary: 2-5X faster training, reinforcement learning & finetuning
Home-page: http://www.unsloth.ai
Author: Unsloth AI team
Author-email: info@unsloth.ai
License-Expression: Apache-2.0
Location: /venv/main/lib/python3.12/site-packages
Requires: accelerate, bitsandbytes, datasets, diffusers, hf_transfer, huggingface_hub, numpy, packaging, peft, protobuf, psutil, sentencepiece, torch, torchvision, tqdm, transformers, triton, trl, tyro, unsloth_zoo, wheel, xformers
Required-by: 
---
Name: transformers
Version: 4.56.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /venv/main/lib/python3.12/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: peft, trl, unsloth, unsloth_zoo
---
Name: torch
Version: 2.8.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3-Clause
Location: /venv/main/lib/python3.12/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-cufile-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-cusparselt-cu12, nvidia-nccl-cu12, nvidia-nvjitlink-cu12, nvidia-nvtx-cu12, setuptools, sympy, triton, typing-extensions
Required-by: accelerate, bitsandbytes, cut-cross-entropy, peft, torchvision, unsloth, unsloth_zoo, xformers
---
Name: trl
Version: 0.23.0
Summary: Train transformer language models with reinforcement learning.
Home-page: https://github.com/huggingface/trl
Author: Leandro von Werra
Author-email: leandro.vonwerra@gmail.com
License: 
Location: /venv/main/lib/python3.12/site-packages
Requires: accelerate, datasets, transformers
Required-by: unsloth, unsloth_zoo
