On a Jupyter notebook, NVIDIA H100
ERROR
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[5], line 7
4 new_special_token = "<|ar_porteno|>"
6 # Try with interpolation method
----> 7 add_new_tokens(
8 model,
9 tokenizer,
10 new_tokens=["<|ar_porteno|>"],
11 method="mean",
12 #interpolation=0.5 # 50/50 blend
13 )
15 print(f"New tokenizer size: {len(tokenizer)}")
16 print(f"Token ID: {tokenizer.convert_tokens_to_ids(new_special_token)}")
File /venv/main/lib/python3.12/site-packages/unsloth_zoo/tokenizer_utils.py:131, in add_new_tokens(model, tokenizer, new_tokens, method, interpolation)
129 # Confirm sizes are correct
130 if embedding_matrix.shape[0] != (old_input_length + len(new_tokens)):
--> 131 raise RuntimeError(
132 "Unsloth: Embedding matrix size did not get resized properly. Please file a bug report!"
133 )
134 if lm_head_matrix.shape[0] != (old_output_length + len(new_tokens)):
135 raise RuntimeError(
136 "Unsloth: LM Head matrix size did not get resized properly. Please file a bug report!"
137 )
RuntimeError: Unsloth: Embedding matrix size did not get resized properly. Please file a bug report!
CODE TO REPRODUCE ERROR
from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen3-4B-Instruct-2507",
    dtype = None,           # None for auto detection
    max_seq_length = 2048,  # Choose any for long context!
    load_in_4bit = True,    # 4 bit quantization to reduce memory
    load_in_8bit = False,
    full_finetuning = False,
)

# Add the special token BEFORE applying LoRA (following Unsloth best practices)
from unsloth import add_new_tokens

new_token = "<|special|>"

# Add the new token with default arguments
add_new_tokens(
    model,
    tokenizer,
    new_tokens=[new_token],
)

print(f"New tokenizer size: {len(tokenizer)}")
print(f"Token ID: {tokenizer.convert_tokens_to_ids(new_token)}")
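
A small diagnostic that might help narrow this down: the check at tokenizer_utils.py:131 compares the embedding matrix row count against the old length plus the number of new tokens, so printing those sizes before calling add_new_tokens shows whether the embedding matrix and the tokenizer already disagree. The helper name report_vocab_sizes is hypothetical (not part of Unsloth); it only relies on the standard Hugging Face get_input_embeddings() and get_output_embeddings() accessors. The idea that a pre-existing mismatch (for example, padded embedding rows) is involved is an assumption, not a confirmed cause.

```python
def report_vocab_sizes(model, tokenizer):
    """Collect the row counts that the failing size check compares.

    Hypothetical helper for debugging only; not part of Unsloth.
    """
    embed_rows = model.get_input_embeddings().weight.shape[0]
    head = model.get_output_embeddings()
    # Some models have no separate output head, so guard against None.
    head_rows = head.weight.shape[0] if head is not None else None
    tokenizer_len = len(tokenizer)
    return {
        "tokenizer_len": tokenizer_len,
        "embedding_rows": embed_rows,
        "lm_head_rows": head_rows,
        # Non-zero here means the matrix and tokenizer sizes already differ.
        "embedding_minus_tokenizer": embed_rows - tokenizer_len,
    }

# Usage (run in the notebook before add_new_tokens, and again after a failure):
# print(report_vocab_sizes(model, tokenizer))
```

If embedding_minus_tokenizer is non-zero before any token is added, that would be worth including in the report.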
DETAILS
(main) root@C.27204190:/workspace$ nvidia-smi
Fri Oct 24 01:17:14 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08 Driver Version: 575.57.08 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H100 80GB HBM3 On | 00000000:E4:00.0 Off | 0 |
| N/A 33C P0 114W / 700W | 6085MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 5139 C /venv/main/bin/python 6074MiB |
+-----------------------------------------------------------------------------------------+
(main) root@C.27204190:/workspace$ pip show unsloth transformers torch trl
Name: unsloth
Version: 2025.10.9
Summary: 2-5X faster training, reinforcement learning & finetuning
Home-page: http://www.unsloth.ai
Author: Unsloth AI team
Author-email: info@unsloth.ai
License-Expression: Apache-2.0
Location: /venv/main/lib/python3.12/site-packages
Requires: accelerate, bitsandbytes, datasets, diffusers, hf_transfer, huggingface_hub, numpy, packaging, peft, protobuf, psutil, sentencepiece, torch, torchvision, tqdm, transformers, triton, trl, tyro, unsloth_zoo, wheel, xformers
Required-by:
---
Name: transformers
Version: 4.56.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /venv/main/lib/python3.12/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: peft, trl, unsloth, unsloth_zoo
---
Name: torch
Version: 2.8.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3-Clause
Location: /venv/main/lib/python3.12/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-cufile-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-cusparselt-cu12, nvidia-nccl-cu12, nvidia-nvjitlink-cu12, nvidia-nvtx-cu12, setuptools, sympy, triton, typing-extensions
Required-by: accelerate, bitsandbytes, cut-cross-entropy, peft, torchvision, unsloth, unsloth_zoo, xformers
---
Name: trl
Version: 0.23.0
Summary: Train transformer language models with reinforcement learning.
Home-page: https://github.com/huggingface/trl
Author: Leandro von Werra
Author-email: leandro.vonwerra@gmail.com
License:
Location: /venv/main/lib/python3.12/site-packages
Requires: accelerate, datasets, transformers
Required-by: unsloth, unsloth_zoo