
[Bugfix] Catch and log invalid token ids in detokenizer #2 (#26445)

Merged
vllm-bot merged 2 commits into vllm-project:main from njhill:negative-tok-id
Oct 9, 2025

Conversation

@njhill
Member

@njhill njhill commented Oct 8, 2025

This is an update to the "workaround" added in #24351.

That PR insulates against the negative token ids that can occasionally be produced, though we still don't know the root cause (see #21951).

With the update to tokenizers 0.22.1, this error manifests as a TypeError rather than an OverflowError, so the patch needs to be updated to account for this.

Mitigates #26438, #26071, #25821.
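The pattern described above can be sketched as follows. This is an illustrative sketch, not vLLM's actual detokenizer code; `decode_fn` and `safe_decode` are hypothetical names introduced here for the example. The key point is that both exception types are caught, since older tokenizers versions raised OverflowError for a negative token id while tokenizers 0.22.1 raises TypeError:

```python
import logging

logger = logging.getLogger(__name__)


def safe_decode(decode_fn, token_ids):
    """Decode token ids, logging instead of crashing on invalid ids.

    Older tokenizers versions raised OverflowError when handed a
    negative id; tokenizers 0.22.1 raises TypeError instead, so the
    handler must catch both.
    """
    try:
        return decode_fn(token_ids)
    except (OverflowError, TypeError) as e:
        # Log and return an empty string rather than failing the request.
        logger.warning("Invalid token ids %s in detokenizer: %s",
                       token_ids, e)
        return ""
```

Catching only OverflowError, as the original workaround in #24351 did, would let the new TypeError escape and crash the request, which is exactly the failure mode this PR addresses.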


Signed-off-by: Nick Hill <nhill@redhat.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

I've reviewed your pull request. The change to catch TypeError is correct based on the updated behavior of the tokenizers library. I've found one high-severity issue related to this change that could cause problems in the exception handling logic. Please see my detailed comment below.

Signed-off-by: Nick Hill <nhill@redhat.com>
@njhill njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 8, 2025
Member

@yewentao256 yewentao256 left a comment


LGTM, thanks for the work!

@yewentao256 yewentao256 enabled auto-merge (squash) October 8, 2025 23:59
@vllm-bot vllm-bot merged commit bb6d8c2 into vllm-project:main Oct 9, 2025
46 of 48 checks passed
@njhill njhill deleted the negative-tok-id branch October 9, 2025 04:26
845473182 pushed a commit to dsxsteven/vllm_splitPR that referenced this pull request Oct 10, 2025
…to loader

* 'loader' of https://github.com/dsxsteven/vllm_splitPR: (778 commits)
  [torchao] Add support for ModuleFqnToConfig using regex (vllm-project#26001)
  Add: Support for multiple hidden layers in Eagle3 (vllm-project#26164)
  Enable `RMSNorm` substitution for Transformers backend (vllm-project#26353)
  [Model] Gemma3: Fix GGUF loading and quantization (vllm-project#26189)
  Bump Flashinfer to v0.4.0 (vllm-project#26326)
  Update Dockerfile and install runai-model-streamer[gcs] package (vllm-project#26464)
  [Core] Relax the LoRA  max rank (vllm-project#26461)
  [CI/Build] Fix model nightly tests (vllm-project#26466)
  [Hybrid]: Decouple Kernel Block Size from KV Page Size (vllm-project#24486)
  [Core][KVConnector] Propagate all tokens on resumed preemptions (vllm-project#24926)
  [MM][Doc] Add documentation for configurable mm profiling (vllm-project#26200)
  [Hardware][AMD] Enable FlexAttention backend on ROCm (vllm-project#26439)
  [Bugfix] Incorrect another MM data format in vllm bench throughput (vllm-project#26462)
  [Bugfix] Catch and log invalid token ids in detokenizer #2 (vllm-project#26445)
  [Minor] Change warning->warning_once in preprocess (vllm-project#26455)
  [Bugfix] Set the minimum python version for gpt-oss (vllm-project#26392)
  [Misc] Redact ray runtime env before logging (vllm-project#26302)
  Separate MLAAttention class from Attention (vllm-project#25103)
  [Attention] Register FLASHMLA_SPARSE (vllm-project#26441)
  [Kernels] Modular kernel refactor (vllm-project#24812)
  ...
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
vllm-project#26445)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
vllm-project#26445)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
vllm-project#26445)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
ppetrovicTT pushed a commit to tenstorrent/vllm that referenced this pull request Oct 27, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
@wuxianyess

This issue still exists. Was this modification not merged into version 0.11.0?

TypeError: argument 'id': StreamInput must be either an integer or a list of integers

@icecream0215

> This issue still exists. Was this modification merged into version 0.11.0?
>
> TypeError: argument 'id': StreamInput must be either an integer or a list of integers
I've encountered a similar error in my service. Could you share your request command so I can confirm whether we're facing the same issue?

devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

Labels

bug Something isn't working ready ONLY add when PR is ready to merge/full CI is needed v1


5 participants