[Model Runner V2] Fix v2 AttributeError: 'CohereASRDecoder' object has no attribute 'embed_input_ids'#44568

Merged
WoosukKwon merged 3 commits into main from wentao-fix-v2-CohereASRDecoder on Jun 10, 2026

Conversation

@yewentao256

Member

Purpose

Fixes #44443

VLLM_USE_V2_MODEL_RUNNER=1 pytest tests/models/test_initialization.py::test_can_initialize_large_subset[CohereAsrForConditionalGeneration]

Before this change, the command above failed with:

(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/executor/uniproc_executor.py", line 92, in collective_rpc
(EngineCore pid=1833568)     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore pid=1833568)              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/serial_utils.py", line 510, in run_method
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu_worker.py", line 400, in determine_available_memory
(EngineCore pid=1833568)     self.model_runner.profile_run()
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_runner.py", line 629, in profile_run
(EngineCore pid=1833568)     hidden_states, sample_hidden_states = self._dummy_run(
(EngineCore pid=1833568)                                           ^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/eplb_utils.py", line 37, in wrapper
(EngineCore pid=1833568)     result = fn(self, *args, **kwargs)
(EngineCore pid=1833568)              ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_runner.py", line 539, in _dummy_run
(EngineCore pid=1833568)     self.execute_model(
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_runner.py", line 1210, in execute_model
(EngineCore pid=1833568)     inputs_embeds = self.model_state.get_mm_embeddings(
(EngineCore pid=1833568)                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/model_states/default.py", line 126, in get_mm_embeddings
(EngineCore pid=1833568)     inputs_embeds = self.encoder_runner.get_inputs_embeds(
(EngineCore pid=1833568)                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(EngineCore pid=1833568)     return func(*args, **kwargs)
(EngineCore pid=1833568)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu/mm/encoder_runner.py", line 143, in get_inputs_embeds
(EngineCore pid=1833568)     x = self.model.embed_input_ids(
(EngineCore pid=1833568)         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/vllm-source/vllm/model_executor/models/interfaces.py", line 398, in embed_input_ids
(EngineCore pid=1833568)     self.get_language_model().embed_input_ids,
(EngineCore pid=1833568)     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=1833568)   File "/home/yewentao256/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1968, in __getattr__
(EngineCore pid=1833568)     raise AttributeError(
(EngineCore pid=1833568) AttributeError: 'CohereASRDecoder' object has no attribute 'embed_input_ids'
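
The failure pattern in the traceback above can be sketched in plain Python, without torch or vLLM. This is only an illustration under stated assumptions: `ToyDecoder`, `ToyWrapper`, and `try_embed` are hypothetical names, and the wrapper is a simplified stand-in for the delegation seen in the `interfaces.py` frame, not the actual vLLM code.

```python
class ToyDecoder:
    """Hypothetical decoder that does not define embed_input_ids."""

    def __getattr__(self, name):
        # Mimics the behaviour of torch.nn.Module.__getattr__ for unknown
        # attributes: raise AttributeError naming the class and attribute.
        raise AttributeError(
            f"'{type(self).__name__}' object has no attribute '{name}'"
        )


class ToyWrapper:
    """Hypothetical outer model that forwards embedding to its inner module."""

    def __init__(self, language_model):
        self._language_model = language_model

    def get_language_model(self):
        return self._language_model

    def embed_input_ids(self, input_ids):
        # The wrapper assumes the inner module exposes embed_input_ids.
        # If it does not, the attribute lookup itself raises.
        return self.get_language_model().embed_input_ids(input_ids)


def try_embed(model, input_ids):
    """Return the embeddings, or the error message if the lookup fails."""
    try:
        return model.embed_input_ids(input_ids)
    except AttributeError as exc:
        return str(exc)


message = try_embed(ToyWrapper(ToyDecoder()), [1, 2, 3])
print(message)
```

The error surfaces only when the generic embedding path is taken, which is consistent with the traceback reaching `embed_input_ids` during the profile run.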

With this change, the test passes:

====================== 1 passed, 17 warnings in 72.84s (0:01:12) ======================

Signed-off-by: yewentao256 <zhyanwentao@126.com>

@yewentao256 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on Jun 4, 2026
@mergify mergify Bot added the v1 label Jun 4, 2026

@njhill njhill left a comment

Member

Thanks @yewentao256

Comment on lines +17 to +18
"WhisperForConditionalGeneration" in vllm_config.model_config.architectures
or "CohereAsrForConditionalGeneration" in vllm_config.model_config.architectures

Member

Are there any other models in this category that we should add?

Do we have any other way to determine this from the model and/or config?

@yewentao256 yewentao256 Jun 4, 2026

Member Author

It seems not. I took a look at vllm/model_executor/models/funasr.py, vllm/model_executor/models/fireredasr2.py, and vllm/model_executor/models/fireredlid.py; they all have embed_input_ids(), and I didn't find any other models missing it.

If we checked supports_transcription_only instead, it might affect other models, so I think the current fix is minimal and precise.
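
For readers following the thread, the two options discussed can be contrasted in a small sketch. It is not the merged code. `is_listed_arch` and `language_model_can_embed` are hypothetical helpers, the config objects are plain `SimpleNamespace` stand-ins, and only the two architecture names are taken from the reviewed lines. The second helper is one possible way to probe the model directly, shown purely as an assumption about what such a check might look like.

```python
from types import SimpleNamespace

# The two architecture names quoted in the reviewed diff lines.
# The constant name itself is hypothetical.
LISTED_ARCHS = frozenset(
    {
        "WhisperForConditionalGeneration",
        "CohereAsrForConditionalGeneration",
    }
)


def is_listed_arch(vllm_config) -> bool:
    """Name-based check: true if any configured architecture is listed."""
    return any(
        arch in LISTED_ARCHS
        for arch in vllm_config.model_config.architectures
    )


def language_model_can_embed(model) -> bool:
    """Capability probe (an assumed alternative, not the merged approach):
    true if the inner language model exposes a callable embed_input_ids."""
    get_lm = getattr(model, "get_language_model", None)
    if not callable(get_lm):
        return False
    return callable(getattr(get_lm(), "embed_input_ids", None))


# Stand-in config object with the same attribute path as the diff lines.
cohere_cfg = SimpleNamespace(
    model_config=SimpleNamespace(
        architectures=["CohereAsrForConditionalGeneration"]
    )
)
print(is_listed_arch(cohere_cfg))
```

The name-based check touches only the listed models, which matches the "minimal and precise" reasoning above. A probe would generalize to unlisted models, at the cost of changing behaviour for any model it happens to match.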

@njhill njhill enabled auto-merge (squash) June 9, 2026 17:03
@WoosukKwon WoosukKwon disabled auto-merge June 10, 2026 18:05
@WoosukKwon WoosukKwon merged commit ffce72c into main Jun 10, 2026
75 of 78 checks passed
@WoosukKwon WoosukKwon deleted the wentao-fix-v2-CohereASRDecoder branch June 10, 2026 18:06
wcynb1023 pushed a commit to wcynb1023/vllm that referenced this pull request Jun 11, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Saddss pushed a commit to Saddss/vllm that referenced this pull request Jun 14, 2026
…as no attribute 'embed_input_ids'` (vllm-project#44568)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
