[BUGFIX] Replace assert with ValueError for response_format validation in chat completions endpoint#35443
Conversation
Signed-off-by: Sergey Antonov <antonovsergey93@gmail.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small subset of tests runs automatically. You can ask your reviewers to trigger select CI tests on top of those. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
This pull request addresses a bug where an `assert` was used to validate the `response_format` for `json_schema`, which could lead to a 500 error. The change correctly replaces it with a `ValueError`, ensuring a proper 400 Bad Request is returned for invalid requests. The added test case verifies this behavior. I've added one suggestion to further improve the validation logic for `json_schema` to provide more specific error messages to the user.
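The pattern under review can be sketched as follows. This is a simplified illustration of the assert-to-`ValueError` change, not vLLM's actual code; the function name and payload shape are assumptions.

```python
# Illustrative sketch: replace assert-based validation with ValueError
# so the endpoint's error handler can return 400 instead of 500.
# (Hypothetical helper; not vLLM's real implementation.)

def validate_response_format(response_format: dict) -> None:
    """Validate an OpenAI-style response_format payload."""
    if response_format.get("type") == "json_schema":
        # Before: `assert "json_schema" in response_format` would crash
        # the request handler with an AssertionError (surfacing as a
        # 500 Internal Server Error).
        # After: raise ValueError, which the endpoint's error path can
        # translate into a 400 Bad Request.
        if response_format.get("json_schema") is None:
            raise ValueError(
                "response_format of type 'json_schema' requires a "
                "'json_schema' field")
```

An `AssertionError` is also stripped entirely under `python -O`, which is another reason explicit exceptions are preferred for request validation.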
Signed-off-by: Sergey Antonov <antonovsergey93@gmail.com>
…letions endpoint When the completions endpoint receives a request with response_format type 'json_schema' but without the required json_schema field, the server crashes with an AssertionError resulting in a 500 Internal Server Error. This is the same issue fixed for chat completions in vllm-project#35443, but for the /v1/completions endpoint. Replace assert statements with explicit ValueError raises so that the error is caught by create_error_response and returned as a proper 400 Bad Request. Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
…ion (vllm-project#34687) Signed-off-by: Andrii <askliar@nvidia.com> Co-authored-by: Andrii <askliar@nvidia.com>
Signed-off-by: Roi Koren <roik@nvidia.com>
vllm-project#35184) Signed-off-by: Daniel Salib <danielsalib@meta.com>
Signed-off-by: angelayi <yiangela7@gmail.com>
…3012) Signed-off-by: Chenyaaang <chenyangli@google.com>
…m-project#35400) Signed-off-by: NickLucche <nlucches@redhat.com>
…t#35413) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: gnovack <gnovack@amazon.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
…lm-project#35424) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
…5369) Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
…n in completions endpoint (vllm-project#35456) Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
Signed-off-by: Max Hu <maxhu@nvidia.com> Signed-off-by: Max Hu <hyoung2991@gmail.com> Co-authored-by: Max Hu <maxhu@nvidia.com> Co-authored-by: Shang Wang <shangw@nvidia.com>
…parallelism (vllm-project#35410) Signed-off-by: jasonlizhengjian <jasonlizhengjian@gmail.com>
Signed-off-by: Sergey Antonov <antonovsergey93@gmail.com>
Documentation preview: https://vllm--35443.org.readthedocs.build/en/35443/
Closing PR in favor of #35514
Purpose
When the /v1/chat/completions endpoint receives a request with `response_format` type `json_schema` but without the required `json_schema` field, the server crashes with an AssertionError, resulting in a 500 Internal Server Error.

Fixes #35438

This is the same class of issue addressed in #35456 for the /v1/completions endpoint.

Test Plan
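A minimal sketch of the behavior the fix should produce, without a running vLLM server: `handle_request` is a hypothetical stand-in for the endpoint's error path, where `ValueError` is caught (as by `create_error_response`) and mapped to 400 rather than escaping as a 500.

```python
# Hedged sketch (assumed names; not vLLM's real handler): a missing
# json_schema field should yield 400 Bad Request, not a crash.

def handle_request(body: dict) -> int:
    """Return an HTTP status code, mimicking the endpoint's error path."""
    try:
        rf = body.get("response_format", {})
        if rf.get("type") == "json_schema" and rf.get("json_schema") is None:
            raise ValueError("response_format of type 'json_schema' "
                             "requires a 'json_schema' field")
        return 200
    except ValueError:
        # create_error_response translates ValueError into 400 Bad Request.
        return 400

print(handle_request({"response_format": {"type": "json_schema"}}))  # 400
```

With the old `assert`, the same request would raise `AssertionError`, which is not caught here and would surface as a 500.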
Test Result
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.