Change warning logs to debug for unimplemented MXFP4 Linear/Attention #29441

Merged
yewentao256 merged 4 commits into main from mxfp4-warn-debug on Nov 25, 2025

@mgoin mgoin commented Nov 25, 2025

Purpose

These warnings are too loud and confusing for users, since gpt-oss doesn't have these layers quantized:

(Worker_TP0 pid=3982331) INFO 11-25 20:04:34 [gpu_model_runner.py:3377] Starting to load model openai/gpt-oss-120b...
(Worker_TP1 pid=3982332) WARNING 11-25 20:04:34 [mxfp4.py:199] MXFP4 linear layer is not implemented - falling back to UnquantizedLinearMethod.
(Worker_TP1 pid=3982332) WARNING 11-25 20:04:34 [mxfp4.py:211] MXFP4 attention layer is not implemented. Skipping quantization for this layer.
(Worker_TP0 pid=3982331) WARNING 11-25 20:04:34 [mxfp4.py:199] MXFP4 linear layer is not implemented - falling back to UnquantizedLinearMethod.
(Worker_TP0 pid=3982331) WARNING 11-25 20:04:34 [mxfp4.py:211] MXFP4 attention layer is not implemented. Skipping quantization for this layer.

Not sure why they were added in #27334 as unfinished
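
For illustration only, the change can be sketched with the standard `logging` module. The logger name `mxfp4_sketch`, the function `get_quant_method`, and its string return values are hypothetical stand-ins, not the actual code in `mxfp4.py`; only the two log messages are taken from the output above.

```python
import logging

# Hypothetical logger name; the real module uses its own logger.
logger = logging.getLogger("mxfp4_sketch")


def get_quant_method(layer_kind: str) -> str:
    """Hypothetical stand-in for the MXFP4 quant-method lookup."""
    if layer_kind == "linear":
        # Demoted from warning to debug so default startup logs stay quiet.
        logger.debug(
            "MXFP4 linear layer is not implemented - "
            "falling back to UnquantizedLinearMethod."
        )
        return "unquantized"
    if layer_kind == "attention":
        # Same demotion for the attention fallback message.
        logger.debug(
            "MXFP4 attention layer is not implemented. "
            "Skipping quantization for this layer."
        )
        return "skip"
    # Any other layer kind is assumed to have an MXFP4 implementation.
    return "mxfp4"
```

With the logger threshold at INFO or higher, neither message is emitted; lowering the threshold to DEBUG brings both back.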

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Michael Goin <mgoin64@gmail.com>

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to reduce log noise by changing warning messages about unimplemented MXFP4 layers to debug messages. While reducing log spam is a good goal, changing these specific warnings to debug messages could hide critical information from users. If a user expects a layer to be quantized and it silently falls back to an unquantized version, it could lead to unexpected behavior and difficult-to-debug performance or memory issues. I've suggested changing the log level to info instead of debug. This would make the messages less alarming than warnings but still ensure they are visible to users by default, providing a good balance between reducing noise and keeping users informed.
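
The trade-off between `info` and `debug` comes down to which levels pass the configured threshold. A minimal sketch using only the standard `logging` module follows; the helper `visible_at` and the logger name `level_sketch` are hypothetical, and it assumes the deployment's default threshold is INFO, as the review comment implies.

```python
import logging


def visible_at(threshold: int) -> list[str]:
    """Return the names of the candidate levels that pass the threshold."""
    log = logging.getLogger("level_sketch")  # hypothetical logger name
    log.setLevel(threshold)
    candidates = [logging.DEBUG, logging.INFO, logging.WARNING]
    return [
        logging.getLevelName(level)
        for level in candidates
        if log.isEnabledFor(level)
    ]


print(visible_at(logging.INFO))   # ['INFO', 'WARNING']
print(visible_at(logging.DEBUG))  # ['DEBUG', 'INFO', 'WARNING']
```

Under an INFO threshold, a message logged at `debug` is hidden while one logged at `info` still appears, which is the difference the suggestion turns on.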

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@yewentao256 yewentao256 left a comment

Nice find, a small comment

mgoin and others added 2 commits November 25, 2025 15:25
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
@mgoin mgoin added the ready (ONLY add when PR is ready to merge/full CI is needed) and startup-ux labels Nov 25, 2025

@yewentao256 yewentao256 left a comment

LGTM, thanks for the work!

Signed-off-by: Michael Goin <mgoin64@gmail.com>
@yewentao256 yewentao256 enabled auto-merge (squash) November 25, 2025 20:46
@yewentao256 yewentao256 merged commit 7df0289 into main Nov 25, 2025
56 checks passed
@yewentao256 yewentao256 deleted the mxfp4-warn-debug branch November 25, 2025 22:52
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025
…vllm-project#29441)

Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
kitaekatt pushed a commit to kitaekatt/vllm that referenced this pull request Dec 1, 2025
…vllm-project#29441)

Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
dsuhinin pushed a commit to dsuhinin/vllm that referenced this pull request Jan 21, 2026
…vllm-project#29441)

Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), startup-ux
