Change warning logs to debug for unimplemented MXFP4 Linear/Attention #29441
yewentao256 merged 4 commits into main
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Code Review
This pull request aims to reduce log noise by changing warning messages about unimplemented MXFP4 layers to debug messages. While reducing log spam is a good goal, changing these specific warnings to debug messages could hide critical information from users. If a user expects a layer to be quantized and it silently falls back to an unquantized version, it could lead to unexpected behavior and difficult-to-debug performance or memory issues. I've suggested changing the log level to info instead of debug. This would make the messages less alarming than warnings but still ensure they are visible to users by default, providing a good balance between reducing noise and keeping users informed.
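The visibility trade-off the review describes can be illustrated with Python's standard `logging` module. This is a minimal sketch, not the code changed in this PR: the helper names (`build_logger`, `report_fallback`, `visible_output`), the logger names, and the message text are all hypothetical, and it assumes a deployment whose configured level is `INFO`.

```python
import io
import logging


def build_logger(level):
    """Return a (logger, buffer) pair; the logger writes to the in-memory buffer."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    # Hypothetical logger name, one per configured level.
    logger = logging.getLogger("mxfp4_demo_%d" % level)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # keep output out of the root logger
    return logger, buffer


def report_fallback(logger, layer_kind, message_level):
    """Emit a hypothetical 'layer not quantized' notice at the given level."""
    logger.log(
        message_level,
        "MXFP4 %s is not implemented; using the unquantized layer",
        layer_kind,
    )


def visible_output(configured_level, message_level):
    """Return what a user would actually see for this level combination."""
    logger, buffer = build_logger(configured_level)
    report_fallback(logger, "attention", message_level)
    return buffer.getvalue()


if __name__ == "__main__":
    # With the logger configured at INFO, a DEBUG notice is dropped,
    # while INFO and WARNING notices are shown.
    for message_level in (logging.DEBUG, logging.INFO, logging.WARNING):
        shown = visible_output(logging.INFO, message_level)
        print(logging.getLevelName(message_level), "->", repr(shown))
```

Under these assumptions, a notice emitted at `debug` only appears once the user lowers the configured level to `DEBUG`, which is the behavior the review is concerned about and the PR author considers acceptable for gpt-oss.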
yewentao256 left a comment
Nice find, a small comment
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
yewentao256 left a comment
LGTM, thanks for the work!
…vllm-project#29441) Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
…vllm-project#29441) Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Purpose
These warnings are too loud and confusing for users, since gpt-oss doesn't have these layers quantized.
Not sure why they were added, unfinished, in #27334.
Test Plan
Test Result