[diffusion] model: Properly validate device for Mistral 3 attention #22690

Merged
HaiShaw merged 3 commits into sgl-project:main from avjves:fix/mistral_cudnn_sdpa
Apr 16, 2026
@avjves
Contributor

@avjves avjves commented Apr 13, 2026

Motivation

PR #22423 made Mistral 3 (used in Flux2) default to cuDNN attention whenever the device type is cuda. AMD hardware also reports its device type as cuda but does not support cuDNN attention, so that change broke Flux2 on AMD.

Modifications

  • Changes the Mistral 3 check so that the decision to use cuDNN attention also depends on the currently detected platform, not just on the device of the tensor.

Accuracy Tests

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

@github-actions github-actions Bot added the diffusion SGLang Diffusion label Apr 13, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the Mistral 3 encoder to utilize current_platform.is_cuda() for hardware detection. A review comment points out that removing the explicit tensor device type check could incorrectly trigger the CUDA backend for CPU-resident tensors, suggesting a combined check instead.
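The combined check suggested in the review can be sketched in plain Python as follows. This is only an illustration of the decision logic, not the code from mistral_3.py: the `Platform` class and the `should_use_cudnn_attention` function are hypothetical stand-ins, and only the `is_cuda()` method name mirrors the `current_platform.is_cuda()` call mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hypothetical stand-in for the runtime's detected platform."""

    name: str  # assumed values for this sketch: "cuda" (NVIDIA), "rocm" (AMD), "cpu"

    def is_cuda(self) -> bool:
        # True only for NVIDIA CUDA; ROCm is a different platform even though
        # its tensors report the device type "cuda".
        return self.name == "cuda"


def should_use_cudnn_attention(tensor_device_type: str, platform: Platform) -> bool:
    """Return True only if the tensor is on a GPU and the platform is NVIDIA CUDA."""
    # The tensor check keeps CPU-resident tensors off the cuDNN path;
    # the platform check keeps AMD hardware off it.
    return tensor_device_type == "cuda" and platform.is_cuda()


nvidia = Platform("cuda")
amd = Platform("rocm")

print(should_use_cudnn_attention("cuda", nvidia))  # GPU tensor on NVIDIA
print(should_use_cudnn_attention("cuda", amd))     # GPU tensor on AMD
print(should_use_cudnn_attention("cpu", nvidia))   # CPU tensor on an NVIDIA host
```

Under these assumptions only the first call selects cuDNN attention. Checking the tensor's device type alone would wrongly accept the AMD case, and checking the platform alone would wrongly accept the CPU-tensor case.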

Comment thread python/sglang/multimodal_gen/runtime/models/encoders/mistral_3.py Outdated
avjves and others added 2 commits April 13, 2026 15:08
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@yhyang201
Collaborator

/tag-and-rerun-ci

@yhyang201
Collaborator

/rerun-failed-ci

@yhyang201
Collaborator

@mickqian All CI (Nvidia + AMD) has passed and the PR is approved; it is ready to merge.

— SGLDHelper bot

@HaiShaw HaiShaw merged commit aaa6823 into sgl-project:main Apr 16, 2026
148 of 158 checks passed
@avjves avjves deleted the fix/mistral_cudnn_sdpa branch April 17, 2026 17:34
Labels

diffusion SGLang Diffusion run-ci

4 participants