Bump flashinfer version to 0.6.7 #38188
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
Code Review
This pull request updates the FlashInfer library version from 0.6.6 to 0.6.7 across Dockerfiles, version configuration, and Python requirements. A review comment suggests re-evaluating the version constraint for the transitive dependency nvidia-cudnn-frontend to ensure compatibility with FlashInfer 0.6.7 and prevent potential build or runtime issues.
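Because the bump touches several files at once, a quick consistency check can catch a pin that was missed. The following is a minimal sketch, not part of this PR: the file names, file contents, and the `FLASHINFER_VERSION` variable in the usage note are hypothetical, and the regular expression only covers simple `name==X.Y.Z` and `NAME=X.Y.Z` pins.

```python
import re

# Matches simple pins such as "flashinfer-python==0.6.7" or
# "FLASHINFER_VERSION=0.6.7" (optionally quoted or prefixed with "v").
PIN_RE = re.compile(
    r"flashinfer[\w-]*\s*={1,2}\s*[\"']?v?(\d+\.\d+\.\d+)", re.IGNORECASE
)


def flashinfer_pins(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each file name to the set of FlashInfer versions pinned in its text."""
    return {name: set(PIN_RE.findall(text)) for name, text in files.items()}


def is_consistent(files: dict[str, str], expected: str) -> bool:
    """True if every FlashInfer pin found in every file equals `expected`."""
    return all(versions <= {expected} for versions in flashinfer_pins(files).values())
```

For example, with hypothetical contents where a Dockerfile still carries the old version, `is_consistent({"requirements.txt": "flashinfer-python==0.6.7\n", "Dockerfile": "ARG FLASHINFER_VERSION=0.6.6\n"}, "0.6.7")` returns `False`, and `flashinfer_pins` shows which file holds the stale pin.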
flashinfer-python==0.6.7
flashinfer-cubin==0.6.7
This change updates flashinfer to 0.6.7, but does not update the version constraint for its transitive dependency nvidia-cudnn-frontend on line 16. The existing cap <1.19.0 was likely added for a previous version of flashinfer and may be incompatible with 0.6.7, potentially causing build failures or runtime errors. This constraint should be re-evaluated based on the requirements of flashinfer==0.6.7.
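To make the concern concrete, here is a minimal sketch of what an exclusive upper cap such as `<1.19.0` admits. It is a simplified comparison for plain numeric versions only (no pre-releases or local tags); a real check would more likely use a full PEP 440 implementation. The thread does not state which `nvidia-cudnn-frontend` versions flashinfer 0.6.7 actually requires, so the versions in the usage note are hypothetical.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a plain dotted numeric version such as '1.19.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def allowed_by_cap(version: str, cap: str) -> bool:
    """True if `version` satisfies the exclusive upper bound `<cap`."""
    v, c = parse_release(version), parse_release(cap)
    # Pad with zeros so that '1.19' and '1.19.0' compare as equal.
    width = max(len(v), len(c))
    v += (0,) * (width - len(v))
    c += (0,) * (width - len(c))
    return v < c
```

Under this sketch, `allowed_by_cap("1.18.0", "1.19.0")` is `True` and `allowed_by_cap("1.19.0", "1.19.0")` is `False`, so if the new flashinfer release needed a version at or above the cap, the two constraints could not be satisfied together.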
@yewentao256 Checking right now.
@yewentao256 It seems there are issues with the new version. I tried Nemotron locally on GB300 and it produces repetitive output:
Yeah, please take a further look. We need to solve them before merging this PR.
@yewentao256 Yes, of course.
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
894a10e to 456be52
Signed-off-by: wzhao18 <wzhao18.sz@gmail.com>
@yewentao256 Fixed the LM eval CI failures; the problem was related to routing bias in the trtllm-gen MoE kernels. The remaining CI failures also fail on the main branch. buildkite/ci/pr/model-runner-v2-distributed-2-gpus seems unrelated/flaky.
yewentao256
left a comment
LGTM, thanks for the work!
Also CC @mgoin
@robertgshaw2-redhat Seems we need to add the casting for routing bias back.
The routing bias cast is added because the CI showed GSM8k accuracy collapse with
I'm planning to get this change in #38423
@mgoin Sounds good. Will close this one.
Purpose
Bump flashinfer version to 0.6.7
Test Plan
Test Result