fix: Add cutlass as an mm_fp4 backend in compute capability 12.0 in benchmark code #1959
Walkthrough
The pull request updates the backend support configuration for the mm_fp4 routine at compute capability 12.0, expanding the supported backends in the benchmarking utilities from cudnn only to both cudnn and cutlass.
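To illustrate the kind of change described above, here is a minimal sketch of a per-routine, per-compute-capability backend table and a filter over requested backends. The names `SUPPORTED_BACKENDS`, `filter_backends`, and `other_backend` are hypothetical and chosen for illustration only; they are not taken from the FlashInfer benchmark code, whose actual data structure may differ.

```python
# Hypothetical support table: routine -> compute capability -> backends.
# The layout and names are assumptions for illustration, not the real
# structure used by the FlashInfer benchmarking utilities.
SUPPORTED_BACKENDS = {
    "mm_fp4": {
        # Before this PR the entry would have been ["cudnn"] only.
        "12.0": ["cudnn", "cutlass"],
    },
}


def filter_backends(routine, compute_capability, requested):
    """Return the requested backends that the table marks as supported."""
    supported = SUPPORTED_BACKENDS.get(routine, {}).get(compute_capability, [])
    return [backend for backend in requested if backend in supported]


# "other_backend" is a placeholder name for a backend absent from the table.
print(filter_backends("mm_fp4", "12.0", ["cudnn", "cutlass", "other_backend"]))
# prints ['cudnn', 'cutlass']
```

With a table like this, enabling a backend that already has a working kernel is a one-line change to the list for the relevant compute capability, which matches the trivial review effort of this PR.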
📌 Description
Previously, `backend='cutlass'` could not be benchmarked in `flashinfer_benchmark.py` for compute capability 12.0, even though the kernel was already available. This PR marks the backend as available.
Example output showing the backend is runnable after this PR:
🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
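- Installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- Installed the hooks with `pre-commit install`.
- Ran the hooks with `pre-commit run --all-files` and fixed any reported issues.
🧪 Tests
- All tests are passing (`unittest`, etc.).
Reviewer Notes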