| Backend | Page Size > 1 | Spec Decoding | MLA | Sliding Window | Speed |
|---|---|---|---|---|---|
| FA3 | ✅ | ❌ | ✅ | ✅ | ⭐ ⭐ ⭐ ⭐ ⭐ |
| FlashInfer | ✅ | ✅ | ✅ | ✅ | ⭐ ⭐ ⭐ ⭐ ⭐ |
| FlashMLA | ✅ | ✅ | ✅ | ✅ | ⭐ ⭐ ⭐ ⭐ ⭐ |
| Triton | ✅ | ✅ | ✅ | ✅ | ⭐ ⭐ ⭐ |
| Torch Native | ✅ | ✅ | ✅ | ✅ | ⭐ |
I haven't verified these details. Please run benchmarks, read the code, and summarize the results.
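While the table is being verified, it may help to keep it as data so that checks against benchmark results are easy to script. Below is a minimal sketch that transcribes the table as written (still unverified) and filters backends by required feature. The names `SUPPORT`, `backends_supporting`, and the feature keys are hypothetical, chosen for illustration only; they are not part of any library API.

```python
# Support matrix transcribed from the table above. Values are UNVERIFIED,
# as noted in the post. "speed" is the star count from the Speed column.
# All names here (SUPPORT, feature keys) are hypothetical.
SUPPORT = {
    "FA3": {"page_size_gt_1": True, "spec_decoding": False,
            "mla": True, "sliding_window": True, "speed": 5},
    "FlashInfer": {"page_size_gt_1": True, "spec_decoding": True,
                   "mla": True, "sliding_window": True, "speed": 5},
    "FlashMLA": {"page_size_gt_1": True, "spec_decoding": True,
                 "mla": True, "sliding_window": True, "speed": 5},
    "Triton": {"page_size_gt_1": True, "spec_decoding": True,
               "mla": True, "sliding_window": True, "speed": 3},
    "Torch Native": {"page_size_gt_1": True, "spec_decoding": True,
                     "mla": True, "sliding_window": True, "speed": 1},
}


def backends_supporting(*features, matrix=SUPPORT):
    """Return the backends that support every requested feature.

    Results are ordered by star rating, highest first; ties keep the
    table's row order because Python's sort is stable.
    """
    candidates = [
        name for name, row in matrix.items()
        if all(row[feature] for feature in features)
    ]
    return sorted(candidates, key=lambda name: -matrix[name]["speed"])


if __name__ == "__main__":
    # Backends the table lists as supporting speculative decoding.
    print(backends_supporting("spec_decoding"))
```

Once benchmarks are in, the star ratings could be replaced with measured throughput and any corrected cells updated in one place.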