-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Closed
Labels
Description
Expected Behavior
Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers. It is based on the line of progress on structured state space models, with an efficient hardware-aware design and implementation in the spirit of FlashAttention.
Checklist
- I have completely filled out this template
- I have confirmed that this issue exists on the current
masterbranch - I have confirmed that this is not a duplicate issue by searching issues
- I have provided detailed steps to reproduce the issue