Skip to content

Library Request: mamba #7629

@AlexCatarino

Description

@AlexCatarino

Expected Behavior

Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers. It is based on the line of progress on structured state space models, with an efficient hardware-aware design and implementation in the spirit of FlashAttention.

Checklist

  • I have completely filled out this template
  • I have confirmed that this issue exists on the current master branch
  • I have confirmed that this is not a duplicate issue by searching issues
  • I have provided detailed steps to reproduce the issue

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions