This repository was archived by the owner on Mar 3, 2026. It is now read-only.
Open • Jun 7, 2025
Overdue by 1 year • Due by January 30, 2025
The goal of this milestone is to replace the hard-to-understand Llama reference implementation in https://github.com/pytorch-tpu/transformers/tree/flash_attention. That branch of the Hugging Face fork is not well suited for engineering work or for demonstrating to interested users.
85% complete