This repository was archived by the owner on Mar 3, 2026. It is now read-only.

Good quality Llama 3.1 8B and 70B in torch_xla_models

Open
Overdue by 1 year
Due by January 30, 2025
Last updated Jun 7, 2025

The goal of this milestone is to replace the hard-to-understand Llama reference implementation in https://github.com/pytorch-tpu/transformers/tree/flash_attention with a good-quality implementation in torch_xla_models. That branch of the Hugging Face fork is poorly suited both to further engineering work and to showing to interested users.

85% complete

There are no open issues in this milestone.