Brian Hie
1,778 posts
- We trained a genomic language model on all observed evolution, which we are calling Evo 2. The model achieves an unprecedented breadth in capabilities, enabling prediction and design tasks from molecular to genome scale and across all three domains of life.
- Welcome to the age of generative genome design! In 1977, Sanger et al. sequenced the first genome—of phage ΦX174. Today, led by @samuelhking, we report the first AI-generated genomes. Using ΦX174 as a template, we made novel, high-fitness phages with genome language models. 🧵
- In some new work, we describe how generative machine learning enables the modular design of complex proteins controlled by a high-level programming language for protein design 📄Link to paper: biorxiv.org/content/10.110… (1/N)
- Today in @ScienceMagazine, we report how a structure-informed language model can guide antibody evolution with unprecedented efficiency. Led by @varunrshanker, we coevolved mAbs to overcome viral escape, laying the groundwork for more evolutionarily resilient therapeutic design.
- Today, we report Germinal, a method for efficient de novo antibody design, with @santimillef and @SynBioGaoLab. Germinal achieves success rates of 4-22% across diverse epitopes. We make the work fully open, without doing lame things like posting a preprint without methods. 🧵
- In some new work (the first from the new lab!), we lay out a vision for a biological foundation model that unites DNA, RNA, and protein modalities and operates at molecular, systems, and genome levels of scale. Blog: arcinstitute.org/news/blog/evo Preprint: arcinstitute.org/manuscripts/Evo
- Delighted to share that I will be starting a research laboratory as a professor at @StanfordEng ChemE and @StanfordData, in collaboration with @arcinstitute. The lab will work on aligning biological AI with human good.
- In new work led by @aditimerch with @samuelhking, we prompt engineer Evo to perform function-guided protein design with high experimental success rates, including designs that go beyond natural sequences. We also release SynGenome, the first AI-generated genomics database. 🧵 1/N
- We are actively recruiting for two positions at the interface between biology and generative design. Backgrounds of particular interest are in protein biochemistry/evolution and synthetic genomics/biology. Please consider joining us! 1/n
- The evolutionary velocity paper ended on a cliffhanger: protein language models could predict evolution retrospectively, but could they also run evolution forward to prospectively design new proteins? So, I retrained as a protein biochemist to find out...
- The Evo 1 paper is now published!
  Quoted post: A new Science study presents “Evo”, a machine learning model capable of decoding and designing DNA, RNA, and protein sequences, from molecular to genome scale, with unparalleled accuracy. Evo’s ability to predict, generate, and engineer entire genomic sequences could change the…
- We may have been making a lot of progress in genome design, but we are not done with protein design just yet lol
- In some fun recent work with @KevinKaichuang and Peter Kim, we show that by using masked language models to predict local mutational effects, we can construct an evolutionary "vector field" -- kind of like RNA velocity, but for protein evolution! biorxiv.org/content/10.110…