Jonathan Frankle (@jefrankle) / X

Jonathan Frankle

3,982 posts

Chief AI Scientist @databricks via MosaicML. e/brick

Joined December 2013

Jonathan Frankle
@jefrankle
Mar 27, 2024
Meet DBRX, a new sota open llm from @databricks. It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and - as an MoE - inference is blazingly fast. Simply put, it's the model your data has been waiting for.
Jonathan Frankle
@jefrankle
May 7, 2020
I just open-sourced my codebase for research on neural network pruning, the Lottery Ticket Hypothesis, and other topics in deep learning. It's written in PyTorch and designed to make it easy to add new models, datasets, and experiments. Check it out:
GitHub - facebookresearch/open_lth: A repository in preparation for open-sourcing lottery ticket...
From github.com
Jonathan Frankle
@jefrankle
Mar 25, 2025
The hardest part about finetuning LLMs is that people generally don't have high-quality labeled data. Today, @databricks introduced TAO, a new finetuning method that only needs inputs, no labels necessary. Best of all, it actually beats supervised finetuning on labeled data.
Jonathan Frankle
@jefrankle
May 5, 2023
MPT is here! Check out our shiny new LLMs, open-source w/commercial license. The base MPT-7B model is 7B params trained on 1T tokens and reaches LLaMA-7B quality. We also created Instruct (commercial), Chat, and (my favorite) StoryWriter-65k+ variants. 🧵
Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs | Databricks Blog
From databricks.com
Jonathan Frankle
@jefrankle
Jun 22, 2023
MPT-30B is here! Same MPT architecture, 30B parameters, > 1T tokens, 8k context window, trained on H100s, great perf (esp on coding), single-GPU inference, commercially usable, and massively upgraded instruct and chat datasets. Take it for a spin! huggingface.co/spaces/mosaicm…
Jonathan Frankle
@jefrankle
Dec 10, 2022
I defended today, and @mcarbin was kind enough to pass me. My favorite part of the thesis is a ground-up rewrite of the original Lottery Ticket Hypothesis paper with fresh data and a narrative that benefits from four years of hindsight/maturity. Coming soon to an arxiv near you!
Jonathan Frankle
@jefrankle
Apr 20, 2023
72 hrs ago, @togethercompute released the RedPajama dataset. Like everyone, we at @MosaicML were very excited about the idea of a fully open-source Llama. So excited, in fact, that we've already trained a 1B model on 200B tokens! It's on HF (Apache2) here:
huggingface.co
mosaicml/mpt-1b-redpajama-200b · Hugging Face
Jonathan Frankle
@jefrankle
Jun 26, 2023
I'm absolutely thrilled that @MosaicML has agreed to join @databricks as we continue on our journey to make the latest advances in deep learning efficient and accessible for everyone. The best of MosaicML is yet to come 🎉🎉🎉
Ali Ghodsi
@alighodsi
Jun 26, 2023
Big news: we've agreed to acquire @MosaicML, a leading generative AI platform. I couldn’t be more excited to join forces once the deal closes. databricks.com/mosaic-news
Jonathan Frankle
@jefrankle
Sep 15, 2025
Five years ago, @NaveenGRao cold emailed me about starting a company. I knew nothing about startups, VC, products, or customers. My first PhD was in AI with @mcarbin. My second PhD was in startups with Naveen. I couldn't have asked for a better adviser on that journey.
Naveen Rao
@NaveenGRao
Sep 15, 2025
Today is my last day at @databricks. ~2.5 years ago @alighodsi told me his goal was to build a $100B company. Databricks was at a $38B valuation when MosaicML was acquired in July 2023 and just broke the $100B valuation number. It’s amazing to be part of this growth! And now AI
Jonathan Frankle
@jefrankle
Feb 11, 2023
For those interested, my dissertation is now available. The highlight is that I re-did the original Lottery Ticket Hypothesis paper from scratch (Chapter 3). It follows the same path as the original, but with years of context/maturity + a new experiment 🧵 jfrankle.com/jfrankle-disse…
Jonathan Frankle
@jefrankle
Oct 18, 2021
I guess the word is out! I'll be joining the @Harvard faculty in the fall of 2023 as part of an amazing cohort of new machine learning professors. Looking forward to sharing more about my lab, how to join, and everything we're building at @hseas when I'm a bit closer to arriving!
Boaz Barak
@boazbaraktcs
Oct 18, 2021
1/21 Banner year for Harvard CS! New hires include Sham Kakade @ShamKakade6 and Fernanda Viegas @viegasf (joining @wattenberg), as well as David Alvarez-Melis, Anurag Anshu @AnuragAnshu4, Sitan Chen, and Jonathan Frankle @jefrankle seas.harvard.edu/news/2021/10/s…
Jonathan Frankle
@jefrankle
Nov 23, 2020
Reviewer 3 has very strong opinions on BatchNorm.
Jonathan Frankle
@jefrankle
Mar 16, 2022
TLDR: Announcing 🌟COMPOSER🌟, a PyTorch trainer for efficient training *algorithmically*. Train 2x-4x faster on standard ML tasks, a taste of what's coming from @MosaicML. Star it, pip install mosaicml, contribute, be efficient! github.com/mosaicml/compo… Thread:
github.com
GitHub - mosaicml/composer: Supercharge Your Model Training
Jonathan Frankle
@jefrankle
Aug 30, 2024
Huge congrats to Dr. @sarahookr for defending her PhD! Sword will be in the mail shortly ⚔️