Pete Florence (@peteflorence) / X

Pete Florence

681 posts

@peteflorence

Co-Founder & CEO @GeneralistAI

San Francisco, CA

Joined May 2012

Pinned
Pete Florence
@peteflorence
Apr 7
Article
Going Beyond World Models & VLAs
In GEN-1, approximately 99% of the parameters are trained from scratch. Previously, this might be considered wild. For Generalist, it’s a deliberate choice. It follows our strong conviction — pursued...
Pete Florence
@peteflorence
Oct 13, 2022
"Interactive Language: Talking to Robots in Real Time" interactive-language.github.io - Real-time, interactive, open-vocabulary, language+pixels -> actions - A new scale (~600,000 traj.) for language-conditioned behavior - Dataset, sim, models, code all to be released! (1/n)...
Pete Florence
@peteflorence
Jan 17, 2019
Excited to share some work with colleagues last summer at Facebook Reality Labs that is now up on arXiv! “DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation” Here’s a snippet of some fun interpolations in the shape latent space.
Pete Florence
@peteflorence
Oct 28, 2021
It may be time to settle this. Poll in next tweet.
Pete Florence
@peteflorence
Jun 17, 2025
Last Spring I took off from Google DeepMind, and I've been heads-down building since with an amazing team. Excited to share more today -- introducing Generalist. It's felt to me for a couple years, since we started bringing multimodal LLMs into robotics, that a subset of the
Generalist
@GeneralistAI
Jun 17, 2025
Today we're excited to share a glimpse of what we're building at Generalist. As a first step towards our mission of making general-purpose robots a reality, we're pushing the frontiers of what end-to-end AI models can achieve in the real world. Here's a preview of our early
Pete Florence
@peteflorence
Mar 10, 2023
Today we share more on PaLM-E! (palm-e.github.io) Thread 🧵with blog post link at the end. PaLM-E can do a lot of things across robotics, vision, and language… but let’s look at a few capabilities in detail, step by step 😉 👇
Danny Driess
@DannyDriess
Mar 7, 2023
What happens when we train the largest vision-language model and add in robot experiences? The result is PaLM-E 🌴🤖, a 562-billion parameter, general-purpose, embodied visual-language generalist - across robotics, vision, and language. Website: palm-e.github.io
PaLM-E: An Embodied Multimodal Language Model
From palm-e.github.io
Pete Florence
@peteflorence
Jun 28, 2020
From Scott Kuindersma's @BostonDynamics talk on Friday -- Atlas jumping between boxes now with computer vision in the loop. From robotics today seminar -- see @RoboticsSeminar or roboticstoday.github.io for more.
Pete Florence
@peteflorence
Nov 8, 2021
Excited to share more about our "Implicit Behavioral Cloning" work! ✅*code* just released: github.com/google-researc… ✅*videos*: implicitbc.github.io Will be sharing more this week at #CoRL2021. I'll also maybe write a TL;DR thread soon, meanwhile, check out the website!
Pete Florence
@peteflorence
Apr 7, 2022
You may have seen some pretty powerful large "foundational" models this week (e.g., PaLM, DALL-E 2). With "Socratic Models" we look into combining such models... composing them zero-shot to do various new tasks, including across modalities. A couple more thoughts below 🧵
Andy Zeng
@andyzengineer
Apr 7, 2022
With multiple foundation models “talking to each other”, we can combine commonsense across domains, to do multimodal tasks like zero-shot video Q&A or image captioning, no finetuning needed. Socratic Models: website + code: socraticmodels.github.io paper: arxiv.org/abs/2204.00598
Pete Florence
@peteflorence
Aug 2, 2023
A comparison of the largest model sizes used for real-robot control:
Pete Florence
@peteflorence
Sep 15, 2020
Can robots model the world with keypoints, and learn how to see, predict, and control them into the future? "Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning" @lucas_manuelli, @YunzhuLiYZ, me, @rtedrake arxiv.org/pdf/2009.05085… (1/n)
Pete Florence
@peteflorence
Mar 10, 2022
TL;DR: “How can NeRF be useful for robotics?” One option: train precise correspondence models, made possible by generating training data from NeRF’s beautiful geometry. @yen_chen_lin did an amazing job leading this project.
Yen-Chen Lin
@yen_chen_lin
Mar 4, 2022
Hi everyone, I'm happy to share our new #ICRA2022 paper on 𝐦𝐚𝐤𝐢𝐧𝐠 𝐍𝐞𝐑𝐅 𝐮𝐬𝐞𝐟𝐮𝐥 𝐟𝐨𝐫 𝐫𝐨𝐛𝐨𝐭𝐬! NeRF-Supervision is a method that learns dense visual descriptors from NeRF for category-level robotic pick and place. yenchenlin.me/nerf-supervisi…
Pete Florence
@peteflorence
Mar 11, 2022
Very nice real-time reactive robot manipulation demo from @MarcToussaint17's group.
Marc Toussaint
@Marc__Toussaint
Mar 11, 2022
Finally a step from Logic-Geometric Programming to a reactive robotic manipulation framework: "Sequence-of-Constraints MPC: Reactive Timing-Optimal Control of Sequential Manipulation" Paper & Videos: user.tu-berlin.de/mtoussai/22-Se… Thanks to all collaborators! @DannyDriess
Pete Florence
@peteflorence
Oct 26, 2021
New xArm robot (the "Lite 6"), and they're selling some for $1,199. kickstarter.com/projects/ufact… I've really enjoyed using the bigger xArm 6 for robot research. They're simple but pretty high quality for the price point. Exciting to see prices jump even lower.