Personal update: I’m excited to be joining @OpenAI!
It’s been a privilege to work on robotics at Google[x], Google Brain, and @GoogleDeepMind over the past 7 years.
Excited for a new adventure!
Sean Kirmani
Research @OpenAI. Interested in intelligence, understanding, and science. Prev: @GoogleDeepMind, @TheTeamAtX
- We're hiring at Everyday Robots (everydayrobots.com)! My team is looking for 2023 interns to work on various computer vision and language problems. Come join us to make robotics seem a little less "robotic". :) everydayrobots.com/join-us/job?id…
- 🤖🌎 We are organizing a workshop on Robotics World Modeling at @corl_conf 2025! We have an excellent group of speakers and panelists, and are inviting you to submit your papers with a July 13 deadline. Website: robot-world-modeling.github.io
- 🌎🌏🌍 We are organizing a workshop on Building Physically Plausible World Models at @icmlconf 2025! We have a great lineup of speakers, and are inviting you to submit your papers with a May 10 deadline. Website: physical-world-modeling.github.io
- Introducing Gemini Robotics! We show that we can improve Gemini to be better at embodied reasoning tasks and also show it can be a highly dexterous policy.
- Introducing Gen2Act! 🤖🎥 Off-the-shelf video generation models can provide zero-shot human demonstrations to control a robot. The visual world model can show how a human might do many different tasks, and we created a policy that can follow these generated video plans. 🧵👇 Gen2Act: Casting language-conditioned manipulation as *human video generation* followed by *closed-loop policy execution conditioned on the generated video* enables solving diverse real-world tasks unseen in the robot dataset! homangab.github.io/gen2act/ 1/n
- This is a very nice article by @brondmo about our work at Everyday Robots. My time there was one of the most formative parts of my career. My major takeaway is that robots will be “boring” soon. The recent energy in Silicon Valley makes me optimistic.
- Some exciting work from our group @DeepMind! With an LLM as a reward generator and black-box optimization in MuJoCo, we can get robots to do some pretty interesting things! 🤖 Excited to share our project where we propose to use rewards represented in code as a flexible interface between LLMs and an optimization-based motion controller. website: language-to-reward.github.io Want to learn more about how we make a robot dog do moonwalk MJ style?🕺🕺
- Try out some new preliminary 3D vision capabilities in Gemini 2.0 that we've been cooking up! Really proud to be sharing this! aistudio.google.com/starter-apps/s…
- I am at @icmlconf! My co-organizers and I will be hosting a workshop on Building Physically Plausible World Models! Please reach out if you are here and would like to chat. physical-world-modeling.github.io
- Congrats to the Genie team! Video models and world models have made so much progress over the past year. It's really astounding. Makes you wonder how this might change robotics too 🤔 What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
- Introducing Generative Value Learning (GVL)! By unshuffling video frames in-context, we show you can construct a value function to measure task progress. This is useful for filtered BC, success detection, and evaluating dataset quality. Excellent project led by @JasonMa2020 🧵 Excited to finally share Generative Value Learning (GVL), my @GoogleDeepMind project on extracting universal value functions from long-context VLMs via in-context learning! We discovered a simple method to generate zero-shot and few-shot values for 300+ robot tasks and 50+
- Very proud of this large-scale collaboration with the greater robotics research community! 🤖 🌎 Robot transformers improve with scaled robots, tasks, and environments. Most surprisingly, they yield impressive x-embodiment transfer. Check out the work at robotics-transformer-x.github.io. RT-X: generalist AI models lead to 50% improvement over RT-1 and 3x improvement over RT-2, our previous best models. 🔥🥳🧵 Project website: robotics-transformer-x.github.io
- Excited to be sharing Gemini Robotics On-Device! We are releasing an SDK for trusted testers to use and fine-tune the model. We also have a simulator for people to play around with the ALOHA tabletop environment. github.com/google-deepmin… github.com/google-deepmin… We’re bringing powerful AI directly onto robots with Gemini Robotics On-Device. 🤖 It’s our first vision-language-action model to help make robots faster, highly efficient, and adaptable to new tasks and environments - without needing a constant internet connection. 🧵