Excited to announce that with a few incredible folks, we're starting a company -- Physical Intelligence (Pi, π, @physical_int). We're focused on bringing the amazing recent breakthroughs of AI and foundation models into the physical world.
Brian Ichter
42 posts
San Francisco, CA
Joined November 2014
- How do you get zero-shot robot control from VLMs? Introducing Prompting with Iterative Visual Optimization, or PIVOT! It casts spatial reasoning tasks as VQA by visually annotating images, which VLMs can understand and answer. Project website: pivot-prompt.github.io
- Really excited about our new work for controlling robots with LLMs! "Language to Rewards" uses an LLM to generate rewards to optimize over with MuJoCo MPC. This flexible framework enables general behaviors, even moonwalking... See @xf1280's 🧵 and site language-to-reward.github.io.
🤖 Excited to share our project where we propose to use rewards represented in code as a flexible interface between LLMs and an optimization-based motion controller. Website: language-to-reward.github.io Want to learn more about how we make a robot dog do the moonwalk, MJ style? 🕺🕺
- Excited to show off Chain of Code! A "best of both worlds" between coding and language models for reasoning. Given a question, it writes pseudocode, then executes it with Python or emulates it with an LM (an LMulator 😜). BBH results pictured. Paper arxiv.org/pdf/2312.04474… and 🧵👇
We are excited to announce Chain of Code (CoC), a simple yet surprisingly effective method that improves Language Model code-driven reasoning. On BIG-Bench Hard, CoC achieves 84%, a gain of 12% over Chain of Thought. Website: chain-of-code.github.io Paper: arxiv.org/pdf/2312.04474…
- Excited to share PaLM-E 🌴🤖! An embodied visual-language model capable of visual, language, and robotics tasks. See a deeper thread from Danny 👇
What happens when we train the largest vision-language model and add in robot experiences? The result is PaLM-E 🌴🤖, a 562-billion parameter, general-purpose, embodied visual-language generalist - across robotics, vision, and language. Website: palm-e.github.io
- Our AURO special issue on LLMs in robotics is out now: link.springer.com/collections/be…! It features eight papers using LLMs for robotics -- applied to chemistry, robot memories, anomaly detection, policies, and task planning -- as well as works like ProgPrompt, Text2Motion, and TidyBot.
- Today we announced RT-2! A vision-language-action model that transfers internet-scale knowledge directly into robot movements. The key is that it’s all just tokens! This means it understands "pick up 🦕" or even, as below, “pick up the extinct animal”: See @hausman_k's
PaLM-E or GPT-4 can speak in many languages and understand images. What if they could speak robot actions? Introducing RT-2 (robotics-transformer2.github.io), our new model that uses a VLM (up to 55B params) backbone and fine-tunes it to directly output robot actions!
- We just announced RT-X, a project with a huge number of academic collaborators... and growing! (Reach out to [email protected] to get involved.) The data and code are also open-sourced, and the results so far are 🔥 with gains on 6 different robots. See 🧵👇
RT-X: generalist AI models lead to 50% improvement over RT-1 and 3x improvement over RT-2, our previous best models. 🔥🥳🧵 Project website: robotics-transformer-x.github.io
- Replying to @brian_ichter: and a bonus video we might have gotten a little carried away with 😂
- Replying to @brian_ichter: I’m also super excited about the HuggingFace 🤗 demo that allows you to upload your own image and questions and see how PIVOT works. huggingface.co/spaces/pivot-p…
- Replying to @brian_ichter: There will be a ton of challenges along the way, and if you're interested in solving them please get in touch at [email protected] or read more in @ashleevance's article bloomberg.com/news/articles/….
- Replying to @brian_ichter: We're developing robotic foundation models capable of doing any task, anywhere. Too many exciting developments to list, but first and foremost is the incredible team of @chelseabfinn, @hausman_k, @lachygroom, @svlevine, @QuanVng, @SurajNair_1, and a few more to be announced.
- Replying to @brian_ichter: Nicolas Heess, @Chelseabfinn, and @svlevine. Project website: pivot-prompt.github.io Paper: pivot-prompt.github.io/assets/pivot.p…
- Replying to @brian_ichter: Proud to work with a great team of @snasiriany, @xf1280, @Stacormed, @xiao_ted, @jackyliang42, Ishita Dasgupta, Annie Xie, @DannyDriess, @ayzwah, @drzhuoxu, @QuanVng, Tingnan Zhang, Tsang-Wei Edward Lee, @kuanghueilee, Peng Xu, @SeanKirmani, @yukez, @andyzeng_, @hausman_k,









