TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/
One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the
Russ Tedrake
29 posts
Professor at MIT, studying robotics. Vice President of Robotics Research, Toyota Research Institute.
Joined July 2022
- I'm super excited to start a great new collaboration with the fantastic team at Boston Dynamics. Scott Kuindersma and I chatted with Evan Ackerman about it earlier today.
- Very proud of Nicholas, who recently shared scalable-real2sim.github.io (for physics-quality assets from a small amount of interaction with a robot) and is now following up with his work on scene-level generation.
  "Want to scale robot data with simulation, but don’t know how to get large numbers of realistic, diverse, and task-relevant scenes? Our solution: ➊ Pretrain on broad procedural scene data ➋ Steer generation toward downstream objectives 🌐 steerable-scene-generation.github.io 🧵1/8"
  scalable-real2sim.github.io: Scalable Real2Sim: Physics-Aware Asset Generation Via Robotic Pick-and-Place Setups. A fully automated pipeline that generates simulation-ready assets for real-world objects, no manual intervention required!
- This work really sharpened my thinking about sim+real cotraining.
  "Learning from both sim+real data could scale robot imitation learning. But what are the scaling laws & principles of sim+real cotraining? We study this in the first focused analysis of sim+real cotraining spanning 250+ policies & 40k+ evals arxiv.org/abs/2503.22634 (1/6)"
- Replying to @RussTedrake: Probably my favorite plot from the paper, which sums it all up, is this one. The plot compares performance across different amounts of pretraining data used before training a new task: 0% (aka single task), 25, 50, or 100% of TRI’s data, then 100% of TRI’s data + all of the
- Replying to @RussTedrake: The short version is: LBMs work! We see consistent and statistically significant improvements as we increase the amount of pretraining data. But doing the science is still hard; as a field we have more work to do to improve the statistical power of our experiments.
- Replying to @RussTedrake: Side note: I'm proud of the title of this paper, which we intentionally made pretty narrow/specific. I think that some of the most important work that we have to do as a field right now is careful empirical work to interrogate the properties of these models that we're creating.
- Replying to @RussTedrake: This was a massive effort by the entire team, with a number of individuals really pouring their hearts into this paper. The paper is packed full of (too many?) details. Your comments and feedback would be very welcome.
- Replying to @RussTedrake: In my mind, it's a bit like a biology paper that is focused on a particular animal model. I hope we'll learn more quickly from each other if we can make precise, substantiated claims about particular setups, so that as a field we can assemble those claims into a coherent picture.
- Replying to @anwesha_ac: Yes. Of course the distribution and quality of the data matter.
- Replying to @RussTedrake: One of the most interesting take-aways for me is that "high-performing policies need to know whether they are executing in sim or in real." A number of implications flow from that, including that sim+real cotraining can decrease performance if the visual gap is too small.
- Probably too late, but here's a notebook showing how to visualize it with graphviz: deepnote.com/workspace/Mani…





