After spending billions of dollars of compute, GPT-5 learned that the most effective use of its token budget is to give itself a little pep talk every time it figures something out.
Maybe you should do the same.
Bob McGrew
587 posts
Learning new things.
Former Chief Research Officer at OpenAI, early exec at Palantir, early employee at PayPal.
Joined June 2018
- Today was my last day at OpenAI. I wish the team the very best!
- The important breakthrough in OpenAI’s Deep Research is that the model is trained to take actions as part of its CoT. The problem with agents has always been that they can’t take coherent action over long timespans. They get distracted and stop making progress. That’s now fixed.
- This evening I will become an officer in US Army Reserve as part of Detachment 201, the Army’s Executive Innovation Corps, alongside @boztank, @kevinweil, and @ssankar. I am proud to have the opportunity to serve.
- Many of OpenAI's greatest researchers did not have PhDs in AI. Building a path for brilliant technical people without experience to become researchers was critical for our success.
  "Member of the technical staff" is the hottest job title in SF right now. What's behind the name? @OpenAI chose this title deliberately to blow up the previous industry dichotomy between researchers and engineers. The best researchers in AI right now aren't academics in a pure
- That o1 is better than GPT-4.5 on most problems tells us that pre-training isn't the optimal place to spend compute in 2025. There's a lot of low-hanging fruit in reasoning still. But pre-training isn't dead, it's just waiting for reasoning to catch up to log-linear returns.
- Congrats to the research team on the o3 and o3-mini announcements! These are great models. And, yes... you've reached a new high for OpenAI's tradition of truly terrible naming. 😂
- OpenAI is nothing without its people
- Don't be disappointed that GPT-4.5 isn't smarter than o1. Scaling up pretraining improves responses across the board. Scaling up reasoning improves responses a lot if they benefit from thinking time and not much otherwise. Wait to find out how the improvements stack together.
- The defining question for AGI isn’t “How smart is it?” but “What fraction of economically valuable work can it do?” The spotlight for o3 is on tool use because intelligence is no longer the primary constraint. The new frontier is reliable interaction with the external world.
- Don't get too excited about videos of humanoid robots dancing. Manipulation is the hard problem we need to solve to make humanoid robots useful, not locomotion. One video of a humanoid slowly performing tasks with its hands is worth a dozen clips of choreographed routines.
- Terence Tao on what math looks like after o1...
- For the last two years, the leading lab (usually OpenAI) has released capabilities that were matched by other labs 9-12 months later, only to leapfrog them again soon after. With RL in 2025, this cycle will speed up so capabilities are matched and leapfrogged every 2-3 months.





