Sheryl Hsu (@SherylHsu02) / X

Sheryl Hsu

53 posts

@SherylHsu02

@openai | bs/ms @Stanford👩🏻‍💻

Joined August 2020

Sheryl Hsu
@SherylHsu02
Aug 11, 2025
1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻
2.5M
Sheryl Hsu
@SherylHsu02
Jul 19, 2025
Watching the model solve these IMO problems and achieve gold-level performance was magical. A few thoughts 🧵
Alexander Wei
@alexwei_
Jul 19, 2025
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
661K
Sheryl Hsu
@SherylHsu02
Jul 19, 2025
Replying to @SherylHsu02
It’s crazy how we’ve gone from 12% on AIME (GPT-4o) → IMO gold in ~15 months. We have come very far very quickly. I wouldn’t be surprised if by next year models are deriving new theorems and contributing to original math research!
231K
Sheryl Hsu
@SherylHsu02
Jul 19, 2025
Replying to @SherylHsu02
The model solves these problems without tools like Lean or coding; it uses only natural language and has just 4.5 hours. We see the model reason at a very high level - trying out different strategies, making observations from examples, and testing hypotheses.
73K
Sheryl Hsu
@SherylHsu02
Jul 19, 2025
Replying to @SherylHsu02
It’s been a blast working with everyone @OpenAI, esp @alexwei_ @polynoamial for this project. I joined 3 months ago and people are so smart and kind - although sometimes they threaten you with a sword: make the model produce correct solutions or suffer!
43K
Sheryl Hsu
@SherylHsu02
Aug 11, 2025
Replying to @SherylHsu02
2/n We officially competed in the online AI track of the IOI, where we scored higher than all but 5 (of 330) human participants and placed first among AI participants. We had the same 5 hour time limit and 50 submission limit as human participants. Like the human contestants, our
54K
Sheryl Hsu
@SherylHsu02
Jul 19, 2025
Replying to @SherylHsu02
I was particularly motivated to work on this project because this win came from general research advancements. Beyond just math, we will improve on other capabilities and make ChatGPT more useful over the coming months.
36K
Sheryl Hsu
@SherylHsu02
Aug 11, 2025
Replying to @SherylHsu02
4/n This result demonstrates a huge improvement over @OpenAI’s attempt at IOI last year where we finished just shy of a bronze medal with a significantly more handcrafted test-time strategy. We’ve gone from 49th percentile to 98th percentile at the IOI in just one year!
47K
Sheryl Hsu
@SherylHsu02
Aug 11, 2025
Replying to @SherylHsu02
6/n I’ve been lucky to work with many fantastic teammates here at @OpenAI, specifically with @alexwei_ @bminaiev @oleg_murk for prepping for IOI and building on top of the long term work on competitive programming by @_lorenzkuhn @MostafaRohani @clavera_i @andresnds @ahelkky
33K
Sheryl Hsu
@SherylHsu02
Aug 11, 2025
Replying to @SherylHsu02
3/n We competed with an ensemble of general-purpose reasoning models; we did not train any model specifically for the IOI. Our only scaffolding was in selecting which solutions to submit and connecting to the IOI API.
36K
Sheryl Hsu
@SherylHsu02
Aug 11, 2025
Replying to @SherylHsu02
5/n It’s been really exciting to see the progress of our newest research methods at OpenAI, with our successes at the AtCoder World Finals, IMO, and IOI over the last couple weeks. We’ve been working hard on building smarter, more capable models, and we’re working hard to get
32K
Sheryl Hsu
@SherylHsu02
Aug 11, 2025
Replying to @SherylHsu02
I, along with some teammates, was able to travel to Bolivia to attend the IOI in person. It was wonderful to meet all the participants and coaches there, and we wanted to say congrats once again!!
34K
Sheryl Hsu
@SherylHsu02
Jul 19, 2025
Replying to @polynoamial
lucky to be one of the agents on multi-agent, it's a blast!!
9.2K
Sheryl Hsu
@SherylHsu02
Oct 31, 2024
Feeling spooked👻🎃? Get grounded... introducing "Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval." Meet LeReT (Learning to Retrieve by Trying), an RL-based framework that improves LLMs’ ability to use retrieval tools by up to 29%. sherylhsu.com/LeReT/
70K