Tom Zahavy (@TZahavy) / X

Building creative agents @GoogleDeepMind. AlphaProof, AlphaZero_db, PuzzleGen, Convex RL, meta gradients. Staff research scientist, discovery team

London, England

Joined December 2018

Pinned
Tom Zahavy
@TZahavy
Oct 29, 2025
I am excited to share a work we did in the Discovery team at @GoogleDeepMind using RL and generative models to discover creative chess puzzles 🔊♟️♟️ #neurips2025 🎨While strong chess players intuitively recognize the beauty of a position, articulating the precise elements that
Tom Zahavy
@TZahavy
Apr 10, 2025
I am looking to hire a student researcher to work with AlphaProof on a project at the intersection of AI, math, computation, and creativity. Background in AI for math, and/or Lean is desired. If interested, please get in touch. The position will be based in London.
Tom Zahavy
@TZahavy
Aug 21, 2023
I'm super excited to share AlphaZeroᵈᵇ, a team of diverse #AlphaZero agents that collaborate to solve #Chess puzzles and demonstrate increased creativity. Check out our paper to learn more! arxiv.org/abs/2308.09175 A quick 🧵(1/n)
Tom Zahavy
@TZahavy
Nov 12, 2025
Excited to announce our recent @GoogleDeepMind paper, AlphaProof, out in @Nature today! It has been over a year since AlphaProof achieved silver-medal standard solving International Mathematical Olympiad (IMO) problems, by teaching itself mathematics in LEAN (@leanprover).
Tom Zahavy
@TZahavy
Apr 7, 2024
We are looking for brilliant and creative candidates with strong programming skills to join us at the Discovery team at @GoogleDeepMind 🧙 We build AI agents that discover new knowledge using RL, planning and LLMs. DM me if you have specific questions about working with us 🙏
Tom Zahavy
@TZahavy
Jul 29, 2024
We are looking for brilliant and creative candidates with strong programming skills to join us at the Discovery team at @GoogleDeepMind 🧙 We are building AI agents that create new knowledge using RL, planning and LLMs in domains like Mathematics, chess and more. Please apply
Tom Zahavy
@TZahavy
Dec 3, 2021
In our #Neurips2021 spotlight, we study RL problems where the goal is to minimize a cost over the state occupancy. When this cost is linear, we get the standard RL problem. When it is non-linear, we get apprenticeship learning, pure exploration, diversity and more. [1/7]
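A rough sketch of the framing in this post, in notation that is assumed here rather than taken from the post: write $d_\pi$ for the state occupancy of a policy $\pi$, $\mathcal{K}$ for the set of achievable occupancies, $r$ for a reward vector, and $d_E$ for an expert's occupancy. The example costs are common illustrative choices, not necessarily the exact ones used in the paper.

```latex
\min_{d_\pi \in \mathcal{K}} f(d_\pi), \qquad
f(d_\pi) =
\begin{cases}
-\sum_{s} r(s)\, d_\pi(s) & \text{(linear: standard RL)} \\
\lVert d_\pi - d_E \rVert_2^2 & \text{(non-linear: apprenticeship learning)} \\
\sum_{s} d_\pi(s) \log d_\pi(s) & \text{(non-linear: pure exploration, i.e. maximum entropy)}
\end{cases}
```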
Tom Zahavy
@TZahavy
May 27, 2022
Excited to share DOMiNO, a method to discover qualitative-diverse policies using a single latent-conditioned architecture and the "reward is enough" principle. Read more about it here: arxiv.org/pdf/2205.13521… DOMiNO's🍕 in Walker walk:
Tom Zahavy
@TZahavy
Apr 22, 2022
Super excited to share that our Bootstrapped Meta Learning paper led by @flennerhag received an Outstanding Paper Award from #iclr2022. Better meta learning -> doubled the performance of STACX in Atari to a new SOTA. Come talk with us at the poster session! blog.iclr.cc/2022/04/20/ann…
Sebastian Flennerhag
@flennerhag
Sep 13, 2021
What should a meta-learner optimize? What if we make it chase its own future outputs? Turns out, it can improve meta-optimization, set new SOTAs, and lead to new types of meta-learning. arxiv.org/pdf/2109.04504… w. Y. Schroecker, @tomzhavy, @hado, D. Silver, S. Singh. 🧵👇
Tom Zahavy
@TZahavy
Nov 13, 2025
We are hiring students in the discovery team. If you are interested in creativity and RL, consider applying ❤️
Alex Havrilla
@Dahoas1
Nov 13, 2025
📣 Hiring Alert: Student Researcher - 2026 @vivek_veeriah and I are looking for a PhD Student Researcher to join the GDM Discovery team in London 🇬🇧! We will be investigating how creativity in LLMs generalizes, with application to scientific discovery 🔭 Apply below! ⬇️
Tom Zahavy
@TZahavy
Jul 25, 2024
Very excited to share AlphaProof, an agent that taught itself Mathematics in Lean and achieved a silver-medal standard in the International Math Olympiad 🥈🥈🥈🥈 @leanprover is a functional programming language for formal Mathematics and a theorem prover. It enables you to
Google DeepMind
@GoogleDeepMind
Jul 25, 2024
We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level.🥈 It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵 dpmd.ai/imo-silver
Tom Zahavy
@TZahavy
May 8, 2021
A rejection story with a happy ending. A paper from my #Phd was accepted to #ICML2021 after 4-5 rejections (I lost count, honestly). Each time we had reviewers who liked it and some who didn't. Believing in it and continuing to improve it over time eventually got it in. Don't lose hope!
Tom Zahavy
@TZahavy
Oct 29, 2025
Replying to @TZahavy
Read more about it: ♟️ @chesscom blogpost: chess.com/news/view/ai-l… 💻Booklet & Review: arxiv.org/abs/2510.23772 📃Paper: arxiv.org/abs/2510.23881
DeepMind's AI Learns To Create Original Chess Puzzles, Praised By GMs
From chess.com
Tom Zahavy
@TZahavy
Sep 15, 2022
Late on arXiv (oral @CoLLAs_Conf): @jelennal_ who did a fantastic internship with us at the Discovery team @DeepMind studies how adding context to meta gradients can help agents to adapt when the environment changes. Thanks for sharing @_akhaliq
AK
@_akhaliq
Sep 14, 2022
Meta-Gradients in Non-Stationary Environments abs: arxiv.org/abs/2209.06159