Andrea Michi (@andreamichi) / X

437 posts

@andreamichi

CTO @depthfirstlabs - Autonomous security from design to production. Prev RL post-training Gemini @GoogleDeepMind

San Francisco

Joined March 2009

Andrea Michi
@andreamichi
Jul 6, 2025
This is the reason why I left DeepMind and decided to build an AI security company. I've seen first-hand what RL can do for code generation. Once you treat exploit generation as an RL problem, no software is safe.
Rohan Pandey
@khoomeik
Jun 25, 2025
the CIA is not ready for the RL era. israeli intelligence guy just hacked into a live surveillance camera in front of me with an exploit generated by qwen. vulnerable software is simulatable. penetration success is verifiable. hacking is RLable.
339K
Andrea Michi
@andreamichi
Jun 7, 2023
Today, I’m excited to introduce @DeepMind's AlphaDev, published in @Nature. It discovered faster sorting and hashing algorithms, which have been released in standard open source libraries. But how does AlphaDev work? 🧵 dpmd.ai/3MMdUmE
74K
Andrea Michi
@andreamichi
Jul 6, 2025
Replying to @andreamichi
We are building intelligence to detect and remediate all software vulnerabilities. Finding vulnerabilities before someone else does
9.2K
Andrea Michi
@andreamichi
Jul 6, 2025
Replying to @MilosOfCroton
Pretty much! Exploits can be executed and verified. This provides a valuable reward signal
7.1K
Andrea Michi
@andreamichi
Jul 6, 2025
Lovable works so well because it does less: a good wrapper around Supabase + React. Turns out, that’s enough to build ~90% of the apps people actually want
9.5K
Andrea Michi
@andreamichi
Jul 6, 2025
Yes! We're hiring strong security researchers who have experience in vulnerability discovery and are passionate about leveraging AI to push the boundaries. Do reach out if you're interested; my DMs are open.
yak
@_murphin
Jul 6, 2025
Replying to @andreamichi
you hiring sec researchers?
6.6K
Andrea Michi
@andreamichi
Jun 7, 2023
Replying to @andreamichi
This project has been one of the most fun and challenging since I started at @DeepMind. But of course, it would have never happened without the amazing group of people working on this 🙏
1.1K
Andrea Michi
@andreamichi
Jul 5, 2023
Yesterday I gave a talk about AlphaDev at Imperial College. Afterwards, two students came up to me asking great questions on how to combine LLMs and RL planning. I later discovered they were high school students. Is this the new normal?
1.3K
Andrea Michi
@andreamichi
Jun 7, 2023
Replying to @andreamichi
These subroutines have now been integrated into libc++, the LLVM C++ standard library, and are accessible to every developer and C++ application, making std::sort faster 🚀
1K
Andrea Michi
@andreamichi
Jun 7, 2023
Replying to @andreamichi
The idea is simple: we set up a game where each move is an assembly instruction and the player is rewarded based on the correctness of the final algorithm and its speed. We call this game Assembly Game and we let a Reinforcement Learning agent play it.
1.6K
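A minimal sketch of the game described above, under loud assumptions: the three-instruction set (`mov`, `cmp`, `cmovg`), the register layout, and the `reward` formula are hypothetical illustrations, not AlphaDev's actual setup. Program length stands in for speed, and correctness is checked on every permutation of a small input.

```python
from itertools import permutations

def run(program, inputs, n_regs=3):
    """Execute a toy assembly program; the first len(inputs) registers hold the data."""
    regs = list(inputs) + [0] * (n_regs - len(inputs))
    greater = False  # single flag set by the most recent cmp
    for op, a, b in program:
        if op == "mov":        # regs[a] = regs[b]
            regs[a] = regs[b]
        elif op == "cmp":      # remember whether regs[a] > regs[b]
            greater = regs[a] > regs[b]
        elif op == "cmovg":    # conditional move, taken only if the flag is set
            if greater:
                regs[a] = regs[b]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs[:len(inputs)]

def reward(program, n=2, length_penalty=0.01):
    """Hypothetical reward: fraction of inputs sorted correctly, minus a length penalty."""
    cases = list(permutations(range(1, n + 1)))
    correct = sum(run(program, case) == sorted(case) for case in cases)
    return correct / len(cases) - length_penalty * len(program)

# Each "move" in the game appends one instruction to the program so far.
sort2 = [
    ("mov", 2, 0),    # scratch = P
    ("cmp", 0, 1),    # flag = P > Q
    ("cmovg", 0, 1),  # if P > Q: P = Q      (P becomes the minimum)
    ("cmovg", 1, 2),  # if P > Q: Q = scratch (Q becomes the maximum)
]

if __name__ == "__main__":
    print("empty program reward:", reward([]))
    print("sort2 reward:", reward(sort2))
```

In this toy version the complete `sort2` program scores higher than the empty program, because being correct on every input outweighs the small per-instruction penalty. An RL agent would search over instruction sequences to maximise that score.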
Andrea Michi
@andreamichi
Jul 3, 2023
Last week, I was in Paris with the team, where we had a long discussion about RLAIF and self-improvement. This was before it became a hot topic on Twitter. Not surprisingly, after a few glasses of wine, the conversation quickly evolved into arguing about the meaning of "learning"
5.4K
Andrea Michi
@andreamichi
Jun 7, 2023
Replying to @andreamichi
We estimate that these fundamental algorithms are called trillions of times per day. They are the invisible backbone of today’s digital society. By making them more efficient, these improvements benefit billions of applications around the world.
1.2K
Andrea Michi
@andreamichi
Feb 18, 2024
Our paper on using RL for tokamak magnetic control was recently published in the journal Fusion Engineering and Design. While it is not about the latest LLMs, it offers quite a few lessons on how to make RL work in applied domains: sciencedirect.com/science/articl…
1.9K
Andrea Michi
@andreamichi
Sep 11, 2023
Looking forward to giving a talk tomorrow at the @CogX_Festival with @DJ_Mankowitz. We'll see how we can use AI (specifically RL) to optimise the entire computing stack. If you are attending CogX, join us at 14:45 BST at the 'Indigo at The O2 Arena' in London
683