About me

I am a research scientist at Google DeepMind focusing on AI alignment: ensuring that advanced AI systems try to do what we want them to do and don't knowingly act against our interests. I have worked on various topics in AI safety, including deceptive alignment, dangerous capability evaluations, specification gaming, goal misgeneralization, and avoiding side effects.

I co-founded the Future of Life Institute, a non-profit organization working to mitigate technological risks to humanity and increase the chances of a positive future.

(The views expressed on this website are my own and do not represent Google DeepMind or the Future of Life Institute.)

My PhD in statistics and machine learning at Harvard focused on building interpretable models.

In my spare time, I enjoy playing with my kids and spending time in nature.

Find me on Twitter, Google Scholar, GitHub, LinkedIn, and the Alignment Forum.