Sam Bowman (@sleepinyourhat) / X

2,863 posts

Sam Bowman

@sleepinyourhat

AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. Into @givingwhatwecan.

San Francisco

sleepinyourhat.github.io

Joined July 2011

Sam Bowman
@sleepinyourhat
Mar 23, 2023
As a specialist in evaluating language models, I declare that this is the best way of evaluating language models:
Sam Bowman
@sleepinyourhat
Feb 9, 2022
PhD admissions season is ramping up, so I feel obliged to join the chorus of voices reminding everyone that doing a PhD is, in most cases, a terrible idea.
Sam Bowman
@sleepinyourhat
May 22, 2025
🧵✨🙏 With the new Claude Opus 4, we conducted what I think is by far the most thorough pre-launch alignment assessment to date, aimed at understanding its values, goals, and propensities. Preparing it was a wild ride. Here’s some of what we learned. 🙏✨🧵
Sam Bowman
@sleepinyourhat
May 13, 2022
I just got tenure! Wheee! Predictable-but-heartfelt gratitude thread:
Sam Bowman
@sleepinyourhat
Apr 2, 2023
I’m sharing a draft of a slightly-opinionated survey paper I’ve been working on for the last couple of months. It's meant for a broad audience—not just LLM researchers. (🧵)
Sam Bowman
@sleepinyourhat
Oct 7, 2022
I’m starting an AI safety research group at NYU. Why? (🧵)
Sam Bowman
@sleepinyourhat
May 22, 2025
I deleted the earlier tweet on whistleblowing as it was being pulled out of context. TBC: This isn't a new Claude feature and it's not possible in normal usage. It shows up in testing environments where we give it unusually free access to tools and very unusual instructions.
Sam Bowman
@sleepinyourhat
Jan 8, 2024
I'm hiring research engineers for several alignment/technical safety teams at Anthropic!
Sam Bowman
@sleepinyourhat
Aug 27, 2025
Early this summer, OpenAI and Anthropic agreed to try some of our best existing tests for misalignment on each other’s models. After discussing our results privately, we’re now sharing them with the world. 🧵
Sam Bowman
@sleepinyourhat
Apr 19, 2022
AI/ML faculty: A student of mine did an internship at Google, and got the resulting paper accepted to a top conference. The host team isn't willing to pay for conference registration, so I'll have to pay or else the paper won't be published, going against the norm here. Advice?
Sam Bowman
@sleepinyourhat
Dec 7, 2022
Everybody, please stop publishing interesting research. I'm trying to have a sabbatical.
Sam Bowman
@sleepinyourhat
Oct 15, 2021
You'll sometimes see the meme that NLP is solved. That's hype, and it's doing harm in the real world. But it's worth thinking about what it'd look like to actually achieve what we're aiming for. (📄 paper, thread 🧵) cims.nyu.edu/~sbowman/bowma…
Sam Bowman
@sleepinyourhat
Jul 26, 2021
I'll likely admit a couple new PhD students this year. If you're interested in NLP and you have experience either in crowdsourcing/human feedback for ML or in AI truthfulness/alignment/safety, consider @NYUDataScience!
Sam Bowman
@sleepinyourhat
Sep 3, 2024
A big part of my job these days is to think about what technical work Anthropic needs to do to make things go well with the development of very powerful AI. I digested my thinking on this, plus some of the Anthropic zeitgeist around it, into this piece: sleepinyourhat.github.io/checklist/