John Schulman (@johnschulman2) / X

187 posts

John Schulman

@johnschulman2

Recently started @thinkymachines. Interested in reinforcement learning, alignment, birds, jazz music

Joined May 2021

John Schulman
@johnschulman2
Aug 6, 2024
I shared the following note with my OpenAI colleagues today: I've made the difficult decision to leave OpenAI. This choice stems from my desire to deepen my focus on AI alignment, and to start a new chapter of my career where I can return to hands-on technical work. I've decided
1.3M
John Schulman
@johnschulman2
Feb 7, 2025
Confirming that I left Anthropic last week. Leaving wasn't easy because I enjoyed the stimulating research environment and the kind and talented people I was working with, but I decided to go with another opportunity that I found extremely compelling. I'll share more details in
446K
John Schulman
@johnschulman2
Oct 29, 2022
Certain software skills are exceptionally useful for machine learning. In a previous era, it was GPU programming. Now in the era of pretrained models, it's front-end development -- to quickly whip up a UI to collect a fine-tuning or eval dataset.
John Schulman
@johnschulman2
Oct 1, 2025
Tinker provides an abstraction layer that is the right one for post-training R&D -- it's the infrastructure I've always wanted. I'm excited to see what people build with it. "Civilization advances by extending the number of important operations which we can perform without
Thinking Machines
@thinkymachines
Oct 1, 2025
Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!
187K
John Schulman
@johnschulman2
Feb 18, 2025
Excited to build a new AI research lab with some of my favorite former colleagues and some great new ones. Looking forward to sharing more in the coming weeks.
Thinking Machines
@thinkymachines
Feb 18, 2025
Today, we are excited to announce Thinking Machines Lab (thinkingmachines.ai), an artificial intelligence research and product company. We are scientists, engineers, and builders behind some of the most widely used AI products and libraries, including ChatGPT,
113K
John Schulman
@johnschulman2
Dec 8, 2024
Replying to @amasad and @DavidSacks
Nope, we don't know how to train models to reason about controversial topics from first principles; we can only train them to reason on tasks like math calculations and puzzles where there's an objective ground truth answer. On general tasks, we only know how to train them to
106K
John Schulman
@johnschulman2
Oct 5, 2025
Really happy to see people reproducing the result that LoRA rank=1 closely matches full fine-tuning on many RL fine-tuning problems. Here are a couple nice ones: x.com/ben_burtenshaw…
Zichen Liu
@zzlccc
Oct 2, 2025
much more convinced after getting my own results: LoRA with rank=1 learns (and generalizes) as well as full-tuning while saving 43% vRAM usage! allows me to RL bigger models with limited resources😆 script: github.com/sail-sg/oat/bl…
127K
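For readers unfamiliar with the setup in the two posts above, here is a minimal sketch of what a rank-1 LoRA update looks like on a single linear layer. It uses plain NumPy and the standard LoRA parameterization (a frozen weight plus a scaled low-rank product); the layer sizes, the `alpha` scaling value, and the helper name `lora_forward` are illustrative assumptions, not taken from the linked scripts, and the sketch makes no claim about the memory or accuracy results reported in the posts.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Apply a frozen weight W plus the low-rank update (alpha / r) * B @ A."""
    r = A.shape[0]  # adapter rank
    return x @ (W + (alpha / r) * (B @ A)).T

# Hypothetical layer sizes chosen only for illustration.
d_in, d_out, rank = 64, 32, 1
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
A = rng.normal(size=(rank, d_in))   # trainable, random init
B = np.zeros((d_out, rank))         # trainable, zero init, so the adapter starts as a no-op

x = rng.normal(size=(4, d_in))
y = lora_forward(x, W, A, B)

full_params = W.size           # parameters updated by full fine-tuning of this layer
lora_params = A.size + B.size  # parameters updated by the rank-1 adapter
```

With rank 1, the adapter trains `d_in + d_out` numbers for this layer instead of `d_in * d_out`, which is where the reduced optimizer and gradient memory comes from.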
John Schulman
@johnschulman2
Jan 25, 2025
There are some intriguing similarities between the r1 chains of thought and the o1-preview CoTs shared in papers and blog posts (e.g. openai.com/index/learning…). In particular, note the heavy use of the words "wait" and "alternatively" as transition words for error correction and
158K
John Schulman
@johnschulman2
May 23, 2025
For people who don't like Claude's behavior here (and I think it's totally valid to disagree with it), I encourage you to describe your own recommended policy for what agentic models should do when users ask them to help commit heinous crimes. Your options are (1) actively try to
213K
John Schulman
@johnschulman2
Dec 30, 2023
A compelling intuition is that deep learning does approximate Solomonoff induction, finding a mixture of the programs that explain the data, weighted by complexity. Finding a more precise version of this claim that's actually true would help us understand why deep learning works
231K
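For reference, the "mixture of the programs that explain the data, weighted by complexity" in the post above corresponds to the textbook Solomonoff prior, sketched below. The notation is an assumption of this note rather than the post's: $U$ is a universal monotone Turing machine, $U(p) = x\ast$ means program $p$ produces an output that begins with $x$, and $\ell(p)$ is the length of $p$ in bits. This is only the classical definition, not the more precise deep-learning version of the claim that the post asks for.

```latex
M(x) \;=\; \sum_{p \,:\, U(p) = x\ast} 2^{-\ell(p)},
\qquad
M(x_{n+1} \mid x_{1:n}) \;=\; \frac{M(x_{1:n}\, x_{n+1})}{M(x_{1:n})}
```

Shorter programs receive exponentially more weight, so the predictions are dominated by the simplest programs consistent with the data seen so far.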
John Schulman
@johnschulman2
Feb 17, 2025
@barret_zoph and I recently gave a talk at Stanford on post-training and our experience working together on ChatGPT. Unfortunately the talk wasn't recorded, but here are the slides: docs.google.com/presentation/d…. (If you have a recording, please let me know!)
docs.google.com
ChatGPT + Post-Training
ChatGPT and The Art of Post-Training Barret Zoph & John Schulman
84K
John Schulman
@johnschulman2
Oct 23, 2025
We're happy to support the Human Centered LLMs course, on topics close to our hearts. We'd like to support more classes with free credits for students to use on assignments and projects. If you're an instructor interested in using Tinker in your course, please reach out to
Diyi Yang
@Diyi_Yang
Oct 22, 2025
Thanks @thinkymachines for supporting Tinker access for our CS329x students on Homework 2 😉
176K
John Schulman
@johnschulman2
Oct 26, 2025
Happy to share a new paper! Designing model behavior is hard -- desirable values often pull in opposite directions. Jifan's approach systematically generates scenarios where values conflict, helping us see where specs are missing coverage and how different models balance
Jifan Zhang
@jifan_zhang
Oct 24, 2025
New research paper with Anthropic and Thinking Machines. AI companies use model specifications to define desirable behaviors during training. Are model specs clearly expressing what we want models to do? And do different frontier models have different personalities? We generated
114K
John Schulman
@johnschulman2
Feb 22, 2024
Now that another LM product is getting flak, I can say this without sounding too self-serving: Alignment -- controlling a model's behavior and values -- is still a pretty young discipline. Annoying refusals or hyper-wokeness are usually bugs rather than features
127K