GRPO was the workhorse of DeepSeek R1's training algorithm. To go beyond the Cartesian brain-in-a-box setup and to obtain the Monte-Carlo advantage estimates that GRPO requires, you will need to perform massive parallel branching rollouts against complex environments (an entire
New paper, new SOTA in unsupervised neural machine translation, joint with colleagues at @OpenAI:
Unsupervised Neural Machine Translation with Generative Language Models Only
arxiv.org/abs/2110.05448
i remember once, i was showing ilya the results of weeks' worth of experiments with the conclusion that what we wanted to work wasn't going to work. his instantaneous response was:
"if you torture the data long enough, it will confess"
by which he meant that the data would
the Mathstral MMLU breakdown is the funniest thing i've seen all week - mathematical ability negatively correlated with accounting, foreign policy, human sexuality
Excited to share this demo of interactive neural theorem proving in Lean (joint WIP with Jason Rute, @Yuhu_ai_ , Ed Ayers, and @spolu)!
Below, the `gpt` tactic is querying a 3B param transformer trained on Lean proofs. We can prove 35.9% of theorems in a held-out test set.
Offer from @saranormous @w_conviction's Embed rescinded after 3 days because I wanted to do due diligence on a 1 year old firm launching its first accelerator program. This is what we call "bad behavior"
Know who you're letting on your cap table.
reminds me of
"This book arose out of the author’s teaching a summer course to a bunch of Viennese 4th graders while on a long trans-Atlantic flight. A solid grasp of the multiplication tables should suffice for chapters 1 - 4, although experience with projective modules over
Excited to share that our team has cooked up a multi-agent coding framework that sets a new state of the art on SWE-Bench-Lite with 40.3% accepted solutions!
Very soon this framework will work right in your editor, with the developer working alongside the agent(s)
Proud to share this work over the past year on neural theorem proving in @leanprover, joint with @spolu, @ilyasut and the support of many others at @OpenAI!
Excited to share the next chapter for Morph and honored to be working with such great investors - AI SWE agents will soon vastly outnumber human software engineers, and we will lead the way in building AI-centric infrastructure to support them.
We're hiring!