Jesse Mu
666 posts
- Since prompting, instruction tuning, RLHF, ChatGPT etc are such new and fast-moving topics, I haven't seen many university course lectures covering this content. So we made some new slides for this year's CS224n: NLP w/ Deep Learning course at @Stanford! web.stanford.edu/class/cs224n/s…
- PSA to anyone who wants to write an op-ed criticizing LLMs (yes, including Noam Chomsky): if you're going to come up with hypothetical failure cases for LLMs, at a minimum, please actually check that your case fails with a modern LLM nytimes.com/2023/03/08/opi…
- I am announcing the Perverse Scaling Prize: a $1.14 USD prize for tasks which exhibit any of the following scaling curves
  - Quoted post: We’re announcing the Inverse Scaling Prize: a $100k grand prize + $150k in additional prizes for finding an important task where larger language models do *worse*. Link to contest details: github.com/inverse-scalin… 🧵
- I recently moved to the Code RL team at Anthropic, and it’s been a wild and insanely fun ride. Join us! We are singularly focused on solving SWE. No 3000 elo leetcode, competition math, or smart devices. We want Claude n to build Claude n+1, so we can go home and knit sweaters.
  - Quoted post: Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.
- Excited to present 3 #NeurIPS2022 papers on a trend I've been very excited about recently: blurring the boundaries between language models and RL agents (+a bonus 4th paper on active learning!) 🧵(0/7) PS: I'm on the industry job market!
- Prompting is cool and all, but isn't it a waste of compute to encode a prompt over and over again? We learn to compress prompts up to 26x by using "gist tokens", saving memory+storage and speeding up LM inference: arxiv.org/abs/2304.08467 (w/ @XiangLisaLi2 and @noahdgoodman) 🧵
- TIL in 2009 two Berkeley undergrads flipped a coin *40,000* times (1hr/day for a semester) to see whether a coin flip was truly random (it's biased towards the side facing up pre-flip!) Gives a new meaning to the term "undergraduate research project"... stat.berkeley.edu/~aldous/Real-W…
- Life update: this week I joined the Alignment team @AnthropicAI! I’m starting part-time for now as I finish up my PhD at Stanford. Excited to work on making large language models safer and more aligned!
- We’re hiring for the adversarial robustness team @AnthropicAI! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)
- New preprint with @jacobandreas: we generate explanations of the individual neurons inside deep neural networks by identifying *compositional logical concepts* that closely approximate neuron behavior (e.g. "water that isn't blue") arxiv.org/abs/2006.14032 (1/5)
- My lecture on prompting, instruction tuning, and RLHF for Stanford's CS224n course is (finally!) available online: youtube.com/watch?v=SXpJ9E…
  - Quoted post: Since prompting, instruction tuning, RLHF, ChatGPT etc are such new and fast-moving topics, I haven't seen many university course lectures covering this content. So we made some new slides for this year's CS224n: NLP w/ Deep Learning course at @Stanford! web.stanford.edu/class/cs224n/s…