Thrilled to share that I’m joining @AnthropicAI!
After 5.5 amazing years at Alphabet, including working on Gemini’s reasoning over the past 2 years, I’m looking forward to advancing Claude’s ability to tackle complex reasoning challenges across a diverse range of domains!
Some people say that one shouldn't care about publications and that quality is what matters. However, the job market punishes those who don’t have publications in top ML venues. I empathize with students and newcomers to ML whose good papers are not getting accepted. #ICLR2021
1/
Excited to announce that the entire Blueshift team has joined @DeepMind! We will be working with @OriolVinyalsML and others to advance the capabilities of LLMs developed by DM / Alphabet! We hope to continue to grow DM's presence in the Bay Area and New York in the coming months :-)
These days, many people are interested in getting a PhD in ML. I think you should think really hard before committing to a PhD program in ML. Why?
I'm going to summarize some thoughts in this thread:
1/10
Totally agree!
To anyone screening applications, and any applicant who thinks their CV is not representative of their skills/potential: I think you might want to read the story of my own PhD application in this thread:
1/
Any document claiming an easy way to gauge grad school applicants needs to be challenged
To wit: While 2 or 3 Unis in Iran are far more selective than others, the # of outstanding candidates far exceeds their enrollment. The given ranking is thus opinion not fact & is misleading
💡💡What is the best acc an MLP can get on CIFAR10❓
65%❓ No, 85%‼️
Trying to understand convolutions, we look at MDL and come up with a variant of LASSO that, when applied to MLPs, learns local connections and achieves amazing accuracy!
Paper: arxiv.org/abs/2007.13657
1/n
When @ethansdyer and I joined Anthropic last Dec and spearheaded the discovery team, we decided to focus on unlocking computer-use as a bottleneck for scientific discovery. It has been incredible to work on improving computer-use and witness the fast progress. In OSWorld for
Very excited to announce a significant milestone in expanding reasoning capabilities of language models! 🎉🎉
We introduce #Minerva🦉: a language model that can solve mathematical questions using step-by-step natural language reasoning:
bit.ly/3OBj2d5
🧵
1/
Very excited to present Minerva🦉: a language model capable of solving mathematical questions using step-by-step natural language reasoning.
Combining scale, data and others dramatically improves performance on the STEM benchmarks MATH and MMLU-STEM. goo.gle/3yGpTN7
Looking back, I think the moment that I was asked "How can you prove that you are hardworking and a fast learner?" was an absolutely pivotal event in my life and I am forever grateful for that opportunity.
13/
We often prefer collaborating with people we know or those of high status. That makes it very difficult for hardworking and motivated junior researchers to get enough support to flourish.
Is it possible to reduce this barrier?
I've been running some experiments to find out!
1/6
You think the RNN era is over? Think again!
We introduce "Block-Recurrent Transformer", which applies a transformer layer in a recurrent fashion & beats transformer XL on LM tasks.
Paper: arxiv.org/pdf/2203.07852…
W. DeLesley Hutchins, Imanol Schlag, @Yuhu_ai_ & @ethansdyer
1/
@ethansdyer and I have started a new team at @AnthropicAI — and we’re hiring!
Our team is organized around the north star goal of building an AI scientist: a system capable of solving the long-term reasoning challenges and core capabilities needed to push the scientific