Aryaman Arora
14.1K posts
member of technical staff @stanfordnlp
- clown post. everyone who has ever touched an LLM should literally be worshipping wikipedia
- I updated my interactive South Asian language census map to include tehsil-level data from India (2011) and Pakistan (2017) and subdivision-level data from Nepal (2011). aryamanarora.github.io/india-census-2…
- Replying to @jxmnop: Noam Shazeer wrote down each pixel manually in vim
- 8x NVIDIA H100 80GB
  Quoted post: Folks who have completed or are currently doing their PhD: If you were to have received a small welcome packet at your desk on Day 1 of your PhD, what would you want it to include? Some ideas: Post-its, a fun pen, highlighters, stapler, candy. What else?
- if you think data cleaning is beneath you then ngmi
  Quoted post: Academia must be the only industry where extremely high-skilled PhD students spend much of their time doing low value work (like data cleaning). A 1st year management consultant outsources this immediately. Imagine the productivity gains if PhDs could focus on thinking
- new paper! 🫡 why are state space models (SSMs) worse than Transformers at recall over their context? this is a question about the mechanisms underlying model behaviour: therefore, we propose using mechanistic evaluations to answer it!
- New paper! 🫡 In-context learning (ICL) is when LLMs infer how to do a task from examples. We know that the relationship between # of ICL examples and task accuracy is predictable. Can we predict the shape of the ICL curve using Bayesian assumptions? Our paper shows yes!
- New paper! 🫡 We introduce Representation Finetuning (ReFT), a framework for powerful, efficient, and interpretable finetuning of LMs by learning interventions on representations. We match/surpass PEFTs on commonsense, math, instruct-tuning, and NLU with 10–50× fewer parameters.
- So, committed to Stanford to start my Ph.D. in CS in the fall 😮
- I think Karpathy is super wrong on this. Text is an amazingly efficient medium for compressing meaning. Images have like no useful info content in comparison
  Quoted post: It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; It's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something. They
- i hate ML conference reviewers. i take back everything bad i ever said about ACL. every ACL reviewer i ever got was at least literate
- i cannot overstate how absurdly impressive stanford's rl infra is. the people working on it clearly view it as art and actually barely get paid. if you like rl, there’s really no better place on earth to work on it
  Quoted post: i cannot overstate how absurdly impressive openai’s rl infra is. the people working on it clearly view it as art and probably forget they get paid. if you like rl, there’s really no better place on earth to work on it
- new paper! 🫡 we introduce 🪓AxBench, a scalable benchmark that evaluates interpretability techniques on two axes: concept detection and model steering. we find that: 🥇prompting and finetuning are still best 🥈supervised interp methods are effective 😮SAEs lag behind












