Ari Holtzman (@universeinanegg) / X

Ari Holtzman

2,049 posts

Ari Holtzman

@universeinanegg

Asst Prof @UChicagoCS & @DSI_UChicago, leading Conceptualization Lab (conceptualization.ai). Minting new vocabulary to conceptualize generative models.

Chicago

Joined July 2015

Pinned
Ari Holtzman
@universeinanegg
May 12
LLMs reveal secrets when they’re asked to write stories. We told LLMs not to reveal the secret words we gave them, then asked them to write stories. The secret word never appears literally. But another model can identify it from the story up to 79% of the time.
8.5K
Ari Holtzman
@universeinanegg
Oct 28, 2025
I'm recruiting PhD students! I'm interested in: 1. Understanding how LLMs 'see' the world (ex: LMs can't see conspicuous omissions, see AbsenceBench) 2. How can we make things with LLMs that have never been made before? (ex: Communication Games, see 📌) 3. See my other posts :)
83K
Ari Holtzman
@universeinanegg
Apr 22, 2020
"You can't learn language from the radio." 📻 Why does NLP keep trying to? In arxiv.org/abs/2004.10151 we argue that physical and social grounding are key because, no matter the architecture, text-only learning doesn't have access to what language is *about* and what it *does*.
Ari Holtzman
@universeinanegg
Jul 8, 2023
While demand for generative model training soars 📈, I think a new field is coalescing that’s focused on trying to make sense of generative models _once they’re already trained_: characterizing their behaviors, differences, and underlying mechanisms…so we wrote a paper about it!
110K
Ari Holtzman
@universeinanegg
Nov 21, 2023
If you want a respite from OpenAI drama, how about joining academia? I'm starting Conceptualization Lab, recruiting PhDs & Postdocs! We need new abstractions to understand LLMs. Conceptualization is the act of building abstractions to see something new.
Conceptualization Lab
From conceptualization.ai
117K
Ari Holtzman
@universeinanegg
Jul 9, 2023
In other news, I’ll be joining @UChicagoCS and @DSI_UChicago in 2024 as an assistant professor and doing a postdoc @Meta in the meantime! I’m at ACL in person and recruiting students who want to find fresh approaches to working with generative models so if that’s you let’s chat!
63K
Ari Holtzman
@universeinanegg
Oct 22, 2024
I am recruiting PhDs and Postdocs via the:
• CS Dept cs.uchicago.edu/academics/admi… (12/16)
• Data Science Institute (DSI) codas.uchicago.edu/how-to-apply/ (12/17)
• DSI Scholars program datascience.uchicago.edu/research/postd… (rolling)
Come rigorously conceptualize what LLMs do! conceptualization.ai
cs.uchicago.edu
Admission - Department of Computer Science
29K
Ari Holtzman
@universeinanegg
Apr 16, 2021
🔨ranking by probability is suboptimal for zero-shot inference with big LMs 🔨 “Surface Form Competition: Why the Highest Probability Answer Isn’t Always Right” explains why and how to fix it, co-led w/ @PeterWestTM paper: peterwestuw.github.io/surface-form-c… code: github.com/peterwestuw/su…
Ari Holtzman
@universeinanegg
Jul 9, 2025
Prompting is our most successful tool for exploring LLMs, but the term evokes eye-rolls and grimaces from scientists. Why? Because prompting as scientific inquiry has become conflated with prompt engineering. This is holding us back. 🧵and new paper:
arxiv.org
Prompting as Scientific Inquiry
Prompting is the primary method by which we study and control large language models. It is also one of the most powerful: nearly every major capability attributed to LLMs-few-shot learning,...
14K
Ari Holtzman
@universeinanegg
Dec 28, 2020
literally just try and imagine NLP without @huggingface that would be sad. thanks for making my year y'all!
Ari Holtzman
@universeinanegg
Jul 6, 2022
I'm noticing a shift in NLP from designing models for tasks to discovering novel behavior in models *after training*, e.g. in-context learning. But I think we're missing key vocabulary to break down model behavior, as if we were trying to explain steam without the concept of water
Ari Holtzman
@universeinanegg
Nov 11, 2025
Some of my favorite researchers in the world trying to make LLMs make sense by collapsing two ideas into one—my favorite aspect of theory building!
Goodfire
@GoodfireAI
Nov 11, 2025
New research: are prompting and activation steering just two sides of the same coin? @EricBigelow @danielwurgaft @EkdeepL and coauthors argue they are: ICL and steering have formally equivalent effects. (1/4)
15K
Ari Holtzman
@universeinanegg
Nov 9, 2025
LLMs don't accumulate information over the course of a text the way you'd hope! I think this is why LLMs often feel 'fixated on the wrong thing' or 'overly literal'—they are usually responding using the most relevant single thing they remember, not the aggregate of what was said
Amanda Bertsch
@abertsch72
Nov 7, 2025
Can LLMs accurately aggregate information over long, information-dense texts? Not yet… We introduce Oolong, a dataset of simple-to-verify information aggregation questions over long inputs. No model achieves >50% accuracy at 128K on Oolong!
13K
Ari Holtzman
@universeinanegg
Aug 7, 2024
Some sad news: My father died unexpectedly a bit over a week ago. We were close. Apologies for my delayed response times, which will continue for a while—there's so much to take care of, in addition to having just moved to a new city for a new job, and needing time to grieve.
15K