I'm recruiting PhD students for our new lab, coming to Boston University in Fall 2025!
Our lab aims to understand, improve, and precisely control how language is learned and used in natural language systems (such as language models).
Details below!
Aaron Mueller
- Announcing the BabyLM 👶 Challenge, the shared task at @conll_conf and CMCL'23! We’re calling on researchers to pre-train language models on (relatively) small datasets inspired by the input given to children learning language. babylm.github.io arxiv.org/abs/2301.11796
- ⭐️🏛️ Very excited to announce that in fall 2025, I’ll be starting as an Assistant Professor of Computer Science at Boston University @BU_Tweets! Looking forward to joining a wonderful group of colleagues at @BUCompSci!
- What is cause and effect? What is a “mechanism”? And how do answers to these questions affect interpretability research? 📜 New preprint! 📜 Two key challenges for causal/mechanistic interpretability, and ways forward. To be presented at the mech interp workshop at #ICML2024:
- We know that pre-trained seq2seq models such as T5 perform well on many downstream NLP tasks. Turns out, pre-training also teaches them about the hierarchical structure of language! 📜 arxiv.org/abs/2203.09397 👨🏻‍💻 github.com/sebschu/multil… w/ @bob_frank, @tallinzen, Wes Wang, @sebschu
- Scaling LMs works well. Are more parameters and data all it takes, or do certain architectural features or language styles bring out emergent abilities sooner? Let’s investigate by seeing what it takes for syntax 🌳 to emerge! At ACL! w/ @tallinzen 📜 arxiv.org/abs/2305.19905
- Thanks Tal! 📜 In this paper, we provide a theoretically grounded review of causal (which, imo, ⊇ mechanistic) interpretability. We argue that this gives a more cohesive narrative of the field, and makes it easier to see actionable open directions for future work! 🧵
  Quoted post: I very much enjoyed this survey of causal interpretability methods for neural networks from @amuuueller and many others - succinct, well organized, just opinionated enough. please write more reviews everyone arxiv.org/abs/2408.01416
- Language models are good at subject-verb agreement, even across center-embedded structures. Which neurons are responsible for this? Depends on the syntactic structure! arxiv.org/pdf/2106.06087… w/ @mattf1n, me, @sebgehr, @shieber, @tallinzen, & @boknilev. To appear at ACL'21!
- To those using in-context learning: LLMs behave differently on in-distribution vs. out-of-distribution examples—and chain-of-thought prompting has different effects on them! New preprint w/ @albertwebson, @jowenpetty, @tallinzen 📜 arxiv.org/abs/2311.07811
- What can mechanistic interpretability do for computational psycholinguists? @michaelwhanna and I took a stab at this question! We investigate garden path sentence processing in LMs at the feature (circuit) level.
  Quoted post: Sentences are partially understood before they're fully read. How do LMs incrementally interpret their inputs? In a new paper @amuuueller and I use mech interp to study how LMs process structurally ambiguous sentences. We show LMs rely on both syntactic & spurious features! 1/10
- Excited this project is out! Using sparse feature circuits, we can explain and modify how LMs arrive at a behavior. In this thread, I want to highlight open directions where computational linguists can use sparse feature circuits. 🧵
  Quoted post: Can we understand & edit unanticipated mechanisms in LMs? We introduce sparse feature circuits, & use them to explain LM behaviors, discover & fix LM bugs, & build an automated interpretability pipeline! Preprint w/ @can_rager, @ericjmichaud_, @boknilev, @davidbau, @amuuueller
- On my way to #NAACL24! 🇲🇽 Friends and folks interested in evaluation, (mechanistic) interpretability, causality, robustness, psycholinguistics, and/or coffee, let’s meet up? (And if you’re interested in doing a PhD in any/all of these topics, I would love to chat!)
- The evaluation pipeline for the BabyLM 👶 Challenge is out! We’re evaluating on BLiMP and a selection of (Super)GLUE tasks. Code 💻:
- I’ll be in Abu Dhabi starting tomorrow for EMNLP! Come see my CoNLL talk about causally probing for syntax in multilingual LMs! I’ll be around to chat about interpretability, multilingual NLP, robustness, and syntax. Get in touch via DMs or email! arxiv.org/abs/2210.14328













