Nicholas Tomlin
@NickATomlin
Incoming assistant professor at TTIC, current faculty fellow at NYU CDS, and previous PhD student at Berkeley. Natural language processing. He/him.
New York, NY
Joined November 2013
Posts
  • Pinned
    New paper! LLM memory keeps improving, but this makes them *worse* as user sims. If we want to build models that can, e.g., simulate realistic students to train chatbots to be better teachers, then these models need to be able to forget like humans do 📄: arxiv.org/abs/2605.25680
    Title page of our paper, "Simulating Human Memory with Language Models"
  • I’m at #NAACL2022 in Seattle! There is a donut wall at the conference! Come talk to me about grounded language and donuts.
    A photo of me standing in front of a wall of donuts, holding a pair of tongs and a plate and reaching for a chocolate sprinkles donut. In the photo I am wearing a hat from Oliboli donut shop in Tustin, California.
  • I'm incredibly excited to share that I'll be joining @TTIC_Connect as an assistant professor in Fall 2026! Until then, I'm wrapping up my PhD at Berkeley, and after that I'll be a faculty fellow at @NYUDataScience
  • COLM is the best conference I have attended throughout the entirety of my PhD, really looking forward to future iterations
  • The long-term goal of AI is to build models that can handle arbitrary tasks, not just ones they’ve been trained on. We hope our new *benchmark generator* can help measure progress toward this vision
    Title and abstract of the paper, "Measuring General Intelligence with Generated Games"
    🎮 Excited to announce gg-bench, a fully synthetic benchmark for LLMs consisting of games generated entirely by LLMs!! This benchmark centers around the fact that LLMs are capable of generating complex tasks that they themselves cannot even solve. 📄: arxiv.org/abs/2505.07215
  • New preprint! 📰 Can LMs be improved with AlphaGo-style self-play? The classic answer is that self-play only works in certain types of zero-sum games, but we show that it can be effective in cooperative games too. Paper: arxiv.org/abs/2406.18872 Code: github.com/nickatomlin/lm…
    Screenshot of the first page of the paper, "Efficacy of Language Model Self-Play in Non-Zero-Sum Games"
  • LLMs can facilitate student cheating, spread misinformation on the web, and even poison future training datasets. Today, we’re releasing Ghostbuster, a state-of-the-art method for detecting LLM-generated text. Paper: arxiv.org/abs/2305.15047 Try it: ghostbuster.app
  • Excited to share some new work at #ACL2022!! We train probes to investigate what "concepts" are encoded in game-playing agents like AlphaGo and how those concepts relate to natural language: arxiv.org/abs/2204.07531 [1/5]
  • Excited to announce that I’ll be pursuing a PhD in Computer Science at @berkeley_ai, where I’ll be focusing on computational linguistics!
  • This is what research meetings with my students look like (@austen_liao)
    Two people sitting at a desk in front of a computer, one of whom is me. Across from the desk is one of my undergrad advisees dressed in a Kirby costume
    Another photo of the Kirby, standing in the hallway outside Dan Klein’s office
    The Kirby looking at a computer with VSCode opened
  • I’m at #ACL2024 in Bangkok! Currently excited about: language agents, interaction + reasoning, and this giant lizard
  • I’m a teaching assistant for a large (~200 student) NLP course that runs assignments on @GoogleColab, but we’re having lots of issues with GPU availability (way worse this year than it used to be). How do other schools that run courses on Colab deal w/ this?
    Colab error message: “Cannot connect to GPU backend”
  • Super excited by this result! I’ve been working on crosswords for a while and I’m incredibly happy to see this project finally come to fruition. More info, paper, and hopefully a web demo coming soon!
    AI systems can now solve crosswords better than humans. The Berkeley Crossword Solver from @BerkeleyNLP helped propel Dr.Fill, an AI program created by @mattlginsberg, to 1st at #ACPT2021, the premier crossword tournament, solving the playoff puzzle in 49 seconds with no errors.
  • On the way to Ireland for my first in-person ACL conference! I'll be presenting two papers at the language x games intersection, come say hi. Go: arxiv.org/abs/2204.07531 Crosswords: arxiv.org/abs/2205.09665
    Photo of the airport runway at SFO