Naomi Saphra
@nsaphra
Waiting on a robot body. All opinions are universal and held by both employers and family. Now a dedicated grok hate account. Accepting ML/NLP PhD students.
Boston
Joined November 2010
Posts
  • Pinned
    Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a burgeoning supergroup w/ @najoungkim @amuuueller. Looking for my first students, so apply and reach out!
  • As confirmed by the new IMO rankings, Grok 4’s eye-popping benchmarks were driven by the following innovations:
    - train on test
    - train on test
    - train on test
  • Regular reminder of the best mathematical resource in machine learning, The Matrix Cookbook. Don't know how anyone ever does any math without it. math.uwaterloo.ca/~hwolkowi/matr…
  • My hobby: watching underpaid, overworked engineers sacrificing their 20s to an early stage startup ridicule people who buy lottery tickets.
  • What idiot called it "deep learning hype" and not "backpropaganda"
  • Why isn't color-coding explanations more common?
  • Have you ever noticed how Chinese and American researchers both publish at #NeurIPS, go to the same conference---and then barely cite or talk to each other? @BingchenZhao @gu_yuling @in4dmatics & I have, and we'll be presenting on it at @AiCultures! arxiv.org/pdf/2211.12424…
    One Venue, Two Conferences: The Separation of Chinese and American Citation Networks

    Abstract
    At NeurIPS, American and Chinese institutions cite papers from each other’s regions substantially less than they cite endogamously. We build a citation graph to quantify this divide, compare it to European connectivity, and discuss the causes and consequences of the separation.
  • I’m recruiting PhD students for 2026! If you are interested in robustness, training dynamics, interpretability for scientific understanding, or the science of LLM analysis you should apply. BU is building a huge LLM analysis/interp group and you’ll be joining at the ground floor.
  • Did you know the Fisher Information Matrix is the second-order Taylor approximation ... to KL divergence??????????????????????????????????????????????? I'm shaking idk how to handle this. what a good fact
  • Replying to @nsaphra
    tfw you could have joined a bleeding edge LLM lab but you were too desperate to train a substandard nazi waifu on the test set
  • Just got a desk reject, post-rebuttals, for a paper being submitted to arXiv <30 min late for the anonymity deadline. I talk about how the ACL embargo policy hurts junior researchers and makes ACL venues less desirable for NLP work. I don’t talk about the pointless NOISE it adds.
  • Finally at the stage of writing a PhD thesis where I get to settle 20-year-old grudges.
  • It's not the first time! A dream team of @enfleisig (human eval expert), Adam Lopez (remembers the Stat MT era), @kchonyc (helped end it), and me (pun in title) are here to teach you the history of scale crises and what lessons we can take from them. 🧵 arxiv.org/abs/2311.05020
    Replying to @andriy_mulyar
    my Twitter feed is full of ph.d. students having an existential crisis