I was a Research Scientist in Meta AI's FAIR Alignment group until November 2025, working on LLM privacy and security, and learning about chemistry! Beginning Fall 2026, I will join Carnegie Mellon University's Engineering & Public Policy (EPP) Department and Language Technologies Institute (LTI) as an Assistant Professor, and I will be a core member of CyLab.
My research interests are privacy, natural language processing, AI for science, LLM reasoning, and the societal implications of ML. I explore the interplay between data, its influence on models, and the expectations of the people who regulate and use these models. My work has been recognized with the NCWIT Collegiate Award and the Rising Star in Adversarial ML Award.
Previously, I was a postdoctoral scholar at the University of Washington, advised by Yejin Choi and Yulia Tsvetkov. I received my PhD from UC San Diego, advised by Taylor Berg-Kirkpatrick, and during that time I was also a part-time researcher and intern at Microsoft Research, working with the Privacy in AI, Algorithms, and Semantic Machines teams on differential privacy, model compression, and data synthesis.
Recruiting & collaborations: If you are interested in working with me, please fill out this brief form.
✦ Explanation about my name: I used to publish under Fatemeh, which is my legal name, but I now go by Niloofar, the lily flower in Farsi!
✦ My academic job-market materials (Fall 2024): Research statement · Teaching statement · DEI statement · CV · Job-talk slides
News Highlights
Appeared on The Information Bottleneck podcast (Jan 2026) with Ravid Shwartz-Ziv & Allen Roush: we discussed the future of generative AI, how it is reshaping creative work and accelerating scientific discovery, and the ethical frontier of AI.
Gave a talk at the FAR AI San Diego Alignment Workshop at NeurIPS (Dec 2025): "What Does It Mean for Agentic AI to Preserve Privacy?"
CIMemories: A Compositional Benchmark for Contextual Integrity of Persistent Memory in LLMs — Check out the dataset on Hugging Face!
Featured in Science News Explores (Nov 2025): "5 things to remember when talking to a chatbot" — on AI privacy risks and how chatbots handle personal information.
Our write-up with Tianshi Li, "Privacy Is Not Just Memorization," is now available! Featured in Help Net Security (Oct 2025).
Gave a keynote at CAMLIS 2025 (Oct 2025): "What Does It Mean for Agentic AI to Preserve Privacy? Mapping the New Data Sinks and Leaks" — Video, slides, and reading list
Quoted in the Washington Post (Aug 2025) on AI hype, evaluation metrics, and how people judge AI capabilities.
Gave a keynote at the L2M2 (Large Language Model Memorization) workshop at ACL (Aug 2025): "Emergent Misalignment Through the Lens of Non-verbatim Memorization"
Gave a keynote at the LLMSec workshop at ACL (Aug 2025): "What does it mean for an AI agent to preserve privacy?" — slides
Our position paper "Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice" accepted as Oral at NeurIPS 2025!
Appeared on the Jay Shah Podcast (Feb 2025): "Differential Privacy, Creativity & Future of AI Research in the LLM Era"
Gave an invited keynote at the NeurIPS 2024 Red Teaming GenAI workshop (Dec 2024): "A False Sense of Privacy: Semantic Leakage and Non-literal Copying in LLMs" — slides and recording (jump to 04:50:00).
Appeared on the Thesis Review podcast with Sean Welleck, where I talked about my work on Auditing and Mitigating Safety Risks in Large Language Models.
Selected Publications
For the full list, please refer to my Google Scholar page.
-
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
NeurIPS 2025 (Oral Presentation)
A. F. Cooper, ..., N. Mireshghallah, ..., K. Lee
-
CIMemories: A Compositional Benchmark for Contextual Integrity of Persistent Memory in LLMs
2025 — Dataset on Hugging Face
N. Mireshghallah et al.
-
Reinforcement Learning Improves Traversal of Hierarchical Knowledge in LLMs
2025
R. Zhang, M. Kaniselvan, N. Mireshghallah
-
RefGrader: Automated Grading of Mathematical Competition Proofs using Agentic Workflows
NeurIPS 2025 Workshop MATH-AI
H. Mahdavi, N. Mireshghallah, ..., V. Honavar
-
A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage
SaTML 2026
R. Xin*, N. Mireshghallah*, S. S. Li, M. Duan, H. Kim, Y. Choi, Y. Tsvetkov, S. Oh, P. W. Koh
-
ICLR 2025 (Oral Presentation)
X. Lu, M. Sclar, S. Hallinan, N. Mireshghallah, J. Liu, S. Han, A. Ettinger, L. Jiang, K. Chandu, N. Dziri, Y. Choi
-
Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models
NeurIPS 2025
J. Hayes, ..., N. Mireshghallah, ..., A. F. Cooper
-
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human–AI Interactions
COLM 2025
X. Zhou, ..., N. Mireshghallah, ..., M. Sap
-
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
NAACL 2025
A. Kassem*, O. Mahmoud*, N. Mireshghallah*, H. Kim, Y. Tsvetkov, Y. Choi, S. Saad, S. Rana
-
Differentially Private Learning Needs Better Model Initialization and Self-Distillation
NAACL 2025 (Oral Presentation)
I. Ngong, J. Near, N. Mireshghallah
-
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
NAACL 2025 (Honorable Mention Candidate, Oral Presentation)
A. Ravichander, ..., N. Mireshghallah, ..., Y. Choi
-
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
NeurIPS 2024
L. Jiang, K. Rao, S. Han, A. Ettinger, F. Brahman, S. Kumar, N. Mireshghallah, X. Lu, M. Sap, Y. Choi, N. Dziri
-
EMNLP 2024
T. Chen, N. Mireshghallah*, A. Asai*, S. Min, J. Grimmelmann, Y. Choi, H. Hajishirzi, L. Zettlemoyer, P. W. Koh
-
Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild
COLM 2024
N. Mireshghallah*, M. Antoniak*, Y. More*, Y. Choi, G. Farnadi
-
Do Membership Inference Attacks Work on Large Language Models?
COLM 2024
M. Duan, A. Suri, N. Mireshghallah, S. Min, W. Shi, L. Zettlemoyer, Y. Tsvetkov, Y. Choi, D. Evans, H. Hajishirzi
-
Machine Unlearning Doesn't Do What You Think
Extended Abstract at GenLaw 2024
K. Lee, A. F. Cooper, C. A. Choquette-Choo, K. Liu, M. Jagielski, N. Mireshghallah, L. Ahmed, J. Grimmelmann, D. Bau, C. De Sa, et al.
-
A Roadmap to Pluralistic Alignment
ICML 2024
T. Sorensen, J. Moore, J. Fisher, M. Gordon, N. Mireshghallah, C. M. Rytting, A. Ye, L. Jiang, X. Lu, N. Dziri, T. Althoff, Y. Choi
-
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
ICLR 2024
N. Mireshghallah*, H. Kim*, X. Zhou, Y. Tsvetkov, M. Sap, R. Shokri, Y. Choi
-
Privacy-preserving in-context learning with differentially private few-shot generation
ICLR 2024
X. Tang, R. Shin, H. A. Inan, A. Manoel, N. Mireshghallah, Z. Lin, S. Gopi, J. Kulkarni, R. Sim
-
Smaller Language Models are Better Black-box Machine-Generated Text Detectors
EACL 2024
N. Mireshghallah, J. Mattern, S. Gao, R. Shokri, T. Berg-Kirkpatrick
-
Privacy-Preserving Domain Adaptation of Semantic Parsers
ACL 2023
N. Mireshghallah, R. Shin, Y. Su, T. Hashimoto, J. Eisner
-
Non-Parametric Temporal Adaptation for Social Media Topic Classification
EMNLP 2023
N. Mireshghallah*, N. Vogler*, J. He, O. Florez, A. El-Kishky, T. Berg-Kirkpatrick
-
A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation
CoNLL 2023
J. Forristal, N. Mireshghallah, G. Durrett, T. Berg-Kirkpatrick
-
Membership Inference Attacks against Language Models via Neighbourhood Comparison
ACL 2023
J. Mattern, N. Mireshghallah, Z. Jin, B. Schölkopf, M. Sachan, T. Berg-Kirkpatrick
-
Differentially Private Model Compression
NeurIPS 2022
N. Mireshghallah, A. Backurs, H. A. Inan, L. Wutschitz, J. Kulkarni
-
Memorization in NLP Fine-tuning Methods
EMNLP 2022
N. Mireshghallah, A. Uniyal, T. Wang, D. Evans, T. Berg-Kirkpatrick
-
Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks
EMNLP 2022
N. Mireshghallah, K. Goyal, A. Uniyal, T. Berg-Kirkpatrick, R. Shokri
-
NAACL 2022
N. Mireshghallah, V. Shrivastava, M. Shokouhi, T. Berg-Kirkpatrick, R. Sim, D. Dimitriadis
-
What Does it Mean for a Language Model to Preserve Privacy?
FAccT 2022
H. Brown, K. Lee, N. Mireshghallah, R. Shokri, F. Tramèr
-
Mix and Match: Learning-free Controllable Text Generation
ACL 2022
N. Mireshghallah, K. Goyal, T. Berg-Kirkpatrick
-
Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness
EMNLP 2021
N. Mireshghallah, T. Berg-Kirkpatrick
-
Privacy Regularization: Joint Privacy-Utility Optimization in Language Models
NAACL 2021
N. Mireshghallah, H. Inan, M. Hasegawa, V. Rühle, T. Berg-Kirkpatrick, R. Sim
-
ICML 2020
A. Elthakeb, P. Pilligundla, N. Mireshghallah, A. Cloninger, H. Esmaeilzadeh
-
Not All Features Are Equal: Discovering Essential Features for Preserving Prediction Privacy
WWW 2021
N. Mireshghallah, M. Taram, A. Jalali, A. T. Elthakeb, D. Tullsen, H. Esmaeilzadeh
-
Shredder: Learning Noise Distributions to Protect Inference Privacy
ASPLOS 2020
N. Mireshghallah, M. Taram, A. Jalali, D. Tullsen, H. Esmaeilzadeh
Invited Talks
-
FAR AI San Diego Alignment Workshop at NeurIPS
Workshop Keynote, Dec. 2025
What Does It Mean for Agentic AI to Preserve Privacy?
-
CAMLIS 2025
Keynote, Oct. 2025
What Does It Mean for Agentic AI to Preserve Privacy? Mapping the New Data Sinks and Leaks
-
Cornell Tech Digital Life Seminar
Seminar, Oct. 2025
Contextual Privacy in LLMs: Benchmarking and Mitigating Inference-Time Risks
-
First Workshop on LLM Security (LLMSec) at ACL 2025
Keynote, Aug. 2025
What Does It Mean for Agentic AI to Preserve Privacy?
-
First Workshop on Large Language Model Memorization (L2M2) at ACL 2025
Keynote, Aug. 2025
Emergent Misalignment Through the Lens of Semantic Memorization
-
Workshop on Collaborative and Federated Agentic Workflows (CFAgentic) at ICML 2025
Invited Talk, July 2025
What Does It Mean for Agentic AI to Preserve Privacy?
-
Fifth Workshop on Trustworthy Natural Language Processing @NAACL 2025 (TrustNLP)
Invited Talk, May 2025
-
Stanford University (NLP Seminar)
NLP Seminar, Jan. 2025
Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI
-
University of California, Los Angeles
Guest lecture for CS 269 - Computational Ethics, LLMs and the Future of NLP, Jan. 2025
Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI
-
NeurIPS Conference (Red Teaming GenAI workshop)
Red Teaming GenAI workshop, Dec. 2024
A False Sense of Privacy: Semantic Leakage and Non-literal Copying in LLMs
-
NeurIPS Conference (PrivacyML Tutorial)
Panelist, Dec. 2024
PrivacyML: Meaningful Privacy-Preserving Machine Learning tutorial
Recording (jump to 01:52:00)
-
Johns Hopkins University
CS Department Seminar, Dec. 2024
Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI
-
Future of Privacy Forum
Panelist, Nov. 2024
Technologist Roundtable for Policymakers: Key Issues in Privacy and AI
-
University of Utah
Guest lecture for the School of Computing CS 6340/5340 NLP course, Nov. 2024
Can LLMs Keep a Secret?
-
UMass Amherst
NLP Seminar, Oct. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
Northeastern University
Khoury College of Computer Sciences Security Seminar, Oct. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
SRI International (formerly Stanford Research Institute)
Computational Cybersecurity in Compromised Environments (C3E) workshop, Sep. 2024
Can LLMs keep a secret? Testing privacy implications of Language Models via Contextual Integrity
-
LinkedIn Research
Privacy Tech Talk, Sep. 2024
Can LLMs keep a secret? Testing privacy implications of Language Models via Contextual Integrity
-
National Academies (NASEM)
Forum on Cyber Resilience, Aug. 2024
Oversharing with LLMs is underrated: the curious case of personal disclosures in human-LLM conversations
-
ML Collective
DLCT reading group, Aug. 2024
Privacy in LLMs: Understanding what data is imprinted in LMs and how it might surface!
-
Carnegie Mellon University
Invited Talk, Jun. 2024
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
-
Generative AI and Law workshop, Washington DC
Invited Talk, Apr. 2024
What is differential privacy? And what is it not?
-
Meta AI Research
Invited Talk, Apr. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
Georgia Institute of Technology
Guest lecture for the School of Interactive Computing, Apr. 2024
Safety in LLMs: Privacy and Memorization
-
University of Washington
Guest lecture for CSE 484 and 582 courses on Computer Security and Ethics in AI, Apr. 2024
Safety in LLMs: Privacy and Memorization
-
Carnegie Mellon University
Guest lecture for LTI 11-830 course on Computational Ethics in NLP, Mar. 2024
Safety in LLMs: Privacy and Memorization
-
Simons Collaboration
TOC4Fairness Seminar, Mar. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
University of California, Santa Barbara
NLP Seminar Invited Talk, Mar. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of California, Los Angeles
NLP Seminar Invited Talk, Mar. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of Texas at Austin
Guest lecture for LIN 393 course on Social Applications and Impact of NLP, Feb. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
Google Brain
Google Tech Talk, Feb. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of Washington
Allen School Colloquium, Jan. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of Washington
eScience Institute Seminars, Nov. 2023
Privacy Auditing and Protection in Large Language Models
-
CISPA Helmholtz Center for Security
Invited Talk, Sep. 2023
What does privacy-preserving NLP entail?
-
Max Planck Institute for Software Systems
Next 10 in AI Series, Sep. 2023
Auditing and Mitigating Safety Risks in LLMs
-
Mila / McGill University
Invited Talk, May 2023
Privacy Auditing and Protection in Large Language Models
-
EACL 2023
Tutorial co-instructor, May 2023
Private NLP: Federated Learning and Privacy Regularization
-
LLM Interfaces Workshop and Hackathon
Invited Talk, Apr. 2023
Learning-free Controllable Text Generation
-
University of Washington
Invited Talk, Apr. 2023
Auditing and Mitigating Safety Risks in Large Language Models
-
NDSS Conference
Keynote talk for EthiCS workshop, Feb. 2023
How much can we trust large language models?
-
Google
Federated Learning Seminar, Feb. 2023
Privacy Auditing and Protection in Large Language Models
-
University of Texas at Austin
Invited Talk, Oct. 2022
How much can we trust large language models?
-
Johns Hopkins University
Guest lecture for CS 601.670 course on Artificial Agents, Sep. 2022
Mix and Match: Learning-free Controllable Text Generation
-
KDD Conference
Adversarial ML workshop, Aug. 2022
How much can we trust large language models?
-
Microsoft Research Cambridge
Invited Talk, Mar. 2022
What Does it Mean for a Language Model to Preserve Privacy?
-
University of Maine
Guest lecture for COS435/535 course on Information Privacy Engineering, Dec. 2021
Improving Attribute Privacy and Fairness for Natural Language Processing
-
National University of Singapore
Invited Talk, Nov. 2021
Style Pooling: Automatic Text Style Obfuscation for Fairness
-
Big Science for Large Language Models
Invited Panelist, Oct. 2021
Privacy-Preserving Natural Language Processing
-
Research Society, MIT Manipal
Cognizance Event Invited Talk, Jul. 2021
Privacy and Interpretability of DNN Inference
-
Alan Turing Institute
Privacy and Security in ML Seminars, Jun. 2021
Low-overhead Techniques for Privacy and Fairness of DNNs
-
Split Learning Workshop
Invited Talk, Mar. 2021
Shredder: Learning Noise Distributions to Protect Inference Privacy
-
University of Massachusetts Amherst
Machine Learning and Friends Lunch, Oct. 2020
Privacy and Fairness in DNN Inference
-
OpenMined Privacy Conference
Invited Talk, Sep. 2020
Privacy-Preserving Natural Language Processing
-
Microsoft Research AI
Breakthroughs Workshop, Sep. 2020
Private Text Generation through Regularization
Awards and Honors
Tinker Academic Research Compute Grant, 2025
Modal Academic Research Compute Grant, 2025
Momental Foundation Mistletoe Research Fellowship (MRF) Finalist, 2023
Rising Star in Adversarial Machine Learning (AdvML) Award Winner, 2022. AdvML Workshop
Rising Stars in EECS, 2022. Event Page
UCSD CSE Excellence in Leadership and Service Award Winner, 2022
FAccT Doctoral Consortium, 2022. FAccT 2022
Qualcomm Innovation Fellowship Finalist, 2021. Fellowship Page
NCWIT (National Center for Women & IT) Collegiate Award Winner, 2020. NCWIT Awards
National University Entrance Exam in Math, 2014. Ranked 249th of 223,000
National University Entrance Exam in Foreign Languages, 2014. Ranked 57th of 119,000
National Organization for Exceptional Talents (NODET), 2008. Admitted, ~2% Acceptance Rate
Featured Press & Media
The Information Bottleneck podcast (Jan 2026) — on the future of generative AI with Ravid Shwartz-Ziv & Allen Roush
Jay Shah Podcast (Feb 2025) — on Differential Privacy, Creativity & Future of AI Research
Thesis Review podcast with Sean Welleck — on Auditing and Mitigating Safety Risks in LLMs
Should I do a postdoc? — guest video on Sasha Rush's channel and an accompanying blog post
Recent Co-organized Workshops & Service
[For the full list, check my CV]
Memorization and Trustworthy Foundation Models Workshop @ICML 2025 (Co-organizer)
Area Chair for COLM 2025 & 2026
Workshop on Technical AI Governance (TAIG) @ICML 2025 (Panelist)
Workshop on Collaborative and Federated Agentic Workflows (CFAgentic) @ICML 2025 (Panelist)
Privacy Session Chair at SAGAI Workshop @IEEE S&P 2025
Industry Research Experience
-
Microsoft Semantic Machines
Fall 2022-Fall 2023 (Part-time), Summer 2022 (Intern)
Mentors: Richard Shin, Yu Su, Tatsunori Hashimoto, Jason Eisner
-
Microsoft Research, Algorithms Group, Redmond Lab
Winter 2022 (Intern)
Mentors: Sergey Yekhanin, Arturs Backurs
-
Microsoft Research, Language, Learning and Privacy Group, Redmond Lab
Summer 2021 (Intern), Summer 2020 (Intern)
Mentors: Dimitrios Dimitriadis, Robert Sim
-
Western Digital Co. Research and Development
Summer 2019 (Intern)
Mentor: Anand Kulkarni
Diversity, Inclusion & Mentorship
Mentor for Women in Machine Learning (WiML) Workshop at NeurIPS 2025
Panelist at CMU School of Computer Science Panel: Navigating the Academic Job Market (2025)
Mentor on the 'How to broadcast your research to a wider audience?' panel at the ACL Mentorship Program, 2025
Mentor for the mentorship program at the WiML event at NeurIPS 2024
D&I chair at NAACL 2025
Widening NLP (WiNLP) co-chair
Socio-cultural D&I chair at NAACL 2022
Mentor for the Graduate Women in Computing (GradWIC) at UCSD
Mentor for the UC San Diego Women Organization for Research Mentoring (WORM) in STEM
Co-leader for the "Feminist Perspectives for Machine Learning & Computer Vision" break-out session at the Women in Machine Learning (WiML) 2020 Un-Workshop held at ICML 2020
Mentor for the USENIX Security 2020 Undergraduate Mentorship Program
Volunteer at the Women in Machine Learning 2019 Workshop held at NeurIPS 2019
Invited Speaker at the Women in Machine Learning and Data Science (WiMLDS) NeurIPS 2019 Meetup
Mentor for the UCSD CSE Early Research Scholars Program (CSE-ERSP) in 2018
Professional Services
[Outdated; for an updated version, check my CV]
Reviewer for ICLR 2022
Reviewer for NeurIPS 2021
Reviewer for ICML 2021
Shadow PC member for the IEEE Security and Privacy Conference, Winter 2021
Artifact Evaluation Program Committee Member for USENIX Security 2021
Reviewer for ICLR 2021 Conference
Program Committee member for the LatinX in AI Research Workshop at ICML 2020 (LXAI)
Reviewer for the 2020 Workshop on Human Interpretability in Machine Learning (WHI) at ICML 2020
Program Committee member for the MLArchSys workshop at ISCA 2020
Security & Privacy Committee Member and Session Chair for Grace Hopper Celebration (GHC) 2020
Reviewer for ICML 2020 Conference
Artifact Evaluation Program Committee Member for ASPLOS 2020
Reviewer for IEEE TC Journal
Reviewer for ACM TACO Journal
Books I Like!
Range: Why Generalists Triumph in a Specialized World by D. Epstein
Messy: The Power of Disorder to Transform Our Lives by T. Harford
Small Is Beautiful: Economics As If People Mattered by E. F. Schumacher
Quarterlife by Satya Doyle Byock
The Body Keeps the Score by Bessel van der Kolk
36 Views of Mount Fuji by Cathy Davidson
Indistractable by Nir Eyal
Sapiens: A Brief History of Humankind by Yuval Noah Harari
The Martian by Andy Weir
The Solitaire Mystery by Jostein Gaarder
The Orange Girl by Jostein Gaarder
Life is Short: A Letter to St Augustine by Jostein Gaarder
The Alchemist by Paulo Coelho