Eunsol Choi (@eunsolc) / X

Eunsol Choi

@eunsolc

on natural language processing / machine learning. assistant prof at @NYUDataScience @NYU_Courant prev @UTCompSci @googleai, @uwcse, @Cornell.

eunsol.github.io

Joined September 2016

Pinned
Eunsol Choi
@eunsolc
Aug 16, 2024
My lab will move to @NYUDataScience and @NYU_Courant this Fall! I’m excited to connect with amazing researchers at @CILVRatNYU and the larger ML/NLP community in NYC. I will be recruiting students this cycle at NYU. Happy to be back in the city 🗽on the east coast as well. I had a
NYU Center for Data Science
@NYUDataScience
Aug 16, 2024
CDS welcomes Eunsol Choi (@eunsolc) as an Assistant Professor of Computer Science (@NYU_Courant) and Data Science! Her research focuses on advancing how computers interpret human language in real-world contexts. nyudatascience.medium.com/meet-the-facul…
Eunsol Choi
@eunsolc
Jul 8, 2019
Eunsol: Seattle -> NYC -> Austin! Excited to share the news that I will join @UTCompSci as an assistant professor next fall, being part of a growing NLP community in Austin, after a year at @GoogleAI with cool folks.
Eunsol Choi
@eunsolc
Jun 8, 2021
TACL paper, to be presented at @NAACLHLT tmr! When sentences appear in a document, they use pronouns and implicit arguments whose meaning is obtained from the broader context. Given a sentence and a context, can we re-write the sentence so that it can stand alone? [1 / 4]
Eunsol Choi
@eunsolc
Apr 20, 2020
New collaboration at @GoogleAI. How can we better integrate and retrieve entity information in transformers?
Thibault Févry
@iwontbecreative
Apr 20, 2020
“Entities as Experts: Sparse Memory Access with Entity Supervision” introduces an entity-centric memory network in a transformer. EaE’s memory access and entity representations are supervised through entity linking. arxiv.org/abs/2004.07202
Eunsol Choi
@eunsolc
Nov 8, 2021
#EMNLP2021 How should we distribute our annotation budget to create training data? We find collecting *multiple* labels per example can be more beneficial than annotating as many examples as possible. Why? noisy annotation and the ambiguity of language.
Shujian Zhang
@zhang_shujian
Nov 8, 2021
Training NLP systems typically assumes annotated data that has a *single* human label per example. Our EMNLP (Learning with Different Amounts of Annotation: From Zero to Many Labels) paper proposes a learning algorithm.
Eunsol Choi
@eunsolc
Dec 6, 2022
Excited to be at #EMNLP2022! My first in-person conference with my students from @UTCompSci. :) Many things throughout the week (all times local) (1/N)
Eunsol Choi
@eunsolc
May 3, 2023
Language models are increasingly being used for knowledge-rich tasks. How well can we update the knowledge in LMs? We find that we can update model parameters to memorize facts, but models do not always make inferences based on the injected facts. See our paper at #ACL2023NLP
Yasumasa Onoe
@yasumasa_onoe
May 3, 2023
Knowledge in LMs can go out of date. Our #ACL2023NLP paper investigates teaching LMs about new entities via definitions, and whether LMs can then make inferences that go beyond those definitions. arxiv.org/pdf/2305.01651… w/@mjqzhang, @shankarpad8, @gregd_nlp, @eunsolc
Eunsol Choi
@eunsolc
Oct 9, 2023
We improve retrieval augmented LMs by introducing a "compression" step before prepending. While our initial focus was on efficiency, we found targeted prepending can also improve downstream performance, as it filters noisy retrieval outputs!
Fangyuan Xu
@brunchavecmoi
Oct 9, 2023
🔌Enhancing language models with retrieval boosts performance but demands more compute for encoding the retrieved documents. Do we need all the documents for the gains? We present Retrieve Compress Prepend (RECOMP) arxiv.org/abs/2310.04408 (w/@WeijiaShi2, @eunsolc)
Eunsol Choi
@eunsolc
May 23, 2023
I've been working on QA for a while, but this is the first work where we actually learn from real user feedback instead of annotated data! Learning from user interaction is challenging but also promising given how widely NLP systems are getting deployed these days.
Eunsol Choi
@eunsolc
Apr 22, 2024
Can LLMs comprehensively capture information spread across multiple documents? Can LLMs distinguish confusing entity mentions? Please check out our preprint on multi-document reasoning for LLMs, focusing on entity disambiguation!
Yoonsang Lee
@yoonsang_
Apr 22, 2024
Can LMs correctly distinguish🔎 confusing entity mentions in multiple documents? We study how current LMs perform a QA task when given ambiguous questions and a document set📚 that requires challenging entity disambiguation. Work done at @UTCompSci✨ w/ @xiye_nlp, @eunsolc
Eunsol Choi
@eunsolc
Oct 11, 2024
We studied retrieval diversity on subjective questions with different types of corpora (Wikipedia, web snapshot, search results)! This project made me think a lot about the future of retrieval system evaluations.
Hung-Ting Chen
@hungting_chen
Oct 11, 2024
🚨New Paper🚨: We introduce BERDS, a BEnchmark for Retrieval Diversity, for subjective questions. We collect subjective questions with diverse perspectives and develop evaluation metrics to measure retrieval diversity in an open-world setting. Work done w/ @eunsolc! 🧵
Eunsol Choi
@eunsolc
Nov 16, 2023
My first prompting paper 👋 We link LM's parametric knowledge to the construction of in-context examples. If an LM lacks knowledge for in-context examples, could it result in hallucinations? If an LM can easily answer, would it make educated guesses on challenging queries?
Yoonsang Lee
@yoonsang_
Nov 16, 2023
Known example❗️ or Unknown example❓ to prompt an LM? We propose best practices for crafting ✏️ in-context examples according to LMs' parametric knowledge 📚. Work done at @UTCompSci ✨ w/ @pranav_atreya, @xiye_nlp, @eunsolc lilys012.github.io/assets/pdf/cra…
Eunsol Choi
@eunsolc
Jun 24, 2024
New preprint on building language-specific LLMs! Out of the box, most LLMs are not very effective at handling low-resource languages, but after token augmentation and a moderate amount of fine-tuning, their performance improves significantly. We look into various design choices.
Atula Tejaswi
@atu_tej
Jun 24, 2024
1/ 🎉 Excited to share our latest paper: "Exploring Design Choices for Building Language-Specific LLMs" 📄. We explore adaptation of monolingual and multilingual large language models for specializing to a particular language 🌐🚀 w/ @convexlull @eunsolc
Eunsol Choi
@eunsolc
Mar 28, 2022
Was fun studying long-form question answering! The task is really rich yet challenging, and understanding how answers are structured by *humans* can be a good step for both modeling and evaluation. We looked into @OpenAI WebGPT, ELI5, and NQ answers!
Fangyuan Xu
@brunchavecmoi
Mar 28, 2022
🤔 How do we answer complex questions, such as “How much money is needed in order to not have to work for the rest of your life”? We (w/ @eunsolc and @jessyjli) study the discourse structure of long-form answers. #ACL2022 Paper: arxiv.org/abs/2203.11048