arxiv.org/abs/2205.14217
We propose Diffusion-LM, a non-autoregressive language model based on continuous diffusions. It enables complex controllable generation. We can steer the LM to generate text with desired syntax structure ([S [NP...VP…]]) and semantic content (name=Coupa).
Xiang Lisa Li
- arxiv.org/abs/2210.15097 We propose contrastive decoding (CD), a more reliable search objective for text generation by contrasting LMs of different sizes. CD takes a large LM (expert LM e.g. OPT-13b) and a small LM (amateur LM e.g. OPT-125m) and maximizes their logprob difference
- arxiv.org/abs/2407.08351 LM performance on existing benchmarks is highly correlated. How do we build novel benchmarks that reveal previously unknown trends? We propose AutoBencher: it casts benchmark creation as an optimization problem with a novelty term in the objective.
- Can we get language models to exhibit certain behaviors? We train investigator models to elicit target behaviors from LMs, which helps us proactively detect harmful responses and hallucination! Excited to finally share what I’ve been up to at @TransluceAI: training Investigator Agents to elicit behaviors in LMs (including harmful responses and hallucinations)!
- Replying to @adveisner: I am so happy and honored to be working with you. Thanks for introducing the field to me when I was a sophomore, and for letting me be part of Argo. Huge appreciation for all the research advising, enlightening discussions, writing tips, presentation tips, etc.
- Replying to @XiangLisaLi2: Exciting joint work with @jwthickstun @__ishaan @percyliang @tatsu_hashimoto 🙂 Code available at github.com/XiangLi1999/Di… Diffusion-LM shows strong performance in controllable generation, but it remains an open question whether it could match autoregressive LMs in PPL and speed.
- I enjoyed chatting with @pdasigi and @anmarasovic about my paper with @percyliang on prefix-tuning. Thanks for the invitation and I am very grateful to have this opportunity to talk about my work! 😀 #nlphighlights 126: We invited Lisa Li (@XiangLisaLi2) to talk about Prefix-tuning, her recently proposed efficient alternative to finetuning. @anmarasovic and I had a great time discussing this interesting work with Lisa. soundcloud.com/nlp-highlights…
- Replying to @XiangLisaLi2: Continuous diffusions have been successful for images (DDPM, DALL-E2), but text data is hard due to its discreteness. We add embedding and rounding to the standard diffusion model through an end-to-end objective for learning embeddings and a clamping technique for rounding.
- Replying to @XiangLisaLi2: Contrastive decoding is inspired by the observation that the failures of larger LMs are even more prevalent in smaller LMs (e.g., repetition, incoherence), and that this difference signals exactly which texts should be avoided/preferred.
- Replying to @XiangLisaLi2: Exciting joint work with @universeinanegg @dan_fried @percyliang @adveisner @tatsu_hashimoto @LukeZettlemoyer @ml_perception 🙂 Code available at
- Replying to @XiangLisaLi2: We consider 6 control tasks (e.g., semantic content, syntactic structures). Our method yields 2x the success rate of previous plug-and-play methods, often matches the fine-tuning oracle, and can even compose multiple controls at once.
- Replying to @XiangLisaLi2: CD requires zero training, and produces higher quality text than decoding from the larger LM alone. It also generalizes across model types (OPT, GPT2) and scales (1.5b, 6.7b, 13b) and significantly outperforms four strong decoding algorithms in automatic and human evaluations.
- Replying to @XiangLisaLi2: We use AutoBencher to find knowledge gaps in LLMs. It proposes evaluation topics, constructs high-quality QA datasets using additional information (e.g. retrieval and tools), and computes novelty scores of the datasets to inform the proposal of new evaluation topics.
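
The contrastive decoding posts above describe picking text that maximizes the log-probability difference between a large expert LM and a small amateur LM. A minimal sketch of that scoring rule for a single next-token choice might look like the following. The function names, the toy distributions, and the `alpha` plausibility cutoff are all assumptions made for illustration; they are not taken from the posts or from the released code.

```python
import math


def contrastive_scores(expert_logprobs, amateur_logprobs, alpha=0.1):
    """Score each candidate token as expert log-prob minus amateur log-prob.

    Assumption for this sketch: tokens whose expert probability falls below
    alpha * (best expert probability) are masked out with -inf, so a token
    the expert itself finds implausible cannot win on the difference alone.
    """
    cutoff = max(expert_logprobs.values()) + math.log(alpha)
    scores = {}
    for token, lp_expert in expert_logprobs.items():
        if lp_expert < cutoff:
            scores[token] = float("-inf")
        else:
            scores[token] = lp_expert - amateur_logprobs[token]
    return scores


def pick_token(expert_logprobs, amateur_logprobs, alpha=0.1):
    """Return the candidate token with the highest contrastive score."""
    scores = contrastive_scores(expert_logprobs, amateur_logprobs, alpha)
    return max(scores, key=scores.get)


# Hypothetical next-token distributions (made-up numbers, not model output).
expert = {
    "the": math.log(0.50),
    "Paris": math.log(0.39),
    "zzz": math.log(0.10),
    "qx": math.log(0.01),
}
amateur = {
    "the": math.log(0.70),
    "Paris": math.log(0.10),
    "zzz": math.log(0.1999),
    "qx": math.log(0.0001),
}

chosen = pick_token(expert, amateur)
print(chosen)
```

In this toy setup the expert's single most likely token is "the", but the amateur likes it even more, so the difference favors "Paris". The token "qx" has the largest raw difference, yet the assumed cutoff removes it because the expert gives it very little probability.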











