Kianté Brantley
2,665 posts
Joined May 2009
- I am recruiting PhD students to join my lab at Harvard in Fall 2025! (deadline Dec 15) If you are interested in solving problems at the intersection of reinforcement learning, imitation learning, and NLP, pls consider applying (bit.ly/4fnficx)! @hseas @KempnerInst
- New paper! Learning to Generate Better Than Your LLM (arxiv.org/abs/2306.11816) RLHF has become a powerful paradigm for fine-tuning LLMs, but we only use general-purpose RL algorithms. We introduce a new algorithmic paradigm that takes advantage of additional feedback for learning.
- I passed my dissertation defense today - I am officially Dr. Kianté Brantley. Though I officially graduated from @UMD @Clip, I very much consider myself an unofficial graduate and member of the @nyu @CILVRatNYU family. Thank you to those who supported me, including @MSFTResearch
- I am very grateful for the support. Congrats to all the other awardees! From reducing sample complexity in RL to making gig platforms more inclusive for people w/ chronic illness and/or disabilities, the research represented by this year’s Microsoft Research Dissertation Grant recipients is cutting-edge. Learn about the work: aka.ms/AA8qpov
- New #acl2020nlp paper: "Active Imitation Learning with Noisy Guidance". We reduce the number of expert annotations needed for imitation learning by incorporating a heuristic function (e.g. gazetteers) using the classic active learning "Apple Tasting" framework.
- What is the "right" embedding space for prediction, reinforcement learning, imitation learning, and planning? We try to tackle this problem in our AAAI paper -- Successor Feature Sets: Generalizing Successor Representations Across Policies (arxiv.org/pdf/2103.02650…)
- The covariate shift problem has been a fundamental issue in imitation learning. We use disagreement among an ensemble of behaviour cloning policies to reduce covariate shift. Joint work with @HenaffMikael and Wen Sun. Paper: bit.ly/3bTZVaL Talk: bit.ly/2SmGQ9o
- In RL, there are many ways to inject knowledge into algorithms in order to make training feasible (e.g. reward shaping/hacking, demonstration data, etc.). However, many key aspects of the desired behavior are more naturally expressed as constraints. (arxiv.org/abs/1906.09323)
- (1/N) New paper! Dataset Reset Policy Optimization for RLHF (arxiv.org/pdf/2404.08495…) RLHF is a popular paradigm for fine-tuning generative models. But the question is, can we design algorithms that take advantage of additional properties of the RLHF framework?
- We are excited to share our new RLHF library - TRIL - which provides tools to train LLMs with reinforcement learning, imitation learning, and inverse reinforcement learning algorithms at scale! TRIL: Announcing 📣 an update to our paper "Learning to Search Better than Your LLM" and our new Transformers Reinforcement and Imitation Learning Library (TRIL)! Paper: arxiv.org/abs/2306.11816 Code: github.com/Cornell-RL/tril
- New AAAI! Successor Feature Sets: Generalizing Successor Representations Across Policies — motivation: what is the "right" representation of the world for prediction, imitation, and planning? (In terms of our understanding, rather than efficiency or learnability) (1/N)
- Yisong has some really good tips for CS faculty applications. I used them when I was applying last cycle. Just updated my Tips for CS Faculty Applications. Best of luck to everyone applying! yisongyue.medium.com/checklist-of-t…
- I’m co-organizing Interactive Learning for Natural Language Processing. Please vote for our proposal. Thanks! Please vote for the workshop proposals for EACL/ACL-IJCNLP/EMNLP/NAACL-HLT 2021: forms.gle/kkfsQZjjs2hFYi… @acl2020 @naacl @allenai_org @uwnlp @ACL_NLP #ACL2020 #naacl #NLP #ACL_NLP -- The EACL/ACL-IJCNLP/EMNLP/NAACL-HLT 2021 Workshop chairs