Kianté Brantley
@xkianteb
Assistant Professor at Harvard University @KempnerInst and SEAS | Fitness enthusiast | (He/Him/His)
Joined May 2009
Posts
  • Pinned
    Does LLM RL post-training need to be on-policy?
  • I am recruiting PhD students to join my lab at Harvard in Fall 2025! (deadline Dec 15) If you are interested in solving problems at the intersection of reinforcement learning, imitation learning, and NLP, pls consider applying (bit.ly/4fnficx)! @hseas @KempnerInst
  • New paper! Learning to Generate Better Than Your LLM (arxiv.org/abs/2306.11816) RLHF has become a powerful paradigm for fine-tuning LLMs, but we only use general-purpose RL algorithms. We introduce a new algorithmic paradigm that takes advantage of additional feedback for learning.
  • I passed my dissertation defense today - I am officially Dr. Kianté Brantley. Though I officially graduated from @UMD @Clip, I very much consider myself an unofficial graduate and member of the @nyu @CILVRatNYU family. Thank you to those who supported me, including @MSFTResearch
  • I am very grateful for the support. Congrats to all the other awardees!
    From reducing sample complexity in RL to making gig platforms more inclusive for people w/ chronic illness and/or disabilities, the research represented by this year’s Microsoft Research Dissertation Grant recipients is cutting-edge. Learn about the work: aka.ms/AA8qpov
  • New #acl2020nlp paper: "Active Imitation Learning with Noisy Guidance". We reduce the number of expert annotations needed for imitation learning by incorporating a heuristic function (e.g. gazetteers) using the classic active learning "Apple Tasting" framework.
  • What is the "right" embedding space for prediction, reinforcement learning, imitation learning, and planning? We try to tackle this problem in our AAAI paper -- Successor Feature Sets: Generalizing Successor Representations Across Policies (arxiv.org/pdf/2103.02650…)
  • The covariate shift problem has been a fundamental issue in imitation learning. We use disagreement among an ensemble of behaviour cloning policies to reduce covariate shift. Joint work with @HenaffMikael and Wen Sun. Paper: bit.ly/3bTZVaL Talk: bit.ly/2SmGQ9o
  • In RL, there are many ways to inject knowledge into algorithms in order to make training feasible (e.g. reward shaping/hacking, demonstration data, etc.). However, many key aspects of the desired behavior are more naturally expressed as constraints. (arxiv.org/abs/1906.09323)
  • (1/N) New paper! Dataset Reset Policy Optimization for RLHF (arxiv.org/pdf/2404.08495…) RLHF is a popular paradigm for fine-tuning generative models. But the question is, can we design algorithms that take advantage of additional properties of the RLHF framework?
  • We are excited to share our new RLHF library - TRIL - which provides tools to train LLMs with reinforcement learning, imitation learning, and inverse reinforcement learning algorithms at scale! TRIL:
    Announcing 📣 an update to our paper "Learning to Search Better than Your LLM" and our new Transformers Reinforcement and Imitation Learning Library (TRIL)! Paper: arxiv.org/abs/2306.11816 Code: github.com/Cornell-RL/tril
  • New AAAI paper! Successor Feature Sets: Generalizing Successor Representations Across Policies. Motivation: what is the "right" representation of the world for prediction, imitation, and planning? (In terms of our understanding, rather than efficiency or learnability) (1/N)
  • Yisong has some really good tips for CS faculty applications. I used them when I was applying last cycle.
    Just updated my Tips for CS Faculty Applications. Best of luck to everyone applying! yisongyue.medium.com/checklist-of-t…
  • I’m co-organizing Interactive Learning for Natural Language Processing. Please vote for our proposal. Thanks!
    Please vote for the workshop proposals for EACL/ACL-IJCNLP/EMNLP/NAACL-HLT 2021 forms.gle/kkfsQZjjs2hFYi… @acl2020 @naacl @allenai_org @uwnlp @ACL_NLP #ACL2020 #naacl #NLP #ACL_NLP -- The EACL/ACL-IJCNLP/EMNLP/NAACL-HLT 2021 Workshop chairs