Dorsa Sadigh
@DorsaSadigh
CS Faculty @Stanford, @StanfordAILab, @StanfordHAI. Research scientist @GoogleDeepMind. PhD and BS from @Berkeley_EECS.
Palo Alto, CA
Joined February 2014
Posts
  • Here is another uncut video of real-time interactions with @GoogleDeepMind's Gemini Robotics!
  • Congratulations to @ebiyik_ on successfully defending his dissertation, Learning Preferences for Interactive Autonomy! Erdem was my first student, and it has been wonderful working together!
  • Excited to be awarded the AFOSR YIP this year :) Really looking forward to the project and better understanding conventions and trust in repeated human-robot interactions.
    Congratulations to this year’s Young Investigator Research Program (YIP) recipients! 🎉 @AFOSR will award grants to 36 scientists and engineers from 27 research institutions and businesses. @AFResearchLab #BasicResearch #Science #Engineering #AFOSRYIP afrl.af.mil/News/Article-D…
  • So it turns out fine-tuning sometimes fails miserably, often because the initial policy is in a different homotopy class than the final policy. We have a new fine-tuning algorithm, Ease-In-Ease-Out, that enables efficient transfer RL across homotopy classes.
  • Congratulations to this awesome team on winning the best paper award at @corl_conf :)
    Congratulations to Annie Xie, @loseydp, Ryan Tolsma, and @DorsaSadigh on the best paper award at @corl_conf! 🎉 It was a privilege to work with such an awesome group. Annie’s talk is here: youtu.be/8bYv3BYmc8s?t=…
  • We solve RLHF by simply using supervised learning - w/o dealing with the challenges of RL. Contrastive Preference Learning uses a regret-based preference model which not only accurately captures human preferences but also enables learning preferences in arbitrary MDPs. 🧵👇
    Excited to announce Contrastive Preference Learning (CPL), a simple RL-free method for RLHF that works with arbitrary MDPs and off-policy data. arXiv: arxiv.org/abs/2310.13639 With @rm_rafailov @harshit_sikchi @chelseabfinn @scottniekum W. Brad Knox @DorsaSadigh A thread🧵👇
  • The beginning of much exciting future RoboNLP work. Learning adaptive language interfaces through interaction: arxiv.org/abs/2010.05190 w/ @siddkaramcheti and Percy Liang
  • Bay Area Robotics Symposium (BARS) will be happening on Nov 20 this year: bars2020.github.io. We will have a series of faculty and student spotlight talks from Berkeley, UCSC, and Stanford. Also tune in for our keynote by @rodneyabrooks. (1/2)
  • Imagine two robots helping you with a grocery bag. You can physically move the arms one at a time over a sequence of trajectory corrections. We'd like the robots to learn for once that the bag should be placed on the right while not stretching or squeezing the bag too much. (1/2)
  • So lucky to have such an awesome group of students. This was such a great surprise :) Thank you #ILIAD folks! @ebiyik94, @MinaeKwon, Mengxi Li, @zhangjie_cao, @siddkaramcheti, @andyshih_, Suneel Belkhale, Megha Srivastava, Ian Huang, @woodywang153, @suvir_m, Vivek Myers,
  • In large-scale pretraining for robotics -- where massive datasets are lacking -- data quality matters more than ever. We use mutual information estimators to identify high-quality data -- optimizing for diverse states & easy-to-fit actions to improve learning policies.
    Behavior cloning... clones behaviors, so naturally data quality directly affects performance. However, there aren't great ways of measuring how "good" or "bad" different demonstrations are. Our recent work seeks to address this problem using estimators of mutual information... 🧵
  • Replying to @DorsaSadigh
    Fun fact: the robot and fox are completely OOD. They are random toys that I grabbed from Niv's toy box!
  • Congratulations to Hong Jun Jeon and @loseydp on being nominated for the best student paper award at #RSS2020 for their work on Shared Autonomy with Learned Latent Actions.
  • Access to internet-scale data is a luxury that robot learning lacks. This raises a need for developing data quality metrics that enable both curating offline datasets and guiding online data collection. (1/8)