Yuqing Du
@d_yuqing
London, UK
Joined June 2020
Posts
  • Pinned
    multimodal in & out! what an honour to have built Omni w/ so many brilliant people 💖
    We’re dropping Gemini Omni: our first step towards a model that can create anything from anything - starting with video. It combines Gemini’s intelligence with our generative media systems - representing a leap forward in world understanding, multimodality, and editing 🧵
  • a (belated) life update; last november I wrapped up my phd @berkeley_ai. Huge thanks to @pabbeel for all the support & encouragement throughout! 🌟 I've since joined @GoogleDeepMind in London, working on both imagen 3 & veo with an incredible team. more to come!!
  • How can we encourage RL agents to explore human-meaningful behaviors *without* a human in the loop? @OliviaGWatkins2 and I are excited to share “Guiding Pretraining in Reinforcement Learning with LLMs”! 📜arxiv.org/abs/2302.06692 🧵1/
  • How can we develop more generalisable reward models for agent behaviours? Excited to share my @deepmind internship project, where we investigate finetuning Flamingo🦩w/ human reward annotations to train success detectors in 3 different domains! 📜arxiv.org/abs/2303.07280 🧵1/
  • 🥹🥹 All vague-posting aside, super happy this model is finally out there & proud of everyone for making this happen 💖 let us know what you think!
    Introducing Gemini 2.5 Flash Image (aka nano-banana), our SOTA image generation and editing model 🍌 As you might have already seen, this model excels at character consistency, creative edits, and has Gemini's world knowledge!
  • Similarly, 2 years ago we tried to get PPO working for reasoning tasks and struggled to improve over iterative SFT. Agree with these intuitions (esp. better base models + exploration) 👇
    With R1, a lot of people have been asking “how come we didn't discover this 2 years ago?” Well... 2 years ago, I spent 6 months working exactly on this (PG / PPO for math+gsm8k), but my results were nowhere as good. Here’s my take on what blocked me and what’s changed: 🧵
  • Excited to share what we’ve been up to! (:
    Today, we’re announcing Veo 2: our state-of-the-art video generation model which produces realistic, high-quality clips from text or image prompts. 🎥 We’re also releasing an improved version of our text-to-image model, Imagen 3 - available to use in ImageFX through
    Prompt: An extreme close-up of a craftsperson's hands shaping a glowing piece of pottery on a wheel. Threads of golden, luminous energy connect the potter’s hands to the clay, swirling dynamically with their movements.
    Prompt: A portrait of an Asian woman with neon green lights in the background, shallow depth of field.
  • Current sim2real methods (eg. domain randomization) rely on hand engineering to find a scheme that produces a robust policy without being too conservative. Can we automatically tune a simulator to match reality? Check out our ICRA 2021 paper: yuqingd.github.io/autotuned-sim2…
  • Tried a quick lil art battle with the paint transformer. I actually quite like how it made that large stroke of red in the beginning, which peeks through the mountains at the end.
  • Humans excel at generating curricula of tasks simply by interacting with other agents. Can RL agents generate similar curricula? We tackle this in our #ICLR2022 paper, Multiagent Selfplay for Automatic Curriculum Generation! w/ @pabbeel, @adityagrover_ 1/8
  • Replying to @arena
    Model Strength CI plot: @GoogleDeepMind Imagen 3 is leading with a significant margin.
  • Excited to share some recent work! Assisting people can be hard when it’s challenging to infer their goals. We propose another view: learning to increase human empowerment instead. w/ Stas Tiomkin, @emrek, Daniel Polani, @pabbeel, @ancadianadragan arxiv.org/abs/2006.14796
  • felt inspired to do some drawing again after seeing all the AI art floating around (: