Rosanne Liu
@savvyRL
Mom. Cofounded & running @ml_collective. Co-host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS.
San Francisco, CA
Joined March 2013
Posts
  • Pinned
    The opportunity gap in AI is more striking than ever. We talk way too much about those receiving $100M or whatever for their jobs, but not enough about those asking for <$1k to present their work. For the 3rd year in a row, @ml_collective is raising funds to support @DeepIndaba attendees.
  • user avatar
    A quick thread on "How DALL-E 2, Imagen and Parti Architectures Differ" with breakdown into comparable modules, annotated with size 🧵 #dalle2 #imagen #parti * figures taken from corresponding papers with slight modification * parts used for training only are greyed out
    A compilation of model architecture diagrams of three recent text-image generative models: DALL-E 2 (or unCLIP), Imagen, Parti.
  • user avatar
    Replying to @savvyRL
    Impostor syndrome definitely implies that one isn't a subset of the other
  • user avatar
    In 2020, my life was turned upside down -- I was encountering failure after failure and in deep depression. But that also turned out to be the most enlightening period. In October I was invited to give a talk at Google (thanks to @orf_bnw). It's out today: youtu.be/0blQp0_9NwY
  • user avatar
    ML conferences should be award shows, with categories like "Best paper" "Best first author" "Best supporting author" "Best Figure 1" "Best appendix" Or creative categories like "I can't believe this worked" "I can't believe this didn't scale" "a theory so beautiful I cried"
  • user avatar
    70+lbs of GPUs in one suitcase... Wish me luck at the airport tomorrow 🥲 #NeurIPS2022
  • user avatar
    [PAPER POLICE AT WORK] Pretty cool vis, and solid paper, but do we really have to blow it up as to say LLMs "learn a world model"?? The result basically says all similar words (wrt location or time) are well clustered in the latent space -- a finding already known from word2vec.
    Do language models have an internal world model? A sense of time? At multiple spatiotemporal scales? In a new paper with @tegmark we provide evidence that they do by finding a literal map of the world inside the activations of Llama-2!
  • user avatar
    I am meeting ML researchers all the time. There are two worlds. One world's problem is "should I go to Google or DeepMind next summer" and the other is "a small opportunity, however small, please." All problems are valid. But my heart sinks at this chilling, disorienting divide.
  • user avatar
    Decision tree on “Do I seriously want to get a PhD” by @huanchenzhang via 知乎 (Zhihu)
  • user avatar
    Could it be that RLHF finetuning works not because it's RL, not because it's HF, not even because it's finetuning, but just that it's *rating full sentences* instead of next token? Could it be that in the end, it's just a change of training objectives that made the difference?
  • user avatar
    Over the past year, I grew linearly in age, exponentially in size, stubbornly in wisdom, inversely in AI knowledge, and fluctuatingly in my outlook for the future. Entering a new decade is scary, but slightly less so if you have somewhere to place hope. Happy birthday to me 😛
  • user avatar
    Last day of the very special @COLM_conf !! Surprise, surprise, I am actually here to present a poster, rather than just tweet 😆 Stop by poster #3 this afternoon if you want to learn about training LMs entirely, and from scratch, on knowledge graphs! Why, How and What we learned.
  • user avatar
    A personal update: I just joined Google DeepMind
    The phenomenal teams from Google Research’s Brain and @DeepMind have made many of the seminal research advances that underpin modern AI, from Deep RL to Transformers. Now we’re joining forces as a single unit, Google DeepMind, which I’m thrilled to lead! dpmd.ai/announcing-goo…