Vaishaal Shankar
@Vaishaal
Berkeley, CA
Joined August 2008
Posts
  • Finally get to try the new bart trains! I've been literally waiting months for this. Will live tweet the experience.
  • We have released our DCLM models on huggingface! To our knowledge these are by far the best performing truly open-source models (open data, open weight models, open training code) 1/5
  • I am really excited to introduce DataComp for Language Models (DCLM), our new testbed for controlled dataset experiments aimed at improving language models. 1/x
  • Replying to @Vaishaal
    Looks like the good-ol reboot trick didn't work. Now we are going to exit the train! In the tunnel! This is the most exciting commute I've had
  • Replying to @Vaishaal
    They are trying to fix the train by turning the computer off and on again. I am sure this will work.
  • Replying to @Vaishaal
    The bart technician says this is all because "all this technology is too new" [points to train].
  • Replying to @Vaishaal
    Made it outside. Thanks @SFBART workers for handling the situation and getting everybody out safely!
  • I had an argument with @PreetumNakkiran about MLPs 4 years ago. He said with enough data + compute the MLP/ConvNet gap would go to 0. I was convolution-pilled and convinced this wasn't possible. He was right:
  • Replying to @hankgreen
    It's a multiple choice exam that covers ~57 subjects. It's generally a good benchmark for capabilities of a model. 90% just means the model got 90% of these questions right. The paper is not a terrible read:
  • Replying to @Vaishaal
    these trains are so much quieter I can hear the sound of my own thoughts
  • Neural Kernels Without Tangents arxiv.org/abs/2003.02237 Joint work with Alex Fang, @WSguo, Sara Fridovich-Keil, @lschmidt3, @jrk and @beenwrekt Taking inspiration from convolutional networks, we construct high performance kernel functions for image classification (1/6)
  • Replying to @Vaishaal
    Well we evacuated the train successfully! Turns out we were only a few hundred yards from 12th street bart
  • Replying to @Vaishaal
    Welp so much for that honeymoon phase. Looks like the train stopped abruptly in the tunnel before 12th street.
  • Replying to @Vaishaal
    They have successfully dimmed the lights and turned off the AC.