Alexis Conneau
@alex_conneau
Co-founder and CEO waveforms.ai (@WaveFormsAI) - Ex @OpenAI GPT-4o/AVM Audio Research Lead - #Her #TARS - Ex @AIatMeta, @Polytechnique (X11)
San Francisco
Joined September 2016
Posts
  • Pinned
    Career update: After an amazing journey at @OpenAI building #Her, I’ve decided to start a new company.
  • Update: I joined @OpenAI in San Francisco
  • Just released our new XLM/mBERT pytorch model in 100 languages. Significantly outperforms the TensorFlow mBERT OSS model while trained on the same Wikipedia data. bit.ly/2KItiC4 @GuillaumeLample @Thom_Wolf @PyTorch
  • DATASET RELEASE: "CC100", the CommonCrawl dataset of 2.5TB of clean unsupervised text from 100 languages (used to train XLM-R), is now publicly available. Data: data.statmt.org/cc-100/ Script: bit.ly/3oC6aXy By @VishravC et al.
  • 👨‍🔬Life update: Happy to share that I recently joined @GoogleAI Language as a research scientist 👨‍🏫 I will continue my research on building neural networks that can learn with little to no supervision
  • Replying to @alex_conneau
    More to come soon. Reach out if you're excited about building something #magical - we're hiring.
  • @OpenAI #GPT4o #Audio Extremely excited to share the results of what I've been working on for 2 years. GPT models now natively understand audio: you can talk to the Transformer itself! The feeling is hard to describe so I can't wait for people to speak to it. #HearTheAGI 🧵1/N
    Introducing GPT-4o, our new model which can reason across text, audio, and video in real time. It's extremely versatile, fun to play with, and is a step towards a much more natural form of human-computer interaction (and even human-computer-computer interaction):
  • Happy to share our latest paper: "Self-training Improves Pretraining for Natural Language Understanding". We show that self-training is complementary to strong unsupervised pretraining (RoBERTa) on a variety of tasks. Paper: arxiv.org/abs/2010.02194 Code: github.com/facebookresear…
  • New work: "Unsupervised speech recognition". TL;DR: it's possible for a neural network to transcribe speech into text with very strong performance, without being given any labeled data. Paper: ai.facebook.com/research/publi… Blog: ai.facebook.com/blog/wav2vec-u… Code: github.com/pytorch/fairse…
    Today we are announcing our work on building speech recognition models without any labeled data! wav2vec-U rivals some of the best supervised systems from only two years ago. Paper: ai.facebook.com/research/publi… Blog: ai.facebook.com/blog/wav2vec-u… Code: github.com/pytorch/fairse…
  • Excited to announce the creation of WaveForms AI (waveforms.ai) – an Audio LLM company aiming to solve the Speech Turing Test and bring Emotional Intelligence to AI @WaveFormsAI
  • Our new paper: Unsupervised Cross-lingual Representation Learning at Scale arxiv.org/pdf/1911.02116… We release XLM-R, a Transformer MLM trained in 100 langs on 2.5 TB of text data. Double digit gains on XLU benchmarks + strong per-language performance (~XLNet on GLUE). [1/6]
  • [XLSR-53: Multilingual Self-Supervised Speech Transformer] We're happy to release XLSR-53: a wav2vec 2.0 model pre-trained on 56k hours of speech in 53 languages from MLS, CommonVoice and BABEL datasets! Model: github.com/pytorch/fairse… Updated paper: arxiv.org/abs/2006.13979 1/N
  • Career update: A month ago, I re-joined FAIR at @MetaAI as a research scientist. I am continuing my work on self-supervised learning for Language.
  • This video clip should appear at the beginning of any AI movie, in the classic flashbacks
    A demo from 1993 of 32-year-old Yann LeCun showing off the world's first convolutional network for text recognition. #tbt #ML #neuralnetworks #CNNs #MachineLearning