Alexandre Défossez
@honualx
Leading ambitious research @kyutai_labs. Chief Science Officer @gradiumai.
Paris, France
Joined March 2019
Posts
  • Meet Moshiko and Moshika, the open source Moshi models 📖🟢. Moshi is a 7B text-audio model, capable of doing full-duplex conversations: it can listen and speak at any time. Plus, its inner text monologue improves the generation 💬 All on device🧑‍💻 🔎kyutai.org/Moshi.pdf
    Today, we release several Moshi artifacts: a long technical report with all the details behind our model, weights for Moshi and its Mimi codec, along with streaming inference code in PyTorch, Rust and MLX. More details below 🧵 ⬇️ Paper: kyutai.org/Moshi.pdf Repo:
  • I'm happy to release the v3 of Demucs for Music Source Separation, with hybrid domain prediction, compressed residual branches and much more. Check out the code: github.com/facebookresear… Here is a demo for you @jaimealtozano, I'm sure you'll enjoy the improvements!
  • I recently discovered Perlin noise, a stochastic texture generation algorithm used to make realistic fire, smoke, clouds etc. It was developed by Ken Perlin for the CGI of Disney movie Tron in 1982 🤖 (1/N)
  • We have released our platform for source separation in music. We adapt Conv-TasNet and introduce the Demucs architecture, leading to two state-of-the-art models surpassing all previously known methods such as Wave-U-Net, Open-Unmix or Spleeter.
  • AI is nothing without open source, #keepaiopen 🤗
  • We release stereo models for all MusicGen variants (+ a new large melody both mono and stereo): 6 new models available on HuggingFace (thanks @reach_vb). We show how a simple fine-tuning procedure with codebook interleaving takes us from boring mono to immersive stereo🎧👇
  • I have extended Julius with some extra features: FFT convolutions, FIR filters and decomposition over frequency bands in the waveform domain. All in @PyTorch, differentiable and with CUDA and TorchScript support.
  • With @jadecopet, @syhw and @adiyossLC, we are releasing EnCodec, a state-of-the-art neural audio codec supporting both 24 kHz mono audio and 48 kHz stereo, with bandwidth ranging from 1.5 kbps to 24 kbps 🗜️🎤🤖 arxiv.org/pdf/2210.13438…
  • Really excited to be part of the founding team of @kyutai_labs: at the heart of our mission is doing open source and open science in AI🔬📖. Thanks so much to our founding donors for making this happen 🇪🇺 I’m thrilled to get to work with such a talented team and grow the lab 😊
    Announcing Kyutai: a non-profit AI lab dedicated to open science. Thanks to Xavier Niel (@GroupeIliad), Rodolphe Saadé (@cmacgm) and Eric Schmidt (@SchmidtFutures), we are starting with almost 300M€ of philanthropic support. Meet the team ⬇️
  • Today we release MusicGen, a text-to-music auto-regressive model built on EnCodec. It also supports optional melody conditioning based on chromagram extraction! It requires only 50 autoregressive steps per second of audio. Really fun to remix known tunes in all genres 👇 + 🧵
    We present MusicGen: A simple and controllable music generation model. MusicGen can be prompted by both text and melody. We release code (MIT) and models (CC-BY NC) for open research, reproducibility, and for the music community: github.com/facebookresear…
  • As a PhD student and RS, FAIR was a magical place to be in: - incredible mentoring in all fields of AI🧑‍🏫 - access to resources and having my own research agenda 🧭 - free and encouraged to publish and open source 📖 For a lot of us there it was a transformative experience 🧑🏻‍🚀
    Replying to @ylecun
    Meta has definitely been the best thing to happen to AI.
  • We are releasing the code for our Interspeech paper "Real Time Enhancement in the Waveform Domain" with @syhw and @adiyossLC. Watch our live demo youtu.be/77cm_MVtLfk. Want to try it? Check out our repo github.com/facebookresear… (1/2)
  • Official MusicGen now also supports extended generation (different implementation, same idea). Go to our colab to test it. And keep an eye on @camenduru for more cool stuff! Of course, I tested it with an Interstellar deep remix as lo-fi with organic samples :) colab.research.google.com/drive/1fxGqfg9…
    Good news 🥳 Now we can generate more than 30s, Thanks to rkfg ❤ and Oncorporation ❤ github.com/rkfg/audiocraf… github.com/Oncorporation/… Please try it 🐣 github.com/camenduru/Musi… 🦆 🖼 stable diffusion model Freedom Redmond by @artificialguybr
  • We do not have a demo booth at #NeurIPS2023 but the MusicGen demo is always online 💻 and all code is open source 📖, with @jadecopet and @FelixKreuk 🎶🥁 huggingface.co/spaces/faceboo…