Michaël Eli Sander

224 posts

@m_e_sander

Research Scientist at Google DeepMind

Paris

Joined February 2021

Michaël Eli Sander
@m_e_sander
Oct 7, 2024
🥳🥳 Thrilled to share that I've joined Google DeepMind as a Research Scientist. Super excited for what's to come!
Michaël Eli Sander
@m_e_sander
Jan 25, 2022
New paper! "Sinkformers: Transformers with Doubly Stochastic Attention". We use Sinkhorn instead of SoftMax to make attention doubly stochastic. It promotes a democratic principle. With @PierreAblin, @mblondel_ml & @gabrielpeyre. Paper (AISTATS 🥳): arxiv.org/abs/2110.11773 1/8
Michaël Eli Sander
@m_e_sander
Sep 15, 2022
Very proud of this paper accepted #NeurIPS2022 🥳 "Do ResNets discretize Neural ODEs?" arxiv.org/abs/2205.14612 w. @PierreAblin @gabrielpeyre 🙏 We study the convergence of ResNets to Neural ODEs and train ResNets with a discrete adjoint method. 1/9
Michaël Eli Sander
@m_e_sander
Sep 19, 2022
Very happy to share that starting today I'll be a Student Researcher @GoogleAI 🧠 for the next 6 months!
Michaël Eli Sander
@m_e_sander
Jul 22, 2024
🚨🚨 New ICML 2024 paper: arxiv.org/abs/2402.05787 How do Transformers perform In-Context Autoregressive Learning? We investigate how causal Transformers learn simple autoregressive processes of order 1. With @RGiryes, @btreetaiji, @mblondel_ml and @gabrielpeyre 🙏
Michaël Eli Sander
@m_e_sander
Nov 22, 2021
Space complexity for training Transformers becomes huge as the sequence/batch size increases. These memory requirements can be significantly reduced with a Momentum ResNet version of any Transformer. With momentumnet: github.com/michaelsdr/mom… Tutorial: colab.research.google.com/drive/1zAyNz2m… 1/4
Michaël Eli Sander
@m_e_sander
Jul 20, 2021
You can now use Momentum ResNets, a drop-in replacement for ResNets with a significantly reduced memory footprint, with a PyTorch API! pip install momentumnet Website: michaelsdr.github.io/momentumnet/ Code: github.com/michaelsdr/mom… With @PierreAblin, @mblondel_ml and @gabrielpeyre
Michaël Eli Sander
@m_e_sander
Apr 25, 2023
🥳🥳 Proud to see the work done during my internship at Google 🧠 accepted at ICML!! With @joapuipe, Josip Djolonga, @gabrielpeyre and @mblondel_ml Preprint: arxiv.org/abs/2302.01425
Michaël Eli Sander
@m_e_sander
Mar 5, 2024
🚀🚀 Check out the final version of our spotlight paper at ICLR 2024 on the convergence of the hidden states of Residual Networks to the solution of a Neural ODE! 🚀🚀 Paper: arxiv.org/abs/2309.01213 Code: github.com/michaelsdr/imp… With @PierreMari0n, Yu-Han Wu and @gerardbiau
Stat.ML Papers
@StatMLPapers
Mar 4, 2024
Implicit regularization of deep residual networks towards neural ODEs ift.tt/r7BLTWb
Michaël Eli Sander
@m_e_sander
Sep 7, 2023
🥳🥳 New work: arxiv.org/abs/2309.01213 Implicit Regularization of ResNets towards Neural ODEs w. @PierreMari0n, Yu-Han Wu and @gerardbiau We show: ResNet initialized as discretization of a neural ODE -> such a discretization holds throughout training.
Michaël Eli Sander
@m_e_sander
Nov 30, 2022
Tomorrow (Wednesday) at #NeurIPS2022 we'll be presenting our paper "Do ResNets discretize Neural ODEs?", a joint work with @PierreAblin and @gabrielpeyre. Come to Hall J, #642 from 11am to 1pm if you want to know more about this work 😺 Paper: arxiv.org/abs/2205.14612
Michaël Eli Sander
@m_e_sander
May 7, 2024
I am at ICLR this week :) happy to chat! We will be presenting our spotlight paper with @PierreMari0n Thursday afternoon
Michaël Eli Sander
@m_e_sander
May 8, 2024
👋👋🇦🇹🇦🇹 Tomorrow we present our poster at ICLR on Implicit Regularization of Deep ResNets towards Neural ODEs. 👉 4:30 pm, spot #210 👈 With @PierreMari0n, Yu-Han Wu and @gerardbiau x.com/m_e_sander/sta…
Michaël Eli Sander
@m_e_sander
Jul 25, 2024
Come and see us today at 1:30 pm at spot #411 for our poster session!!