🥳🥳
Thrilled to share that I've joined Google DeepMind as a Research Scientist.
Super excited for what's to come!
Michaël Eli Sander
Research Scientist at Google DeepMind
- New paper! Sinkformers: Transformers with Doubly Stochastic Attention We use Sinkhorn instead of SoftMax to make attention doubly stochastic. It promotes a democratic principle. With @PierreAblin, @mblondel_ml & @gabrielpeyre Paper (AISTATS 🥳): arxiv.org/abs/2110.11773 1/8
- Very proud of this paper accepted #NeurIPS2022 🥳 "Do ResNets discretize Neural ODEs?" arxiv.org/abs/2205.14612 w. @PierreAblin @gabrielpeyre 🙏 We study the convergence of ResNets to Neural ODEs and train ResNets with a discrete adjoint method. 1/9
- Very happy to share that starting today I'll be a Student Researcher @GoogleAI 🧠 for the next 6 months!
- 🚨🚨New ICML 2024 Paper: arxiv.org/abs/2402.05787 How do Transformers perform In-Context Autoregressive Learning? We investigate how causal Transformers learn simple autoregressive processes of order 1. with @RGiryes, @btreetaiji, @mblondel_ml and @gabrielpeyre 🙏
- Space complexity for training Transformers becomes huge as the sequence/batch size increases. These memory requirements can be significantly reduced with a Momentum ResNet version of any Transformer. With momentumnet: github.com/michaelsdr/mom… Tuto: colab.research.google.com/drive/1zAyNz2m… 1/4
- You can now use Momentum ResNets, a drop-in replacement for ResNets with a significantly reduced memory footprint, with a PyTorch API! pip install momentumnet Website: michaelsdr.github.io/momentumnet/ Code: github.com/michaelsdr/mom… with @PierreAblin, @mblondel_ml and @gabrielpeyre
- 🥳🥳 Proud to see the work done during my internship at Google 🧠 accepted at ICML!! With @joapuipe, Josip Djolonga, @gabrielpeyre and @mblondel_ml Preprint: arxiv.org/abs/2302.01425
- 🚀🚀Check out the final version of our spotlight paper at ICLR 2024 on the convergence of the hidden states of Residual Networks to the solution of a Neural ODE! 🚀🚀 Paper: arxiv.org/abs/2309.01213 Code: github.com/michaelsdr/imp… with @PierreMari0n, Yuhan Wu and @gerardbiau Implicit regularization of deep residual networks towards neural ODEs ift.tt/r7BLTWb
- 🥳🥳 New work: arxiv.org/abs/2309.01213 Implicit Regularization of ResNets towards Neural ODEs w. @PierreMari0n, Yu-Han Wu and @gerardbiau We show: ResNet initialized as discretization of a neural ODE -> such a discretization holds throughout training.
- Tomorrow (Wednesday) at #NeurIPS2022 we'll be presenting our paper "Do ResNets discretize Neural ODEs?", a joint work with @PierreAblin and @gabrielpeyre. Come to Hall J, #642 from 11am to 1pm if you want to know more about this work 😺 Paper: arxiv.org/abs/2205.14612
- I am at ICLR this week :) happy to chat! We will be presenting our spotlight paper with @PierreMari0n Thursday afternoon
- 👋👋🇦🇹🇦🇹 Tomorrow we present our poster at ICLR on Implicit Regularization of Deep ResNets towards Neural ODEs. 👉4:30 pm spot #210 👈 w. @PierreMari0n, YuHan Wu, @gerardbiau x.com/m_e_sander/sta…
- Come and see us today at 1:30 pm at spot #411 for our poster session!! 🚨🚨New ICML 2024 Paper: arxiv.org/abs/2402.05787 How do Transformers perform In-Context Autoregressive Learning? We investigate how causal Transformers learn simple autoregressive processes of order 1. with @RGiryes, @btreetaiji, @mblondel_ml and @gabrielpeyre 🙏















