Guillaume Lample @ NeurIPS 2024 (@GuillaumeLample) / X

614 posts

Guillaume Lample @ NeurIPS 2024

@GuillaumeLample

Cofounder & Chief Scientist Mistral.ai (@MistralAI). Working on LLMs. Ex @MetaAI | PhD @Sorbonne_Univ_ | MSc @CarnegieMellon | X11 @Polytechnique

Paris, France

Joined December 2016

Pinned
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Dec 11, 2023
Very excited to release our second model, Mixtral 8x7B, an open-weight mixture-of-experts model. Mixtral matches or outperforms Llama 2 70B and GPT-3.5 on most benchmarks, and has the inference speed of a 12B dense model. It supports a context length of 32k tokens. (1/n)
Mistral AI
@MistralAI
Dec 8, 2023
magnet:?xt=urn:btih:5546272da9065eddeb6fcd7ffddeef5b75be79a7&dn=mixtral-8x7b-32kseqlen&tr=udp%3A%2F%2Fopentracker.i2p.rocks%3A6969%2Fannounce&tr=http%3A%2F%https://t.co/g0m9cEUz0T%3A80%2Fannounce RELEASE a6bbd9affe0c2725c1b7410d66833e24
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Feb 24, 2023
Today we release LLaMA, 4 foundation models ranging from 7B to 65B parameters. LLaMA-13B outperforms OPT and GPT-3 175B on most benchmarks. LLaMA-65B is competitive with Chinchilla 70B and PaLM 540B. The weights for all models are open and available at research.facebook.com/publications/l… 1/n
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Feb 26, 2024
Today, we are releasing Mistral Large, our latest model. Mistral Large is vastly superior to Mistral Medium, handles 32k tokens of context, and is natively fluent in English, French, Spanish, German, and Italian. We have also updated Mistral Small on our API to a model that is
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Jun 8, 2020
Unsupervised Translation of Programming Languages. Feed a model with Python, C++, and Java source code from GitHub, and it automatically learns to translate between the 3 languages in a fully unsupervised way. arxiv.org/pdf/2006.03511… with @MaLachaux @b_roziere @LowikChanussot
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Sep 27, 2023
Mistral 7B is out. It outperforms Llama 2 13B on every benchmark we tried. It is also superior to LLaMA 1 34B in code, math, and reasoning, and is released under the Apache 2.0 licence. mistral.ai/news/announcin…
Mistral AI
@MistralAI
Sep 27, 2023
magnet:?xt=urn:btih:208b101a0f51514ecf285885a8b0f6fb1a1e4d7d&dn=mistral-7B-v0.1&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=https%3A%2F%https://t.co/HAadNvH1t0%3A443%2Fannounce RELEASE ab979f50d7d406ab8d0b07d09806c72c
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Jul 24, 2024
Today, we release Mistral Large 2, the new version of our largest model. Mistral Large 2 is a 123B-parameter model with a 128k context window. On many benchmarks (notably in code generation and math), it is superior to or on par with Llama 3.1 405B. Like Mistral NeMo, it was trained
Mistral AI
@MistralAI
Jul 24, 2024
mistral.ai/news/mistral-l…
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Dec 4, 2019
Our new paper, Deep Learning for Symbolic Mathematics, is now on arXiv arxiv.org/abs/1912.01412 We added *a lot* of new results compared to the original submission. With @f_charton (1/7)
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Jun 14, 2023
Life update: I recently left Meta, and we are starting Mistral.AI, a new AI company with @arthurmensch and @tlacroix6
Frontier AI LLMs, assistants, agents, services | Mistral
From mistral.ai
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
May 29, 2024
Today we are releasing Codestral-22B, our first code model! Codestral is trained on more than 80 programming languages and outperforms previous code models, including the largest ones. It is available on our API platform, through instruct and
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Jul 29, 2020
Code is now available online with pretrained models!
GitHub - facebookresearch/TransCoder: Public release of the TransCoder research project https://a...
From github.com
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
May 24, 2022
Excited to release our latest work: arxiv.org/abs/2205.11491 We present a new algorithm, HyperTree Proof Search (HTPS) inspired by the recent success of AlphaZero. Our model is able to prove mathematical theorems in a fully automated way and significantly outperforms the SOTA. 1/n
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Feb 16, 2021
New paper on code de-obfuscation: arxiv.org/abs/2102.07492 We show that if you obfuscate the names of identifiers in source code, a model can recover the original names with very high accuracy. It even works when you remove the name of each variable / function! 1/3
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Jun 21, 2019
If you want to train BERT from scratch in @PyTorch, you can check out our XLM repository! Our English model outperforms the original BERT on all GLUE tasks, although it's trained on the same data and without the next sentence prediction task github.com/facebookresear… @alex_conneau
GitHub - facebookresearch/XLM: PyTorch original implementation of Cross-lingual Language Model...
From github.com
Guillaume Lample @ NeurIPS 2024
@GuillaumeLample
Jul 12, 2019
Our new paper: Large Memory Layers with Product Keys arxiv.org/abs/1907.05242 We created a key-value memory layer that can increase model capacity for a negligible computational cost. A 12-layer transformer with a memory outperforms a 24-layer transformer, and is 2x faster! 1/2