GLADIA Research Lab (@GladiaLab) / X

GLADIA Research Lab

@GladiaLab

Based in Rome, GLADIA is a team of computer scientists, physicists, engineers and mathematicians venturing beyond the boundaries of machine intelligence

Rome

Joined May 2025

Pinned
GLADIA Research Lab
@GladiaLab
Oct 25, 2025
Introducing the GLADIA Research Lab.
53K
GLADIA Research Lab
@GladiaLab
Oct 27, 2025
LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)
5.1M
GLADIA Research Lab
@GladiaLab
Oct 30, 2025
After reading many of the replies, we would like to issue a few clarifications:
- we cannot extract training data from the model using our method
- LLMs are not injective w.r.t. the output text, that function is definitely non-injective and collisions occur all the time
-
191K
GLADIA Research Lab
@GladiaLab
Oct 27, 2025
Replying to @GladiaLab
Language models are structurally lossless:
- Hidden states do not compress or abstract the prompt;
- Any system storing them effectively stores the input text itself;
- This impacts privacy, deletion, and compliance: once data enters a Transformer, it remains recoverable.
(5/6)
137K
GLADIA Research Lab
@GladiaLab
Oct 27, 2025
Replying to @GladiaLab
Injectivity is not accidental, but a structural property of language models! We show that:
• Transformers are real-analytic by composition
• At initialization, collisions occur with probability zero
• Gradient descent preserves this property throughout training
(2/6)
173K
GLADIA Research Lab
@GladiaLab
Oct 27, 2025
Replying to @GladiaLab
But what can we do with injectivity? Well, for one, we can invert language models! We introduce SipIt, an algorithm that exactly reconstructs the input from hidden states in guaranteed linear time. SipIt recovers inputs >100× faster than alternatives, while remaining exact.
119K
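A toy sketch of the idea behind token-by-token inversion, assuming a causal model whose state at position t depends only on the tokens up to t. This is not SipIt itself: it brute-forces the vocabulary at each position rather than following the paper's procedure, and `step`, `EMB`, `MIX`, and the sizes below are hypothetical stand-ins for a real Transformer.

```python
import math
import random

VOCAB_SIZE = 50   # hypothetical toy vocabulary size
DIM = 8           # hypothetical hidden size

rng = random.Random(0)
# Random token embeddings and a random mixing matrix (toy stand-ins for
# trained Transformer weights).
EMB = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB_SIZE)]
MIX = [[rng.gauss(0, 1) / math.sqrt(DIM) for _ in range(DIM)] for _ in range(DIM)]


def step(prev_state, token):
    """Causal update: the new state depends only on the prefix state and the token."""
    return [
        math.tanh(sum(MIX[i][j] * prev_state[j] for j in range(DIM)) + EMB[token][i])
        for i in range(DIM)
    ]


def hidden_states(tokens):
    """Return one hidden state per position for the given token sequence."""
    state = [0.0] * DIM
    states = []
    for tok in tokens:
        state = step(state, tok)
        states.append(state)
    return states


def invert(states, tol=1e-9):
    """Recover tokens one position at a time by testing every vocabulary entry."""
    state = [0.0] * DIM
    recovered = []
    for target in states:
        matches = [
            v for v in range(VOCAB_SIZE)
            if max(abs(a - b) for a, b in zip(step(state, v), target)) <= tol
        ]
        if len(matches) != 1:
            # Zero or several candidates would mean the toy map is not injective here.
            raise ValueError("no unique token reproduces this hidden state")
        recovered.append(matches[0])
        state = target  # the observed state becomes the prefix for the next position
    return recovered


prompt = [rng.randrange(VOCAB_SIZE) for _ in range(20)]
assert invert(hidden_states(prompt)) == prompt
```

The loop runs once per position and tests at most `VOCAB_SIZE` candidates each time, so the cost of this sketch grows linearly with prompt length.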
GLADIA Research Lab
@GladiaLab
Oct 27, 2025
Replying to @GladiaLab
We back our theory with an extensive empirical confirmation. Across billions of prompt pairs and several model sizes, we find no collisions: no two prompts are mapped to the same hidden states! (3/6)
125K
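A minimal sketch of what a collision check can look like, assuming a collision means two distinct prompts whose hidden-state vectors coincide (or fall within a small tolerance). `toy_embed` is a hypothetical stand-in for a real model's last-token hidden state, and the prompt set is tiny compared with the billions of pairs mentioned above.

```python
import itertools
import math
import random


def toy_embed(prompt, dim=6):
    """Hypothetical stand-in for a model's last-token hidden state."""
    state = [0.0] * dim
    for tok in prompt:
        tok_rng = random.Random(tok)  # deterministic pseudo-embedding per token
        emb = [tok_rng.gauss(0, 1) for _ in range(dim)]
        # Shift the previous state by one slot so that token order matters.
        state = [math.tanh(0.5 * state[i - 1] + emb[i]) for i in range(dim)]
    return state


def min_pairwise_distance(vectors):
    """Smallest Euclidean distance over all unordered pairs of vectors."""
    return min(math.dist(a, b) for a, b in itertools.combinations(vectors, 2))


# All length-3 prompts over a 5-token vocabulary: 125 distinct prompts.
prompts = list(itertools.product(range(5), repeat=3))
vectors = [toy_embed(p) for p in prompts]
closest = min_pairwise_distance(vectors)
print(f"{len(prompts)} prompts, minimum pairwise distance {closest:.3e}")
```

A minimum distance of zero would indicate a collision; a strictly positive minimum means every prompt in the tested set maps to a distinct vector.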
GLADIA Research Lab
@GladiaLab
Oct 27, 2025
Replying to @GladiaLab
Preprint: arxiv.org/abs/2510.15511 Joint work w/ @GiorgosNik02 @tommaso_mncttn @DonatoCrisosto1 @teelinsan Yannis Panagakis @EmanueleRodola stay tuned! (6/6)
arxiv.org
Language Models are Injective and Hence Invertible
Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of...
105K
GLADIA Research Lab
@GladiaLab
Oct 28, 2025
Replying to @fs9h7kh4b5
real
57K
GLADIA Research Lab
@GladiaLab
Oct 30, 2025
Replying to @theowwrld
it's over
6.5K
GLADIA Research Lab
@GladiaLab
Oct 30, 2025
Replying to @crystalmask
no
4.4K
GLADIA Research Lab
@GladiaLab
Oct 25, 2025
Replying to @GladiaLab
But also, bold frontier ideas, like @tensorqt's series "The graph side of Attention". The series opens with a post explaining attention sinks as a bias in causal Transformers:
3.5K
GLADIA Research Lab
@GladiaLab
Oct 30, 2025
Replying to @larrytheliquid
@Pringles
645
GLADIA Research Lab
@GladiaLab
Oct 25, 2025
Replying to @GladiaLab
We will use this page to popularize our research and deliver tailor-made blogposts, outlining our vision for the future of Machine Learning. Welcome to GLADIA.
More on us: gladia.netlify.app
Our blog: gladia-research-group.github.io/blog/
2.3K