We discovered that language models leave a natural "signature" on their API outputs that's extremely hard to fake. Here's how it works 🔍
📄 arxiv.org/abs/2510.14086 1/
Wanna know gpt-3.5-turbo's embed size? We find a way to extract info from LLM APIs and estimate gpt-3.5-turbo’s embed size to be 4096. With the same trick we also develop 25x faster logprob extraction, audits for LLM APIs, and more!
📄 arxiv.org/abs/2403.09539
Here’s how 1/🧵
Can a language model help you with your math homework? Not on its own, but maybe with the help of a Python interpreter!
In our EMNLP paper we present 📜 Līla and 🤖 Bhāskara, a math reasoning benchmark and model.
📄: arxiv.org/abs/2210.17517
🔗: lila.apps.allenai.org
1/🧵
🎉 New pre-print 🎉
Teaching models to follow instructions is becoming popular, but what makes Instruction Learning hard in the first place? We investigate with synthetic data and build a challenge dataset!
My first paper with @ai2_aristo at @allen_aiarxiv.org/abs/2204.09148
In a remarkable case of simultaneous discovery, a paper released earlier this week (arxiv.org/abs/2403.06634) also finds that LLM APIs leak information. We are excited for our colleagues and believe that our papers complement and strengthen one another. Amazing work! 8/8
Google presents:
Stealing Part of a Production Language Model
- Extracts the projection matrix of OpenAI’s ada and babbage LMs for <$20
- Confirms that their hidden dim is 1024 and 2048, respectively
- Also recovers the exact hidden dim size of gpt-3.5-turbo
📣 Today at EMNLP I will be giving my first-ever in-person oral presentation!!! Come hear about how we used formal languages to learn what kinds of instructions LMs can follow 🤖
Hall A-B at 2pm
🎉 New pre-print 🎉
Teaching models to follow instructions is becoming popular, but what makes Instruction Learning hard in the first place? We investigate with synthetic data and build a challenge dataset!
My first paper with @ai2_aristo at @allen_aiarxiv.org/abs/2204.09148
7/🧵 Curious about the name choices? Our benchmark is named after Līlavati, a treatise by 12th century Indian mathematician Bhāskara. Read more in our blog post. blog.allenai.org/lila-a-unified…
LLM outputs lie in a low-dimensional vector space (we call this the LLM’s image). By “low-dimensional”, we mean EXACTLY the embed size. To find the embed size we check the dimension of the space that the outputs span.
Ok, but does this have any practical uses? 2/🧵
Can a language model help you with your math homework? Not on its own, but maybe with the help of a Python interpreter!
In our EMNLP paper we present 📜 Līla and 🤖 Bhāskara, a math reasoning benchmark and model.
📄: arxiv.org/abs/2210.17517
🔗: lila.apps.allenai.org
1/🧵
5/🧵 We find that models perform much better when they output a Python program that prints the answer, instead of directly generating the answer. This answer format has the added benefit that it doubles as a step-by-step solution.
It was really fun to contribute to this bit of software, and it has some pretty useful applications! If you want logprobs beyond the top-5 that OpenAI (or any other API) gives you, check out our library 🎉
fun research story about how we jailbroke the the chatGPT API:
so every time you run inference with a language model like GPT-whatever, the model outputs a full probabilities over its entire vocabulary (~50,000 tokens)
but when you use their API, OpenAI hides all this info from