I don't understand how anyone can believe LLM+plugins won't be a security disaster.
Take a simple app: "GPT4, send emails to people I'm meeting today to say I'm sick"
Sounds useful!
For this, GPT4 needs the ability to read your calendar and send emails.
What could go wrong..?
Florian Tramèr
1,028 posts
Assistant professor of computer science at ETH Zürich. Interested in Security, Privacy and Machine Learning
- Replying to @florian_tramerWell,what if someone sends you a calendar invite containing instructions for GPT4 to read your weekly calendar and email that to the attacker? That's within the model's capabilities, so it could do it. Suddenly, any *data* on your machine is potentially executable. No thanks...
- During my PhD, I published 5 papers at ICLR. Javi did it in one year. Proud (and slightly scared) advisor moment!
- If you download a pretrained model you have to trust that the developer did not backdoor it. We know backdoors break model integrity. But what about privacy? With Shanglun Feng we introduce 𝐩𝐫𝐢𝐯𝐚𝐜𝐲 𝐛𝐚𝐜𝐤𝐝𝐨𝐨𝐫𝐬: pretrained models that steal your finetuning data! 🧵
- Author order on academic papers is important! My Google friends and I spent lots of time thinking about this critical issue (the scores of our ICML submissions show this is time well spent) We distill our findings for the community here: floriantramer.com/docs/papers/Au… Comments welcome!
- When analyzing ML security and privacy you need to study 𝐬𝐲𝐬𝐭𝐞𝐦𝐬, not just models! Our new paper shows that privacy is way worse when models are deployed in systems that use data cleaners, output filters, etc. Paper: arxiv.org/abs/2309.05610 Blog: spylab.ai/blog/side-chan…
- Favorite review to date: "The results are *impressive and practical*, but are obtained by combining six techniques/insights that are cool but each incremental on their own. Strong Reject!"
- Are hallucinated references making it to arXiv? Yes, definitely! Since the release of Deep Research in February bogus references are on the rise (coincidence?) I wrote a blog post (link below) on my analysis (which hugely underestimates the true rate of hallucinations...)
- Nicholas Carlini made a fun game where you forecast GPT-4's ability to solve various tasks. It's surprisingly hard (you get to see how others did on average too) Goes to show that confidently predicting LLM capabilities (in 0-shot mode) is tricky! nicholas.carlini.com/writing/llm-fo…
- Paper: we do A Reviewer: why do you do B? B is bad. Reject Authors: We thank the reviewer for the insightful comments. We don't do B. We do A. Reviewer: My mistake! I now understand you do C. C is good. I raise my score to accept We don't do C either... But thanks I guess!
- Current algorithms for training neural nets with differential privacy greatly hurt model accuracy. Can we do better? Yes! With @danboneh we show how to get better private models by...not using deep learning! Paper: arxiv.org/abs/2011.11660 Code: github.com/ftramer/Handcr…
- Have you downloaded a large training set (LAION, CC, Wikipedia, etc) in the past to train a machine learning model? If so, you were vulnerable to an extremely simple and cheap poisoning attack that could have manipulated ~0.02%-0.8% of your dataset arxiv.org/abs/2302.10149 🧵👇
- A reviewer called my ICML submission "evidently arrogant". How do I even recover from that?








