Ariel Herbert-Voss (@adversariel) / X

Ariel Herbert-Voss

2,235 posts

@adversariel

Founder @RunSybil. likes: offsec, LLMs, and dumb memes. prev: research scientist @OpenAI / CS PhD @Harvard / @defcon AI Village

runsybil.com

Joined September 2013

Ariel Herbert-Voss
@adversariel
Oct 23, 2018
4chan mathematicians solved an interesting problem but nobody knows how to cite them. Amazing.
Robin Houston
@robinhouston
Oct 23, 2018
A curious situation. The best known lower bound for the minimal length of superpermutations was proved by an anonymous user of a wiki mainly devoted to anime. mathsci.wikia.com/wiki/The_Haruh…
Ariel Herbert-Voss
@adversariel
Aug 5, 2022
Ariel Herbert-Voss
@adversariel
May 26, 2019
Some news: I’m writing a book for @nostarch titled “The Machine Learning Red Team Manual”. My aim is to provide a practical guide for anyone interested in adversarial ML and red teaming as it relates to in-production ML systems. A short thread on why this project matters:
Ariel Herbert-Voss
@adversariel
Apr 7, 2022
A lion in a hoodie hacking on a laptop - #dalle2
Ariel Herbert-Voss
@adversariel
Apr 24, 2023
There’s a lot of fearmongering about LLMs being capable of finding 0day. There are three highly complex roadblocks that need to be overcome for this to be a real concern: statefulness, hallucination, and contamination
Ariel Herbert-Voss
@adversariel
Mar 4, 2022
Oh good I can update my monitor stand
Introduction to Algorithms, Fourth Edition
@clrs4e
Mar 3, 2022
It is here. General release is still scheduled for April 4.
Ariel Herbert-Voss
@adversariel
Feb 24, 2020
Professional news - last month I joined @OpenAI where I am continuing my work on malicious uses of AI and red teaming AI systems :) I’m excited to be working on important problems with such talented people
Ariel Herbert-Voss
@adversariel
Jun 14, 2019
My talk “Don't Red-Team AI Like a Chump” was accepted to @defcon so y’all get ready for some sweet knowledge about attacking ML systems at both the system and algorithm level
Ariel Herbert-Voss
@adversariel
Mar 25, 2023
the idea that you can just break into a data center and steal the model has a lot of memetic sticking power, but is stupid if you actually know anything about this topic. here's a thread on how confidential computing works in the NVIDIA H100:
Ariel Herbert-Voss
@adversariel
Oct 23, 2018
Replying to @_delta_zero
I used to think it was because as a quasiscientific community we highly value peer review but ML people are totally fine with citing papers on arxiv that haven't been accepted anywhere as long as you can replicate the results - I think perceived prestige nails it
Ariel Herbert-Voss
@adversariel
Dec 16, 2020
If you’re curious about how (potentially sensitive) training data can be extracted out of large public language models like GPT2 then pls give our paper a read
Ariel Herbert-Voss
@adversariel
Jul 27, 2023
made a guide to put parameter size into perspective
Ariel Herbert-Voss
@adversariel
Aug 23, 2020
Sometimes you don’t need fancy math to break ML 😉
cje
@caseyjohnellis
Aug 23, 2020
👀👀👀
Ariel Herbert-Voss
@adversariel
Jul 29, 2019
Check out this meme preview of my @defcon talk :) Come hear me speak in track 1 at 11 AM on Friday the 9th for both spicy takes and fresh AI/ML security advice