This is why I left DeepMind to build an AI security company. I've seen firsthand what RL can do for code generation. Once you treat exploit generation as an RL problem, no software is safe.
the CIA is not ready for the RL era
Israeli intelligence guy just hacked into a live surveillance camera in front of me with an exploit generated by Qwen
vulnerable software is simulatable.
penetration success is verifiable.
hacking is RLable.
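
A toy sketch of the three lines above, assuming nothing about any real system: the "target" is a made-up Python function with one planted trigger input, the reward is a plain string check, and the learner is an optimistic epsilon-greedy bandit. Every name here (`toy_target`, `TRIGGER`, `FLAG`, `train`) is hypothetical and only illustrates the simulate / verify / learn loop.

```python
import random

# Everything below is a hypothetical toy, not a model of real software.
FLAG = "FLAG{toy}"
ACTIONS = list(range(16))   # stand-in for a discrete space of candidate inputs
TRIGGER = 11                # the single planted "bug" in the toy target


def toy_target(action: int) -> str:
    """Simulatable: the target runs in-process and can be reset for free."""
    return FLAG if action == TRIGGER else "denied"


def reward(output: str) -> float:
    """Verifiable: success is a binary check, no human judgment needed."""
    return 1.0 if FLAG in output else 0.0


def train(episodes: int = 500, epsilon: float = 0.1, seed: int = 0) -> list[float]:
    """RLable: an epsilon-greedy bandit with optimistic initial values."""
    rng = random.Random(seed)
    value = [1.0] * len(ACTIONS)   # optimistic start pushes the agent to try everything
    count = [0] * len(ACTIONS)
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)                      # explore
        else:
            a = max(ACTIONS, key=lambda i: value[i])     # exploit current estimate
        r = reward(toy_target(a))
        count[a] += 1
        value[a] += (r - value[a]) / count[a]            # running mean of reward
    return value


values = train()
best = max(ACTIONS, key=lambda i: values[i])
print("learned input:", best)
```

The point of the sketch is only the shape of the loop: a cheap simulator plus a checkable success signal is enough for a reward-driven learner to converge, which is the claim being made about real targets.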






