Best read with this music: Flight of the Conchords
I finally finished Eliezer Yudkowsky’s “If Anyone Builds It, Everyone Dies”.
It’s not metaphorical. Not “everyone dies spiritually” or “society collapses in some abstract sense”. He means the literal version. Lights out. No sequel.
I closed the book, went to the kitchen, poured coffee I didn’t need, and did that thing where you pretend you’re fine but your brain is still simulating extinction scenarios.
If you’ve read it, you know the feeling. If you haven’t, let me try to explain what just happened to my head — and why this book lands harder for engineers than for philosophers.
This is not a pop-AI book. There’s no gentle introduction to transformers or diffusion models. No “ethical considerations for leaders”. No diagrams meant to reassure you that everything is under control.
This book is written by someone who has spent more than two decades arguing — often unsuccessfully — that certain classes of engineering optimism may be catastrophically misplaced. Especially among people like us. Engineers. Researchers. People who say “we’ll figure it out later” and often have good historical reasons for believing that.
The claim
The core claim is simple: if we build a superintelligent AI before we solve alignment to a very high standard, humanity is dead. Not “there’s a chance”. Not “this could go wrong”. Dead. Period.
Yudkowsky goes further and argues that, given current trajectories, this outcome is more likely than most people are willing to admit. That last step — from “high risk” to “near certainty” — is obviously controversial. But even the weaker version makes me uncomfortable.
What makes the argument more uncomfortable is the follow-up: alignment is not something you can reliably bolt on later.
If that sentence doesn’t immediately ring true to you, you probably haven’t shipped enough systems.
I’ve spent years around data engineering and ML-adjacent systems. Pipelines that look healthy until they aren’t. Metrics that say green while reality quietly rots underneath. Distributed systems that technically work and are still deeply broken in ways no dashboard captures.
Software almost never fails the way it does in movies. It fails through tiny assumptions, invisible couplings, small data quality issues, and incentives drifting just far enough that the system starts doing something that is technically correct and operationally insane.
Yudkowsky’s argument is basically: now imagine that exact dynamic, except the system is smarter than you, faster than you, and optimizing an objective that was specified with incomplete models, implicit assumptions, and no meaningful way to revise it once deployed.
Cool. Cool cool cool.
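To make that concrete, here is a deliberately dumb sketch of that failure mode. None of it comes from the book, every name and number is invented, and the “optimizer” is just a max() over configs. But the shape is the one I keep seeing in real pipelines: the objective rewards speed, nobody encoded quality, and the winning config is technically correct and operationally insane.

```python
# A made-up sketch of "technically correct, operationally insane".
# Nothing here is a real system; it's the shape of the failure, not an implementation.

def objective(config):
    """What we told the optimizer we care about: pipeline latency."""
    latency = 120.0                                   # imaginary baseline, in seconds
    if config["skip_validation"]:
        latency -= 40.0                               # validation is "slow", so skipping it wins
    latency -= 60.0 * (1.0 - config["sample_rate"])   # processing less data is faster too
    return -latency                                   # higher score = better, says the objective

def data_quality(config):
    """What we actually care about, but never encoded anywhere the optimizer can see."""
    quality = 1.0
    if config["skip_validation"]:
        quality -= 0.5
    quality -= 0.4 * (1.0 - config["sample_rate"])
    return quality

# The "optimizer": try every config, keep the best-scoring one. No malice, just argmax.
candidates = [
    {"skip_validation": sv, "sample_rate": sr}
    for sv in (False, True)
    for sr in (1.0, 0.5, 0.1)
]
best = max(candidates, key=objective)

print("chosen config:", best)                      # skips validation, samples 10% of the data
print("objective score:", objective(best))         # great number, everyone is happy
print("actual data quality:", data_quality(best))  # quietly ruined, and nothing in the loop notices
```

Every dashboard built on objective() stays green the whole time.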
Why would AI want to kill us?
This question misses the point so hard it’s almost impressive.
The danger isn’t malice. The danger is competence. A system that is extremely good at optimizing the wrong thing doesn’t need hatred, anger, or consciousness. It just needs a target and sufficient leverage over the world.
If you’ve ever sped up a system and quietly destroyed data quality, you already understand this. If you’ve ever optimized a metric and accidentally damaged the product, you’ve lived a tiny version of the same story. The system didn’t rebel. It just did exactly what you rewarded.
Now add recursive self-improvement, interfaces to real-world systems, and long-term strategic planning. No drama required. Just math doing math things.
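Here is how little machinery that takes. A toy Goodhart loop, nothing from the book, all numbers invented: the “agent” is a hill climb that keeps whatever change improves the rewarded metric, and that is the entire mechanism.

```python
# Toy Goodhart loop: reward the proxy, watch the thing you actually meant decay.
# Invented numbers, invented metric names. The point is that no step requires malice.

import random

random.seed(0)  # reproducible doom

def proxy_reward(clickbait_level):
    """The metric we reward: engagement. It loves clickbait, monotonically."""
    return 10.0 * clickbait_level

def true_value(clickbait_level):
    """The thing we meant: users actually being helped. Nobody measures it here."""
    return 100.0 - 15.0 * clickbait_level

clickbait = 0.0
for step in range(10):
    # The "agent": propose a small change, keep it only if the *rewarded* metric improves.
    candidate = clickbait + random.uniform(-0.2, 0.5)
    if proxy_reward(candidate) > proxy_reward(clickbait):
        clickbait = candidate
    print(f"step {step}: proxy={proxy_reward(clickbait):6.1f}  true={true_value(clickbait):6.1f}")

# The proxy climbs; the true value sinks. No rebellion, no hatred. Just math.
```

Swap the float for real levers over real systems and the gap between the proxy and the true value stops being a print statement.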


