Hands-on production incident simulations for SREs, DevOps engineers, and technical founders.
Drop into a realistic terminal with a ticking clock and a system on fire. Run commands, find the root cause, and fix it before time runs out. No setup required.
Runs in your browser. Takes 5 minutes. The only production incident simulator you can start in seconds.
Live 3D architecture, real-time logs, real commands, and a ticking clock. Can you fix it before the money runs out?

terraform destroy on production. Based on the real DataTalksClub incident that hit the front page of Hacker News. Play through an authentic Claude Code split-panel interface - the same kind of setup that caused the original disaster.
What the DevOps community said when this happened in real life
"Claude Code wiped our production database with a Terraform command. It took down the DataTalksClub course platform and 2.5 years of submissions."
"Just as someone posts that Claude Code deleted a production environment via Terraform, we see 'all those annoying manual approvals need to go away'"
"I call BS on anyone who says they check every little thing their agent does. This will happen more, not less."
10+ scenarios based on incidents that actually took down production. New ones added every 2 weeks.

Random 500s and slow page loads. The on-call engineer just quit.

3 AM. Mobile users can't connect. The website shows 'Your connection is not private'.

The entire application stack is crashing with write errors.

The API server is slowly consuming all available memory. Requests are timing out.

Checkout is completely broken. Payments can't process.

After a routine deployment, pods are continuously crashing.

After deploying a new API version, the mobile app is crashing on launch.

Users are reporting 504 errors when trying to resize images.

Massive slowdowns after a Redis restart. Everything is hitting the database.

kubectl, logs, db queries - actual commands

Streaming logs, error spikes, gauges

Revenue drops, PagerDuty fires, pressure builds

Root cause, optimal path, what you missed
Every scenario you complete earns XP. Hit milestones to unlock pro scenarios for free.
Or skip the grind - Pro unlocks all scenarios instantly.
Sign up with GitHub or Google - takes 10 seconds

Most engineers and technical founders get paged cold with zero prior experience handling a real incident. Reading runbooks doesn't build on-call instincts. YouBrokeProd drops you into realistic incident simulations so when the real page comes in at 3 AM, you've already been there.
10+ scenarios across beginner, intermediate, and advanced. New ones every 2 weeks.

Read Postgres error states, diagnose connection pool saturation, and fix replication issues without guessing.

Diagnose crashloops, OOMKills, and networking failures systematically instead of opening Stack Overflow.

Recognize credential exposure patterns, suspicious traffic, and misconfigurations that lead to real breaches.
Each scenario is a real-time simulation running in your browser. No setup. Just you, a terminal, and a production incident to solve.
Pick a scenario and difficulty. You get a briefing with symptoms, a simulated terminal, and a ticking clock.
Run real commands in the terminal - check logs, query metrics, inspect configs. Built-in hints if you get stuck.
Submit your root cause diagnosis, then apply the fix command. Scored on speed, accuracy, and efficiency.
See what you got right, what you missed, and the optimal diagnostic path. Compare your score on the leaderboard.
Run the same incident simulation across your SRE, platform, or founding engineering team. Compare scores, identify skill gaps, reduce MTTR, and build shared muscle memory for when the real pages come in. Manager reports and team leaderboards included.
Sign up free and start your first incident simulation in under a minute.