P0 INCIDENTCascading Failure via Synchronous Calls#ENG-WAR-006AmazonNetflix
Senior~30 min15:00
API CPU Usage99.2%↑ 42%
P99 Latency2450 ms↑ 400%
5xx Error Rate12.4%↑ 12%
DB Connections14,492↑ 800%
bastion-prod-1.internal — bash
[SYSTEM] War-Room terminal initialised. Bastion host connection established.
[SYSTEM] Active incident: Cascading Failure via Synchronous Calls
[SYSTEM] Type "help" for a list of investigation commands.
user@bastion:~$
Execute Remediation⚠ PROD
The non-critical "Recommendations" service slowed down. Suddenly, the critical "Checkout" service crashed. Why did a minor service take down the core business?

What is your first action?