P0 INCIDENT: SLO Budget Burned by a Single Chatty Client (#ENG-WAR-049)
Google · Cloudflare
Mid · ~20 min
API CPU Usage: 99.2% (↑ 42%)
P99 Latency: 2450 ms (↑ 400%)
5xx Error Rate: 12.4% (↑ 12%)
DB Connections: 14,492 (↑ 800%)
Your API has a 99.9% SLO. Datadog shows your error rate has been at 5% for the past 3 hours, and you have burned 80% of your monthly error budget. The errors are all 429 Too Many Requests responses to a single API key (client_id: ABC-corp-integration). That client's integration sends requests in a tight retry loop that ignores 429 responses. How do you protect your SLO going forward?
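To make the error-budget figures concrete, here is a minimal sketch of the burn-rate arithmetic. It assumes a 30-day SLO window and roughly uniform request volume, neither of which the scenario states. The function names `burn_rate` and `budget_consumed` are hypothetical helpers, not part of Datadog or any other tool.

```python
def burn_rate(error_rate: float, slo: float) -> float:
    """Return how many times faster than sustainable the budget is being spent.

    A burn rate of 1.0 would use up exactly the whole budget over the window.
    """
    error_budget = 1.0 - slo  # a 99.9% SLO leaves a 0.1% error budget
    return error_rate / error_budget


def budget_consumed(
    error_rate: float,
    slo: float,
    incident_hours: float,
    window_hours: float = 30 * 24,  # assumption: 30-day SLO window
) -> float:
    """Return the fraction of the window's budget consumed by the incident.

    Assumes request volume is uniform across the window.
    """
    return burn_rate(error_rate, slo) * incident_hours / window_hours


rate = burn_rate(error_rate=0.05, slo=0.999)
consumed = budget_consumed(error_rate=0.05, slo=0.999, incident_hours=3)

print(f"burn rate: {rate:.0f}x")            # burn rate: 50x
print(f"budget consumed: {consumed:.1%}")   # budget consumed: 20.8%
```

Under these assumptions, three hours at a 5% error rate consume about 21% of the monthly budget. The scenario's 80% figure would therefore imply something more, for example much heavier-than-average traffic during the incident (plausible with a client in a tight retry loop) or budget already spent earlier in the month.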

What is your first action?