[SYSTEM] Active incident: SLO Budget Burned by a Single Chatty Client
Your API has a 99.9% SLO. Datadog shows your error rate has been at 5% for the past 3 hours, and you've burned 80% of your monthly error budget. The errors are all 429 Too Many Requests responses returned to a single API key (client_id: ABC-corp-integration). Their integration sends requests in a tight retry loop that ignores 429 responses. How do you protect your SLO going forward?
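To make the budget arithmetic in the scenario concrete, here is a minimal sketch of how a request-based error budget burn could be computed. The traffic volumes, the 30-day window, and the function names are assumptions chosen for illustration; only the 99.9% target, the 5% error rate, and the 3-hour duration come from the scenario.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: the SLO is measured over a 30-day window


def error_budget(slo_target: float, monthly_requests: int) -> float:
    """Number of failed requests the SLO tolerates over the month."""
    return (1.0 - slo_target) * monthly_requests


def budget_burned(
    slo_target: float,
    monthly_requests: int,
    incident_requests: int,
    incident_error_rate: float,
) -> float:
    """Fraction of the monthly error budget consumed by a single incident."""
    incident_errors = incident_error_rate * incident_requests
    return incident_errors / error_budget(slo_target, monthly_requests)


if __name__ == "__main__":
    slo = 0.999                    # from the scenario
    error_rate = 0.05              # from the scenario
    incident_hours = 3             # from the scenario
    monthly = 100_000_000          # hypothetical expected monthly traffic
    incident_volume = 1_600_000    # hypothetical requests seen during the incident

    # Traffic a 3-hour window would carry if requests were spread evenly.
    normal_volume = monthly * incident_hours // HOURS_PER_MONTH

    print(f"budget: {error_budget(slo, monthly):,.0f} failed requests per month")
    print(f"burned at normal volume:   {budget_burned(slo, monthly, normal_volume, error_rate):.0%}")
    print(f"burned at incident volume: {budget_burned(slo, monthly, incident_volume, error_rate):.0%}")
```

Under these assumed volumes, a 5% error rate over 3 hours of evenly spread traffic would consume roughly a fifth of the budget, while the same rate over the inflated volume consumes about 80%. This suggests the retry loop hurts twice: it produces the 429s and it multiplies the number of requests that count against the SLI.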