P0 INCIDENT
Replication Lag causing Stale Reads
#ENG-WAR-008
Meta | Instagram | Mid | ~20 min | 15:00
API CPU Usage: 99.2% (↑ 42%)
P99 Latency: 2450 ms (↑ 400%)
5xx Error Rate: 12.4% (↑ 12%)
DB Connections: 14,492 (↑ 800%)
bastion-prod-1.internal — bash
[SYSTEM] War-Room terminal initialised. Bastion host connection established.
[SYSTEM] Active incident: Replication Lag causing Stale Reads
[SYSTEM] Type "help" for a list of investigation commands.
user@bastion:~$
Execute Remediation
⚠ PROD
Users report that after they update their profile picture, their feed still shows the old picture immediately after saving.
What is your first action?
A. Invalidate the CDN cache on every profile update
   cf purge --tag=profile-images
B. Disable read replicas entirely and read from primary always
   DB_READ_HOST=$DB_PRIMARY_HOST
C. After a write, force reads to the Primary for 5 seconds
   Set session flag: read_from_primary=true for 5s TTL
D. Add more read replicas to reduce per-replica lag
   terraform apply -var replica_count=5
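
To make option C's session flag concrete, here is a minimal Python sketch of pinning a session's reads to the primary for a short window after a write. It is an illustration under assumptions, not the platform's actual routing code: the `ReadRouter` class, its method names, the in-memory dictionary, and the `"primary"` / `"replica"` return values are all hypothetical. Only the 5-second TTL comes from the option text.

```python
import time

# TTL taken from option C; everything else in this sketch is hypothetical.
PRIMARY_PIN_TTL_SECONDS = 5.0


class ReadRouter:
    """Routes a session's reads to the primary for a short time after it writes."""

    def __init__(self, ttl=PRIMARY_PIN_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock        # injectable so the behaviour can be tested
        self._pinned_until = {}    # session_id -> monotonic deadline

    def record_write(self, session_id):
        """Call after a successful write: start (or restart) the pin window."""
        self._pinned_until[session_id] = self._clock() + self._ttl

    def choose_host(self, session_id):
        """Return "primary" while the pin is active, otherwise "replica"."""
        deadline = self._pinned_until.get(session_id)
        if deadline is None:
            return "replica"
        if self._clock() < deadline:
            return "primary"
        # Pin expired: drop the entry so the map does not grow without bound.
        del self._pinned_until[session_id]
        return "replica"


if __name__ == "__main__":
    router = ReadRouter()
    router.record_write("session-42")
    print(router.choose_host("session-42"))  # read issued right after the write
    print(router.choose_host("session-99"))  # a session that has not written
```

Only sessions that have just written are sent to the primary, so other read traffic keeps going to the replicas. A real deployment would likely keep the flag in the session store or a signed cookie instead of process memory, so that every API instance sees the same pin.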