[SYSTEM] Active incident: gRPC Deadline Propagation Causing Cascading Timeout
[SYSTEM] Type "help" for a list of investigation commands.
user@bastion:~$
Service A calls Service B over gRPC with a 500ms deadline. Service B calls Service C with no deadline set, so that call can wait indefinitely (the default is infinite). Service A times out after 500ms and returns an error to the user, but Service B's handler is still waiting for Service C, which is slow and takes 45 seconds to respond. Each abandoned request leaves a zombie goroutine in Service B, blocked on Service C and holding a slot in Service B's connection pool, so the pool fills up. After 10 minutes, Service B is completely unresponsive due to goroutine/thread exhaustion.
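The failure mode above can be reproduced with nothing but Go's standard `context` package. The following is a minimal sketch, not Service B's actual code: `slowServiceC`, `handleDetached`, `handlePropagated`, and `runDemo` are hypothetical names, and the durations are scaled down (50ms caller deadline, 300ms downstream latency) so the demo finishes quickly. In grpc-go the inbound deadline arrives on the handler's `ctx`, so passing that same `ctx` to the outbound client call is what propagates it. The sketch assumes Service B is instead building a fresh context such as `context.Background()`.

```go
package main

import (
	"context"
	"time"
)

// slowServiceC stands in for the slow downstream call (hypothetical).
// It returns early if ctx is cancelled or its deadline passes.
func slowServiceC(ctx context.Context, latency time.Duration) error {
	timer := time.NewTimer(latency)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// handleDetached mimics the incident: the handler ignores the caller's
// context and calls Service C with a context that has no deadline.
func handleDetached(_ context.Context, latency time.Duration) error {
	return slowServiceC(context.Background(), latency)
}

// handlePropagated passes the caller's context through. maxWait is an
// upper bound for callers that set no deadline; a parent deadline that is
// earlier than maxWait still wins.
func handlePropagated(ctx context.Context, latency, maxWait time.Duration) error {
	ctx, cancel := context.WithTimeout(ctx, maxWait)
	defer cancel()
	return slowServiceC(ctx, latency)
}

// runDemo reports how long each handler stays busy when the caller's
// deadline is much shorter than Service C's latency.
func runDemo() (detached, propagated time.Duration) {
	const callerDeadline = 50 * time.Millisecond // 500ms in the incident
	const cLatency = 300 * time.Millisecond      // 45s in the incident

	ctx, cancel := context.WithTimeout(context.Background(), callerDeadline)
	start := time.Now()
	_ = handleDetached(ctx, cLatency) // blocks for the full cLatency
	detached = time.Since(start)
	cancel()

	ctx, cancel = context.WithTimeout(context.Background(), callerDeadline)
	start = time.Now()
	_ = handlePropagated(ctx, cLatency, time.Second) // returns at the caller's deadline
	propagated = time.Since(start)
	cancel()

	return detached, propagated
}
```

Calling `runDemo()` from a `main` function and printing both durations should show the detached handler busy for the full downstream latency, while the propagating handler is released at roughly the caller's deadline. Under these assumptions, that early release is what keeps goroutines from accumulating in Service B.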