Resolved
We're happy to report that everything is once again ticking along as it should.
Our first follow-up action from this incident will be to decrease the 400ms retry timeout between BigTable clusters, which was the primary source of latency during this incident. Our second follow-up action will be to consider raising the severity of our API response time alerts so that engineers are notified earlier.
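To illustrate why that timeout matters, here is a minimal sketch of the failover path described above, assuming a hypothetical query_cluster helper and placeholder cluster names (this is not our actual client code): while the primary is slow, every request waits out the full 400ms retry timeout before falling back to the secondary, so lowering that timeout directly reduces the added latency.

```python
PRIMARY = "bigtable-primary"        # placeholder name for the primary cluster
SECONDARY = "bigtable-secondary"    # placeholder name for the secondary cluster
RETRY_TIMEOUT_S = 0.4               # the 400ms cross-cluster retry timeout referenced above


def read_with_failover(request, query_cluster, timeout_s=RETRY_TIMEOUT_S):
    """Try the primary cluster first; on timeout, retry against the secondary.

    While the primary is responding slowly, every request pays the full
    timeout before failing over, which is how a 400ms timeout shows up as
    elevated API response times.
    """
    try:
        return query_cluster(PRIMARY, request, timeout=timeout_s)
    except TimeoutError:
        # The primary didn't answer within the timeout; fall back to the secondary.
        return query_cluster(SECONDARY, request, timeout=timeout_s)
```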
Monitoring
Five minutes ago we promoted our secondary BigTable cluster to be the primary, and it has been happily chewing through the write operations we had queued up. Those queues are now fully drained, and the write response time across API requests returned to its usual level as of 17:41:30 UTC.
Investigating
Our primary BigTable cluster is responding slowly, resulting in elevated API response times while we retry against the secondary cluster. The error rate remains low; requests only fail if your client times out waiting for a response.