Ravelin

Elevated API Error Rate
Resolved
Lasted for 8h

Tue, Oct 6, 2020, 10:57 PM
Affected components

No components marked as affected

Updates

Resolved

There have been no further bursts of errors this evening, but we will continue to monitor for this behaviour both while we test our patch and afterwards.

Tue, Oct 6, 2020, 10:57 PM

Monitoring

We have seen no further bursts of errors since those reported at 14:45-14:55 BST, but we are continuing to monitor.

Tue, Oct 6, 2020, 06:09 PM

Investigating

We haven't seen any further bursts of errors since 14:45-14:55 BST, but we are continuing to monitor our error rates and alerts.

In the meantime we have prepared an upgrade to our gRPC connection pooling, which discontinues use of the deprecated gRPC load balancer. We will be testing and rolling it out in the hope of addressing these errors.
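For readers unfamiliar with client-side connection pooling, the general idea can be sketched as below. This is an illustration only and not Ravelin's implementation: the class name `RoundRobinPool`, the `factory` callback, and the string stand-ins for connections are all hypothetical, and a real client would hold gRPC channels instead.

```python
import itertools
import threading
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class RoundRobinPool(Generic[T]):
    """Hypothetical fixed-size pool that hands out connections in rotation."""

    def __init__(self, factory: Callable[[int], T], size: int) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        # Open every connection up front so requests never wait on a dial.
        self._conns: List[T] = [factory(i) for i in range(size)]
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def get(self) -> T:
        # The lock keeps the rotation order consistent across threads.
        with self._lock:
            index = next(self._counter) % len(self._conns)
        return self._conns[index]


# Strings stand in for real connections in this sketch (an assumption).
pool = RoundRobinPool(lambda i: f"conn-{i}", size=3)
picked = [pool.get() for _ in range(5)]
```

Spreading requests across several connections held by the client means no separate load-balancing component is needed to distribute them.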

Tue, Oct 6, 2020, 03:23 PM

Investigating

We have observed several small bursts of HTTP 500 errors on api.ravelin.com and pci.ravelin.com today.

Each burst affects roughly 2% of requests system-wide and is caused by reads and writes to Google Cloud Bigtable timing out. The impact of the timeouts varies between hosts, and we are watching for the issue to recur. This incident is a continuation of this morning's incident, which we are still investigating: https://status.ravelin.com/incidents/xx5sm41vgr1y
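To make the distinction between a system-wide error rate and uneven per-host impact concrete, here is a minimal sketch. The function name `error_rates`, the host names, and the request counts are invented for illustration; they are not figures from this incident.

```python
from typing import Dict, Tuple


def error_rates(
    counts: Dict[str, Tuple[int, int]]
) -> Tuple[Dict[str, float], float]:
    """Return (per-host error rate, system-wide error rate).

    counts maps a host name to (failed requests, total requests).
    """
    per_host = {
        host: (errors / total if total else 0.0)
        for host, (errors, total) in counts.items()
    }
    total_errors = sum(errors for errors, _ in counts.values())
    total_requests = sum(total for _, total in counts.values())
    overall = total_errors / total_requests if total_requests else 0.0
    return per_host, overall


# Invented sample counts (an assumption, not data from the incident).
sample = {
    "host-a": (50, 1000),
    "host-b": (10, 1000),
    "host-c": (0, 1000),
}
per_host, overall = error_rates(sample)
```

With these made-up counts the system-wide rate is 2%, while individual hosts range from 0% to 5%, which is the kind of spread that a single aggregate figure hides.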

Tue, Oct 6, 2020, 02:57 PM