Performance seems to have returned to nominal levels. We are monitoring the situation.
A deployment in our system was erroneously evicted by kubernetes due to a mitigation done earlier in the day. This mitigation avoided an error case caused by a zonal issue in our cloud provider. Unfortunately, this caused an issue with launching a stateful set. The stateful set that had been safely shut down in the offending zone was blocking other instances from starting up. In an unlucky turn of events, this stateful set happened to be first in the list.
We have unblocked instances from spawning and service is be operating normally once again.
Posted May 06, 2021 - 20:21 BST
We believe we know the issue and have put a fix in place. We will keep monitoring and update you on root cause.