On December 17 at 10:50 AM PST, an internal data task ran repeatedly and caused an outage on one of our core servers. The resulting overload of processing jobs affected one of our login-related endpoints, which in turn caused dashboard login issues. By 11:30 AM PST, our team had identified the responsible job and worked to scale the system, pausing a few jobs to restore dashboard availability. However, because of the significant backlog in the queue, emails and notifications were delayed while the system recovered. Throughout the day, our team continued to optimize and scale up, managing the workload until the backlog was cleared and everything was back to normal by 4:30 PM PST.
An internal post-mortem has been organized to analyze the incident and note key learnings and action items. If you have any concerns or require additional information, please don't hesitate to contact our Customer Support team at hi@envoy.com or reach out to your dedicated account manager.