Time
2019/10/28 4:30 PM - 6:30 PM ET (2 hours)
Issue
During the above time window, Dynalist’s webpage was unreachable and syncing was unavailable. Users saw 5xx errors, and requests timed out.
Details
Our hosting provider had a major networking outage.
Below is the official post-mortem from our hosting provider:
OFFICIAL RFO - 10/28/2019
Summary of Incident:
———————————————
Yesterday, Monday October 28th 2019, at approximately 4:23pm portions of customers in our TPA1, TPA2 and DAL1 data centers experienced a loss of network that lasted anywhere from a few minutes to a few hours depending on your server(s) location. The cause of the issue has been identified and is as follows:
At roughly 4:23pm one of our Network Engineers applied a policy update to our DAL1 edge routers. This policy update was incomplete which led to the full internet routing table being propagated throughout the aggregation layer of DAL1. This mistake was further exacerbated when that full routing table was automatically injected into the Hivelocity DDoS protection network resulting in the full routing table being distributed to other Hivelocity facilities, i.e. TPA1 and TPA2. The full internet routing table injection led to multiple network devices having their resources exhausted which ultimately led to the network disruption. Once our Network Engineers identified the cause of the issue we began reloading each of the affected network devices to correct the problem. Ultimately, yesterday’s network event was a result of human error.
Service Impact Times:
———————————————
October 28th, 4:23pm - 6:44pm EST
Remediation Plans:
———————————————
We have implemented new router policies that will prevent full route tables being similarly propagated should human error ever occur again. Additionally, we have introduced additional review protocols to minimize the chance of human error occurring.
For years most of our customers have experienced 100% uptime due to our redundancies and nearly 2 decades of experience. We take our responsibility to you very seriously and no one hates it more than us when we fall short of our goals. We are deeply sorry for the inconvenience and any negative impact this disruption had on your operation.