London servers unavailable
Incident Report for Servebolt
Postmortem

Recent Service Disruption — Our Apologies

On Friday, February 26th at approximately 11:14 CET we experienced a server outage at our London data center. This resulted in service disruption for all websites hosted in this data center.

With any significant event that affects our customers, we conduct an extensive examination to understand the root cause and develop a course of action to improve our systems and procedures. To that end, we want to provide a synopsis of what occurred and reassure you that we are working diligently to prevent future outages.

Here's what happened

The server rack next to ours in the data center was scheduled to be decommissioned. Unfortunately, the data center's remote hands personnel unplugged equipment in the wrong rack, which was ours. They pulled out networking cables, power cables, power supplies, and other equipment, taking all our servers down.

Our monitoring services detected the outage at 11:14 CET, and our investigation started at that point. By 11:33 CET it was clear that human error was the cause and that everything had to be reconnected. Rewiring the servers in a rack is slow and difficult work, because each cable has to be plugged into the correct server and network port for everything to work as intended.

At 13:28 CET the first server was fully restored, and work continued to bring the rest online. By 13:48 CET all cables and hardware for the remaining servers had been reconnected correctly, and traffic was restored.

Here's what we're doing

We are working with our upstream data center providers to implement practices that will reduce the risk of human error causing an event like this again.

Outages disrupt your life and your business. We understand that, and we take our responsibility to you very seriously. We sincerely apologize for the disruption and the inconvenience this has likely caused you.

Please allow me to take this opportunity to thank you for your business and provide my personal assurance that we are dedicated to meeting our commitment to you.

Sincerely,
Erlend Eide

CEO
Servebolt.com

Posted Feb 26, 2021 - 17:04 CET

Resolved
This incident has been resolved.
Posted Feb 26, 2021 - 13:56 CET
Monitoring
All London traffic appears to have recovered as of 13:48 CET. We are actively validating and monitoring the situation.
Posted Feb 26, 2021 - 13:50 CET
Update
Approximately 50% of our London traffic has been back online since 13:27 CET. Our provider is still working on getting the rest up and running as soon as practically possible.
Posted Feb 26, 2021 - 13:47 CET
Investigating
We are currently investigating this issue.
Posted Feb 26, 2021 - 11:28 CET
This incident affected: Servebolt Cloud LON.