Downtime post-mortem: June 21st

By Edouard on June 21, 2011

At 07:11 AM GMT today, Web Translate It stopped responding. I was notified immediately, woke up, and found that the server wasn’t reachable by any means.

I immediately reported the outage on Twitter and contacted our hosting company, DigitalOne, to find out whether they were having issues.

The word from the hosting company was that the service disruption was due to a bug in one of their core routers, and they appeared to be working on it. In the meantime, I noticed that services such as Pinboard and Instapaper, which are partially or fully hosted at the same ISP, were also having network issues.

Since the service wasn’t completely down but kept going offline and coming back online, I decided against moving it to the backup server, which is much less powerful than the main server.

Around 10:30 AM GMT the service came back online, and it has been stable for the rest of the day. We suffered no data loss.

Right now the word is that the FBI raided our hosting company’s datacenter and pulled three racks of servers. Some servers were removed; others (such as Web Translate It’s) were temporarily disconnected.

I am really sorry for any and all problems this has caused. In the coming months I will take steps so that Web Translate It can fail over more easily in such extraordinary events.