Amazon EC2 Day 2: Assessing The Damage

April 22, 2011 By David
Grazed from Forbes.  Author: Tomio Geron.

The crash of EC2, Amazon’s cloud service for websites, is now in its second day.

The service went down yesterday at 1:42am PT, shutting down the websites of a number of start-up companies, including Foursquare, Quora and Reddit.

As many web companies, particularly start-ups, pick up the pieces today and try to assess the damage, there are no real answers yet as to what happened and why. Amazon has various “availability zones” that are in theory supposed to prevent this kind of outage: if one zone goes down, another still works. But in this case that didn’t happen.

A website called EC2disabled.com has a long list of companies that apparently were affected.

Some sites are still down today, while others have come back online but are missing data. Question-and-answer website Quora, for example, posted this notification on its site today:

“Data Restoration In Progress: You may notice some data missing from Quora today. In particular, edits to Quora (new questions, edits, upvotes, etc.) made on Wednesday, April 20 won’t appear. When we are able to recover this data, we’ll merge it in with the new changes.”

As of 8:49am PT today, Amazon said it is restoring service, according to a post on its status dashboard:

“8:49 AM PDT We continue to see progress in recovering volumes, and have heard many additional customers confirm that they’re recovering. Our current estimate is that the majority of volumes will be recovered over the next 5 to 6 hours. As we mentioned in our last post, a smaller number of volumes will require a more time consuming process to recover, and we anticipate that those will take longer to recover. We will continue to keep everyone updated as we have additional information.”

The companies taking a major hit were not just individual sites but also companies that provide web platforms to other web companies, the so-called platform-as-a-service companies, as Gigaom noted.

These companies, such as Heroku, EngineYard and Dotcloud, provide services on top of Amazon EC2 that let web developers launch and maintain websites more easily. But because they depend on Amazon, the sites they serve were affected when Amazon went down.