Virtualization Technology News and Information
AWS Outage – Thoughts on Disaster Recovery Policies

A couple of days ago it happened again. On June 14, around 9 pm PDT, AWS suffered a power outage in its Northern Virginia data center, affecting EC2, RDS, Elastic Beanstalk and other services in the US-EAST region.


The AWS status page reported:

Some Cache Clusters in a single AZ in the US-EAST-1 region are currently unavailable. We are also experiencing increased error rates and latencies for the ElastiCache APIs in the US-EAST-1 Region. We are investigating the issue.

This outage affected major sites such as Quora, Foursquare, Pinterest, Heroku and Dropbox. I followed the outage reports, the tweets and the blog posts, and it all sounded too familiar. A year ago AWS faced a mega-outage that lasted over 3 days, when another data center (in Virginia, no less!) went down and took major sites down with it (Quora, Foursquare… ring a bell?).

During last year’s outage I analyzed the reports from the sites that managed to survive it, and compiled a list of field-proven guidelines and best practices for making an architecture resilient when deployed on AWS and other IaaS providers. I find these guidelines highly useful in my own architectures. In this blog post I’d like to address one specific guideline in greater depth: architecting for Disaster Recovery.

Disaster Recovery – Characteristics and Challenges


Read the rest of this article on

Published Tuesday, June 19, 2012 6:46 AM by David Marshall