Virtualization Technology News and Information
Why is Amazon Rebooting Around 10% of its EC2 Cloud Servers?
If you or your company is an Amazon Web Services customer, you probably already know that Amazon has embarked on a massive reboot of its Elastic Compute Cloud (EC2) instances.  Why?  Speculation has it that Amazon is doing so to fix a security bug found in the Xen virtualization platform, the hypervisor technology Amazon uses, which could affect the underlying host servers.

The scheduled reboots are expected to affect up to 10% of Amazon's cloud server instances worldwide and to be completed by the end of this month.

Amazon privately e-mailed affected customers on Wednesday and provided additional details about the planned reboot in a company blog post on Thursday.  The post described the work as a "timely security and operational update" made in response to an upcoming Xen security update.

The post continued, “As we explained in emails to the small percentage of our customers who are affected and on our forums, the instances that need the update require a system restart of the underlying hardware and will be unavailable for a few minutes while the patches are being applied and the host is being rebooted.” 

Amazon stated that after the reboot, each instance will return to normal operation and retain all of its data and configuration.

But AWS said it is waiting until next Wednesday, after the Xen security report is made public, to provide a more detailed explanation for the reboot.

"These updates must be completed by October 1st before the issue is made public as part of an upcoming Xen Security Announcement (XSA)," according to the AWS blog.  "Following security best practices, the details of this update are embargoed until then.  The issue in that notice affects many Xen environments, and is not specific to AWS."

According to an FAQ about the reboot by AWS partner RightScale Inc., a company which manages AWS workloads, the reboot started on Sept. 25 at 7:00 PM PDT and will complete on Sept. 30 at 4:59 PM PDT.

A RightScale blog post also recommends EC2 users monitor events within their AWS consoles in order to find the most reliable updates.  "For instances where a short reboot is safe and acceptable, you don’t need to do anything: They will simply reboot during maintenance and stay on the same host with the same ephemeral disks and the same IP address," explains RightScale co-founder Thorsten von Eicken.
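The same events shown in the console can also be read programmatically. The sketch below is an illustration, not something from the AWS or RightScale posts: it assumes today's boto3 SDK (which postdates this article) and its `describe_instance_status` call, and the helper names `scheduled_events` and `fetch_scheduled_events` are hypothetical. It reads only the first page of results.

```python
from typing import Dict, List


def scheduled_events(response: Dict) -> List[Dict]:
    """Flatten a DescribeInstanceStatus response into one row per scheduled event."""
    rows = []
    for status in response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            rows.append({
                "instance_id": status["InstanceId"],
                "az": status["AvailabilityZone"],
                "code": event["Code"],  # e.g. "system-reboot"
                "not_before": event.get("NotBefore"),
                "not_after": event.get("NotAfter"),
            })
    return rows


def fetch_scheduled_events(region: str) -> List[Dict]:
    """Query EC2 for scheduled events; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the parsing helper works without it

    client = boto3.client("ec2", region_name=region)
    # IncludeAllInstances also reports instances that are not running.
    response = client.describe_instance_status(IncludeAllInstances=True)
    return scheduled_events(response)
```

Calling `fetch_scheduled_events("us-east-1")` with valid credentials would return one dictionary per pending event, including the earliest and latest times of its maintenance window.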

But for those customers running databases within their EC2 servers, things can get a bit messy. 

Von Eicken writes:

For databases, if you have set up the recommended master-slave configuration across AZs, you have the option to reboot the impacted AZ ahead of the maintenance window in an attempt to get an instance that is already patched.  If that is not successful, you can failover out of impacted AZs ahead of the maintenance window using the following approach:

  1. Check the AZ of your master and slave.
  2. Check your AWS console “Events” page for the maintenance timeframe for your master and slave AZs.
  3. Clone a new slave DB in a new AZ.
  4. Adjust your master DB and slave DB as appropriate to avoid the maintenance windows and keep a master and slave DB running at all times.

If you do not have a master-slave configuration across AZs and it is critical that you have no downtime of your database, you may want to consider setting up a slave DB in another AZ ahead of the maintenance.
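The planning part of von Eicken's steps can be sketched in a few lines. This is a rough illustration under stated assumptions, not RightScale's tooling: the function `choose_slave_az`, the `windows` mapping of AZ name to maintenance window, and the `candidate_azs` list are all hypothetical, and the windows would have to be collected from the "Events" page first. It picks an AZ for the slave whose maintenance window does not overlap the master's.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Window = Tuple[datetime, datetime]  # (start, end) of one maintenance window


def windows_overlap(a: Window, b: Window) -> bool:
    """True if the two closed intervals share at least one moment."""
    return a[0] <= b[1] and b[0] <= a[1]


def choose_slave_az(master_az: str, slave_az: str,
                    windows: Dict[str, Window],
                    candidate_azs: List[str]) -> str:
    """Return the AZ the slave should run in so that master and slave
    are never under maintenance at the same time.

    An AZ missing from `windows` is treated as having no maintenance.
    """
    master_window = windows.get(master_az)

    def safe(az: str) -> bool:
        if az == master_az:
            return False  # the slave must live in a different AZ
        window = windows.get(az)
        if master_window is None or window is None:
            return True
        return not windows_overlap(master_window, window)

    if safe(slave_az):
        return slave_az  # current layout already avoids overlap
    for az in candidate_azs:
        if safe(az):
            return az  # step 3: clone the new slave here
    raise LookupError("no candidate AZ avoids the master's maintenance window")
```

If the function returns an AZ other than the current slave's, that is where the new slave from step 3 would be cloned; a `LookupError` means none of the listed AZs avoids the overlap and the schedule needs a closer manual look.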

The reboot will not affect T1, T2, M2, R3, and HS1 instances of EC2, according to RightScale.  But the company did caution that other AWS services such as RDS, ElastiCache and Redshift might experience downtime during the reboot period.
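For a quick first pass over an inventory, the family list above can be turned into a filter. This is a minimal sketch: the name `may_be_rebooted` is hypothetical, and it assumes instance types are written in the usual `family.size` form such as `m3.large`. A `True` result only means the family is not on RightScale's exempt list, not that a given instance is actually scheduled for a reboot.

```python
# Families RightScale lists as unaffected by the reboot.
UNAFFECTED_FAMILIES = {"t1", "t2", "m2", "r3", "hs1"}


def may_be_rebooted(instance_type: str) -> bool:
    """Return False if the instance type belongs to an exempt family."""
    family = instance_type.split(".", 1)[0].lower()
    return family not in UNAFFECTED_FAMILIES
```

Instances that pass this filter still need to be checked against their scheduled events, since only a fraction of them are affected.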

Stay tuned.

Published Friday, September 26, 2014 7:21 AM by David Marshall