Using Virtualization to Reduce Your RTOs

Imagine Company A, an online retailer, has just lost its entire data center to a chemical fire in an adjoining building. Four months earlier, the IT team had moved from a traditional one-application-to-one-server infrastructure to a set of virtualized machines, reducing hardware replacement costs among other benefits. The IT team had also thought ahead, preparing for a worst-case total loss of the data center as part of its disaster recovery planning. They had copied the virtual files to an off-site recovery location (keeping them updated using existing replication procedures) and had pre-built and pre-tested the off-site servers. When disaster struck, the team was able to get all of the mission-critical servers back on-line at the recovery site, beating the recovery window set by management and allowing the company to avoid significant loss of revenue and reputation.

Most business continuity specialists would agree this scenario represents success. There are critical steps that IT managers can take to use virtualization to reduce their recovery times, streamlining the overall recovery process and minimizing the impact of an outage.

Virtualization is one of the hottest buzzwords in the computer industry today. For servers and data centers, it is typically used to reduce the number of servers or help when power or space is at a premium (or running out completely). For disaster recovery professionals, virtualization is particularly promising as a tool to help reduce recovery time objectives (RTOs).

As most business continuity professionals know first-hand, much of the delay in getting a server back on-line comes from re-installing the operating system and reloading the application. With virtualization, the operating system and application can be encapsulated into a single file, which essentially wraps up the entire server. That file contains all the information required to run the server on another machine. This encapsulation allows disaster recovery specialists to greatly reduce the time it takes to bring top-priority services or applications back on-line after a disruption.

Following are SunGard’s top ten technical tips for using virtualization to reduce your RTOs:

1. Create the virtualized encapsulation file and copy the file to your disaster recovery site. This may sound like an obvious first step, but it is often overlooked as companies migrate their production environments to virtualization. At time of disaster (ATOD), you can run the encapsulated file on your virtualization host in minutes instead of spending the hours it would take to rebuild the server from scratch.

2. Encapsulated server files can be used at time of test (ATOT) for testing disruption scenarios. Because testing is easier and faster with virtualized server files, the business continuity team is likely to test more frequently. Repeat testing will help increase proficiency and further reduce RTO based on lessons learned from the tests.

3. Use your existing replication methodology to automatically replicate your virtualized encapsulation files to your disaster recovery site. As a follow-up to tip #1, it’s important to continually back up the encapsulated files to an off-site location. At time of disaster, the most up-to-date encapsulated server file will then be available to quickly enable services at the recovery site.

4. The encapsulated server file can be moved to another local or remote data center just by copying the file. This is an extremely powerful concept that can be leveraged for a variety of purposes including enhanced disaster recovery and reduced RTOs.

5. Multiple server files can be stored on a portable hard drive or USB drive and taken to your disaster recovery site. These server files can be used instead of rebuilding the server.

6. Virtual server files can be pre-built and pre-tested prior to a disaster. There are two primary benefits:
- taking this step will help eliminate costly server rebuild mistakes that happen during the rush to get the servers back on-line in the aftermath of an incident; and
- larger enterprises running many, many servers face a daunting task if the primary data center location becomes unavailable or is a total loss—most do not have the personnel resources to build and install servers in time to meet their RTOs. Using virtualization, the IT team can pre-build and test servers in advance, so that they are ready when a disruption occurs.

7. Take advantage of hardware abstraction. With virtual file abstraction, IT departments no longer must abide by the standard ‘like hardware’ rules of traditional server technology: the brand and type of servers in your primary data center and at the recovery site do not have to match! During disasters, recovery professionals may otherwise waste valuable time searching for the same or similar hardware to replace the failed servers. Removing that search saves time, helping to meet or beat RTOs.

8. Virtualization provides the ability to run multiple virtual servers on one physical host server. When disaster strikes, instead of installing multiple servers and connecting each one to your network, you bring a much smaller number of physical hosts back on-line. This many-applications-on-one-server ability will not only reduce RTOs but also reduce the costs and requirements for Fibre Channel and network ports, rack space and power.

9. Use the virtualized server files to quickly move from the recovery site back to the production data center to finalize the recovery. As an additional benefit, the virtualized server files from the disaster recovery site will contain the changes made to the system while at the off-site data center.

10. Don’t count virtualization out for the everyday disruption! For example, when a single server fails, a local backup of a virtualized server file can be used to quickly restore the downed server, helping to improve recovery time to get the services back up and running.
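
The replication step in tips #1 and #3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a replacement for a replication product: it assumes the encapsulated server files sit in one directory and that the recovery site is reachable as a mounted path. The function names, directory paths and file names are hypothetical. The sketch copies any file that is new or has changed, then verifies the copy with a checksum so a damaged file is caught before time of disaster.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate_images(source_dir: Path, recovery_dir: Path) -> list[str]:
    """Copy new or changed server files to recovery_dir.

    Returns the names of the files that were copied. Both directories
    are hypothetical stand-ins for the production and recovery sites.
    """
    recovery_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(source_dir.iterdir()):
        if not source.is_file():
            continue
        target = recovery_dir / source.name
        source_hash = sha256_of(source)
        if target.exists() and sha256_of(target) == source_hash:
            continue  # the recovery copy is already current
        shutil.copy2(source, target)
        # Verify the copy so a corrupt file is found now, not at ATOD.
        if sha256_of(target) != source_hash:
            raise IOError(f"verification failed for {target}")
        copied.append(source.name)
    return copied
```

Run on a schedule, a call such as `replicate_images(Path("/vm/production"), Path("/mnt/recovery-site"))` would keep the recovery copies current; the same checksum comparison can also confirm that files carried on a portable drive (tip #5) arrived intact.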

As virtualization becomes more widespread, many companies will begin to experience the additional benefits that this technology brings to recovery efforts, both for the exceptional ‘total data center loss’ event and for the everyday hardware failure or disruption.
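
To make the consolidation in tip #8 concrete, the following Python sketch estimates how many physical hosts a recovery site would need for a given set of virtual servers. It is a rough planning aid under stated assumptions: it considers memory only, uses a simple first-fit-decreasing placement, and reserves a fraction of each host as headroom. The function name, the server names and all of the sizes are hypothetical.

```python
def plan_hosts(vm_memory_gb: dict[str, float],
               host_memory_gb: float,
               headroom: float = 0.25) -> list[list[str]]:
    """Assign virtual servers to hosts by memory, largest first.

    Returns one list of server names per physical host. CPU, storage
    and network limits are ignored in this simplified sketch.
    """
    usable = host_memory_gb * (1.0 - headroom)
    hosts: list[list[str]] = []   # server names placed on each host
    free: list[float] = []        # usable memory left on each host
    largest_first = sorted(vm_memory_gb.items(),
                           key=lambda item: item[1], reverse=True)
    for name, need in largest_first:
        if need > usable:
            raise ValueError(f"{name} does not fit on a single host")
        for index, available in enumerate(free):
            if need <= available:
                hosts[index].append(name)
                free[index] -= need
                break
        else:
            # No existing host has room, so add another one.
            hosts.append([name])
            free.append(usable - need)
    return hosts


servers = {"web": 16, "db": 64, "mail": 32, "file": 8, "app": 32}
plan = plan_hosts(servers, host_memory_gb=128)
print(len(plan), "hosts:", plan)
```

With these made-up figures, five servers fit on two hosts rather than five separate machines, which is the reduction in ports, rack space and power that tip #8 describes.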

Virtualization is not a panacea for poor business practices. There must be a strong IT and business process foundation, including a well-thought-out information availability strategy and an often-updated preparedness plan. Using virtualization as part of an overall information availability strategy is a key element in helping IT teams meet ever-decreasing recovery windows, which in turn helps to illustrate the team’s overall value and commitment to the success of the business.

By Jack Wade, practice area manager, SunGard Availability Services

