Virtualization Technology News and Information
Disaster Recovery in the Multi-Cloud Era

By Faiz Khan, Founder/CEO, Wanclouds

There are two main approaches to disaster recovery for the cloud infrastructure that hosts your production application.

  1. The first approach is to maintain a replica of the primary production infrastructure, such as a VPC setup with the network functions, security policies, nodes, storage, etc. that host your application. The replica can be in the same region or a different one, with active data synchronization running. A DNS switch-over (as an example) may be used to switch to the backup site if the primary site becomes unavailable for any reason. While this is the ideal scenario, it can be very costly, because you are paying for replicated infrastructure that for the most part just sits idle.
  2. The second approach is to set up or restore your primary production environment on demand. This approach is cost effective: if your application can tolerate a delay of a few minutes to an hour, you can save a lot by avoiding spend on infrastructure that would sit idle most of the time.
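
As a rough illustration of the DNS switch-over mentioned in the first approach, the sketch below decides when traffic should move to the backup site. It is a minimal sketch, not a production failover controller: the names `FailoverState` and `record_probe`, the site labels, and the threshold of three consecutive failed probes are all assumptions made for this example. The actual DNS update and the health probe itself are left out because they depend on your provider.

```python
from dataclasses import dataclass


@dataclass
class FailoverState:
    """Tracks consecutive failed health probes of the primary site."""
    consecutive_failures: int = 0
    active_site: str = "primary"  # hypothetical label for where DNS points


def record_probe(state: FailoverState, primary_healthy: bool,
                 failure_threshold: int = 3) -> FailoverState:
    """Return the new state after one health probe of the primary site.

    DNS is pointed at the backup only after `failure_threshold`
    consecutive failures, so a single dropped probe does not trigger
    a switch-over. Failing back to the primary is deliberately left
    as a manual decision in this sketch.
    """
    if primary_healthy:
        # A healthy probe resets the counter but never switches sites.
        return FailoverState(0, state.active_site)
    failures = state.consecutive_failures + 1
    site = "backup" if failures >= failure_threshold else state.active_site
    return FailoverState(failures, site)
```

Requiring several consecutive failures is one simple way to avoid switching sites on a transient network glitch; the right threshold depends on your probe interval and how much downtime you can tolerate.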

In this article, we focus on the second scenario, where you want on-demand deployment or restoration of your production environment. This is sometimes referred to as the Cold DR approach. Various backup solutions are available for backing up virtual machines and, more recently, for backing up Kubernetes, data, etc. These solutions typically work well when you restore onto the same setup and the infrastructure, such as the Virtual Private Cloud setup, network functions, and their configurations, has not changed. However, in a disaster scenario such as a particular public cloud region going down, or the VPC setup itself being broken by human error or other causes, restoring VMs, Kubernetes, etc. becomes a lengthy process, because the VPC design, network functions, and security policies have to be set up first, and only then can Kubernetes, VMs, etc. be restored.

In addition, imagine that an entire cloud experiences an outage in one or more regions and you decide to move and restore your application, infrastructure, network functions, etc. in a different cloud. Unfortunately, this becomes a daunting task, and most customers will not consider this option, especially in a disaster scenario, because it may take weeks and cost a lot: the process itself requires significant engineering effort and resources.

For comprehensive disaster recovery or business continuity use cases, you should be able to restore your entire cloud infrastructure setup in minutes, not days or weeks. The following is an example of the resources that need to be ready to restore in a different region or a different cloud in a matter of minutes:

  • VPC Setup
    • VPC Zones construct
    • IP addressing
    • Subnets
    • Load Balancing
    • Routes
    • Security Groups
    • Access Control Lists
    • Virtual Private Network gateway and setup
    • Public Gateway
    • Network Address Translation
    • Content Delivery Network Information
    • SSH Keys
    • Names and Tags
    • Policies
    • Virtual Machines
    • Storage volumes
    • File storage bucket information
    • DNS
  • Kubernetes or OpenShift:
    • Kubernetes Manifest file
    • Persistent Volumes
    • Applications and Namespaces
    • Other Kubernetes setup-related configurations such as Pods, Deployments, Services, etc.
    • Moving across storage classes

Ensure that the above resources and their relationships with each other are understood and can be restored during disaster scenarios across regions and across clouds.
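
One way to make those relationships explicit is to record them as a dependency graph and derive the restore order from it. The sketch below does this with Python's standard `graphlib` module (Python 3.9 or later). The resource names come from the list above, but the specific dependencies in `DEPENDS_ON` are assumptions chosen for illustration; your own environment will have a different graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each resource lists what must exist first.
DEPENDS_ON = {
    "vpc": [],
    "subnets": ["vpc"],
    "security_groups": ["vpc"],
    "access_control_lists": ["vpc"],
    "public_gateway": ["subnets"],
    "routes": ["subnets", "public_gateway"],
    "storage_volumes": ["vpc"],
    "virtual_machines": ["subnets", "security_groups", "storage_volumes"],
    "kubernetes_cluster": ["subnets", "security_groups"],
    "persistent_volumes": ["kubernetes_cluster"],
    "applications": ["kubernetes_cluster", "persistent_volumes"],
    "load_balancing": ["subnets", "virtual_machines"],
    "dns": ["load_balancing"],
}


def restore_order(depends_on: dict) -> list:
    """Return resources in an order where every dependency comes first.

    Raises graphlib.CycleError if the map contains a circular dependency.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, resource in enumerate(restore_order(DEPENDS_ON), start=1):
        print(f"{step:2d}. restore {resource}")
```

Several valid orders usually exist, so the exact sequence printed may vary; what the sort guarantees is that no resource is restored before the resources it depends on.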

Ensure that you can migrate data across storage classes for any persistent volumes.
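
Moving across storage classes usually means that a claim written for the source cloud has to name a different class in the target cloud. The sketch below rewrites the `spec.storageClassName` field of a PersistentVolumeClaim manifest held as a Python dictionary. The class names in `STORAGE_CLASS_MAP` and the function name `remap_storage_class` are hypothetical, and the sketch only translates the manifest; copying the data itself is a separate step that is not shown.

```python
import copy

# Hypothetical mapping from source-cloud storage classes to target-cloud ones.
STORAGE_CLASS_MAP = {
    "source-fast-ssd": "target-premium",
    "source-standard": "target-general",
}


def remap_storage_class(pvc: dict, mapping: dict) -> dict:
    """Return a copy of a PersistentVolumeClaim manifest with its
    storage class translated for the target cloud.

    The input manifest is never modified. An unmapped class raises
    KeyError so that a missing mapping is caught before restore time.
    """
    source_class = pvc.get("spec", {}).get("storageClassName")
    if source_class is None:
        # No explicit class: the claim relies on the cluster default.
        return copy.deepcopy(pvc)
    if source_class not in mapping:
        raise KeyError(f"no target storage class mapped for {source_class!r}")
    remapped = copy.deepcopy(pvc)
    remapped["spec"]["storageClassName"] = mapping[source_class]
    return remapped
```

Failing loudly on an unmapped class is a deliberate choice here: it is better to discover a gap in the mapping during a DR test than in the middle of a real restore.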

Multiple tools, such as Terraform and Ansible, can be combined with VM and Kubernetes DR tools by a savvy cloud operations team to create a comprehensive DR solution. Companies like Wanclouds Inc. are also focusing on simplifying DR with a comprehensive DRaaS approach. Whatever multi-cloud, multi-region DR scenarios are created, the cloud ops team needs to make sure they are tested and kept up to date with any changes in the source production environment.
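
Keeping a DR scenario in step with production can start with a simple drift check between the current production configuration and the blueprint that was last tested. The sketch below assumes both are available as dictionaries keyed by resource name, with each value being that resource's configuration; how you export those dictionaries is up to your tooling. The function names `fingerprint` and `detect_drift` are illustrative only.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Hash a configuration in a key-order-independent way."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_drift(production: dict, dr_blueprint: dict) -> dict:
    """Compare production resources with the last-tested DR blueprint.

    Both arguments map a resource name to its configuration dict.
    """
    prod_names, dr_names = set(production), set(dr_blueprint)
    return {
        # In production but never captured in the DR blueprint.
        "missing_from_dr": sorted(prod_names - dr_names),
        # Still in the blueprint but no longer in production.
        "stale_in_dr": sorted(dr_names - prod_names),
        # Present in both, but the configuration has changed.
        "changed": sorted(
            name for name in prod_names & dr_names
            if fingerprint(production[name]) != fingerprint(dr_blueprint[name])
        ),
    }
```

A non-empty result from a check like this is a reasonable trigger for refreshing the blueprint and rerunning the DR test.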


To learn more about cloud native technology innovation, join us at KubeCon + CloudNativeCon Europe 2021 - Virtual, which will take place from May 4-7.    


Faiz Khan Founder/CEO, Wanclouds


Prior to founding Wanclouds, Faiz was an executive at Cisco, where he held multiple technology leadership roles. His most recent assignment there was leading the Global Cloud automation and orchestration organization. Prior to that, he built the Global Datacenter and cloud practice and was the GM for the Emerging Markets Technology Practices Organization.

Published Thursday, April 22, 2021 7:35 AM by David Marshall