Article Written by Lei Yang, Sr. Product & Solutions Marketing Manager, Tintri
Think cyber attacks are a distant threat? The unprecedented global spread
of WannaCry should have provided ample proof of just how dangerous and
costly they can be. The attack affected more than 300,000 Windows PCs at
organizations in more than 150 countries, hitting high-profile
organizations such as the UK's National Health Service, the Ministry of
Internal Affairs in Russia, FedEx, Nissan and Hitachi.
Such a high-profile incident should serve as a wake-up call for
organizations and spur them to investigate whether their current data
protection and disaster recovery (DP/DR) strategy is ready to handle
this level of risk. In simple terms, businesses need to ensure that in
the event of such an attack, they are well-positioned to get back up and
running as quickly as possible, minimize any data loss or disruption to
the customer experience, and protect against future infection.
Getting Backup Right
In the event of an attack, the first port of call in trying to rescue
the situation is the company's data backup. Assuming the organization
has the proper protections in place to safeguard the backup from being
destroyed by the ransomware, it needs to ensure that the backup regime
is fit for purpose. For example, it may only be in the aftermath of a
ransomware attack that a company discovers the last good backup is as
much as 24 hours old because it has set a daily recovery point objective
(RPO). In addition, retrieving that data may take so long that the
recovery time, which the recovery time objective (RTO) is meant to cap,
could be as much as two days. Hardly ideal.
So what measures should organizations take to reduce the RPO and RTO of
their backup regimes? One solution would be to look for backup products
that provide snapshots for point-in-time recovery, locally and remotely,
through native replication. This could reduce RPO to as little as 15
minutes with asynchronous replication, or achieve zero RPO and near-zero
RTO with synchronous replication. For virtual machines (VMs), it should
be possible to rapidly restore the OS to the last usable point-in-time,
reducing the downtime caused by affected VMs.
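To make the difference between a daily RPO and a 15-minute RPO concrete, the following is a minimal sketch that computes how much data would be lost for a given snapshot interval. The function names, the fixed-interval schedule and the timestamps are all assumptions made for illustration; they do not describe any particular product.

```python
from datetime import datetime, timedelta


def latest_recovery_point(attack_time, first_snapshot, interval):
    """Return the most recent scheduled snapshot at or before attack_time.

    Assumes (hypothetically) that snapshots are taken at a fixed interval
    starting from first_snapshot and that none of them fail.
    """
    if attack_time < first_snapshot:
        raise ValueError("no snapshot exists before the attack")
    # Dividing one timedelta by another with // gives a whole number of intervals.
    completed = (attack_time - first_snapshot) // interval
    return first_snapshot + completed * interval


def data_loss_window(attack_time, first_snapshot, interval):
    """Time between the last usable snapshot and the attack."""
    return attack_time - latest_recovery_point(attack_time, first_snapshot, interval)


# Illustrative timestamps only: schedule starts at midnight, attack at 23:50.
schedule_start = datetime(2017, 5, 12, 0, 0)
attack = datetime(2017, 5, 12, 23, 50)

daily_loss = data_loss_window(attack, schedule_start, timedelta(days=1))
async_loss = data_loss_window(attack, schedule_start, timedelta(minutes=15))

print("daily backup loses:", daily_loss)
print("15-minute snapshots lose:", async_loss)
```

In this made-up scenario the daily schedule loses almost a full day of changes, while the 15-minute schedule loses only the minutes since the last snapshot.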
The Right Recovery
In most LUN-based storage, the best an organization can achieve is to
recover all the applications on the same LUN to the same point-in-time.
But as ransomware typically comes in waves and spreads throughout
systems over time, the rate of infection for different applications on
the same LUN can vary widely. For instance, the first attack might only
affect 10% of VMs on the LUN, an equal number might be hit in a second
attack several hours later and the rest might be completely unaffected.
Recovering the entire LUN means every VM on it is rolled back to the
point-in-time needed for the earliest affected VMs. This results in a
heavy-handed approach where 90% of VMs are unnecessarily recovered to
the earlier point-in-time for the sake of the 10% that were hit in the
first attack.
What is required is a solution that can recover on a per-VM basis,
providing granular recovery for VMs with different levels of infection.
This enables organizations to restore each affected VM to the right
point-in-time for it.
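The contrast between LUN-wide and per-VM recovery can be sketched in a few lines. This is only an illustration under stated assumptions: the VM names, the hourly snapshot times and the infection times are invented, and the helper functions are hypothetical rather than part of any storage API.

```python
from datetime import datetime


def last_clean_snapshot(snapshots, infected_at):
    """Return the newest snapshot taken strictly before the infection time."""
    clean = [s for s in snapshots if s < infected_at]
    if not clean:
        raise ValueError("no snapshot predates the infection")
    return max(clean)


def per_vm_plan(snapshots, infections):
    """Restore only infected VMs, each to its own last clean snapshot."""
    return {
        vm: last_clean_snapshot(snapshots, hit)
        for vm, hit in infections.items()
        if hit is not None  # unaffected VMs are left alone
    }


def lun_wide_plan(snapshots, infections):
    """Roll every VM on the LUN back to the point needed by the earliest victim."""
    hits = [hit for hit in infections.values() if hit is not None]
    if not hits:
        return {}
    point = last_clean_snapshot(snapshots, min(hits))
    return {vm: point for vm in infections}


# Hypothetical hourly snapshots from 08:00 to 14:00.
snapshots = [datetime(2017, 5, 12, h, 0) for h in range(8, 15)]
# Hypothetical infection times; None means the VM was never hit.
infections = {
    "vm-a": datetime(2017, 5, 12, 9, 30),   # first wave
    "vm-b": datetime(2017, 5, 12, 13, 10),  # second wave, hours later
    "vm-c": None,                           # unaffected
}

per_vm = per_vm_plan(snapshots, infections)
lun_wide = lun_wide_plan(snapshots, infections)
```

With these inputs the per-VM plan restores vm-a to 09:00 and vm-b to 13:00 and leaves vm-c untouched, while the LUN-wide plan sends all three back to 09:00.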
Enable Time Travel
The other issue to be aware of is that most backups limit businesses to
restoring to a specific point in time. Essentially, they are given a
one-way ticket to the specified backup point and lose the ability to
restore to any snapshots taken after that point. With ransomware, it can
be hard for organizations to pinpoint the moment when the attacks
started to affect their VMs. They may end up restoring to a point well
before they were infected by the ransomware, losing clean data
unnecessarily. With more modern storage systems, it is possible to move
back and forth between recovery points to gain a more accurate view of
when VMs were affected and restore more precisely.
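One way to picture moving back and forth between recovery points is as a search for the newest clean snapshot. The sketch below uses a binary search and rests on two assumptions that are not from the article: that a VM stays infected in every snapshot after the first infected one, and that some check (the hypothetical is_clean callback here) can tell whether a restored snapshot is clean.

```python
def find_last_clean(snapshots, is_clean):
    """Return the index of the newest clean snapshot, or None if all are infected.

    snapshots must be ordered oldest to newest. is_clean is a caller-supplied
    check (hypothetical here) that inspects one restored snapshot.
    """
    lo, hi = 0, len(snapshots) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_clean(snapshots[mid]):
            best = mid       # clean: remember it and try a later snapshot
            lo = mid + 1
        else:
            hi = mid - 1     # infected: step back to an earlier snapshot
    return best


# Ten made-up snapshots; pretend the infection first shows up in snap-06.
snapshots = [f"snap-{i:02d}" for i in range(10)]
inspected = []


def is_clean(name):
    inspected.append(name)   # record which recovery points were visited
    return name < "snap-06"


index = find_last_clean(snapshots, is_clean)
print("restore to:", snapshots[index])
print("snapshots inspected:", inspected)
```

The search only works if restoring to an earlier point does not discard the later ones, which is exactly the one-way-ticket limitation described above.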
If organizations deploy a modern and effective data backup and recovery
strategy that achieves lower RPO and RTO, ensures faster recovery and
gets services up and running with VM-level granularity, there is no
reason why they should WannaCry after a ransomware attack.