In a recent analyst survey, 76% of respondents reported an incident during the past two years that required an IT disaster recovery (DR) plan, while more than 50% reported at least two incidents [1]. Further, the average cost of a data breach reached $4.24 million per incident in 2021, the highest in the 17-year history of the report [2]. It is therefore not surprising that DR is moving up the list of IT priorities.
Disaster recovery planning has two sides. The first is prevention. Prevention activities tend to receive the most focus from sponsors and can include architecting for resilience, removal of single points of failure, security management and operational controls. The second is recovery. However much time and effort is spent on tuning the preventive areas, a residual risk of failure always remains, and it is at this point that disaster recovery fulfils its role as the solution of last resort. With this in mind, disaster recovery is arguably the most important capability to implement early and to re-evaluate first as business needs evolve.
Key use cases
Following the move to remote working during the COVID-19 pandemic, business continuity and disaster recovery plans are coming under closer scrutiny from boards and auditors than before. There are multiple reasons why businesses might revisit their current DR strategy in favour of Disaster Recovery as a Service (DRaaS). The main use cases are:
Modernise: Replace or refresh an existing DR solution
This is for organisations that are struggling with expensive and complex DR solutions. It also suits organisations that already have an on-premises DR solution but use it to protect only a few workloads. With DRaaS, these customers can protect the rest of their workloads in the cloud while keeping their existing DR plans unchanged.
Some organisations have a mandate to reduce their on-premises footprint or “move to the cloud”, and DRaaS is a natural way to move the on-premises DR site to the cloud.
Optimise: Driving DR cost optimisation
Not all applications are created equal, and SLA requirements may differ depending on the tier of application. Organisations need to ensure that their DR costs are optimised and that SLAs are tuned to match the application requirements. Cloud disaster recovery allows you to use cloud economics to deliver on-demand failover capacity for DR needs rather than investing in a data centre.
Restore: Avoid data loss
DR between different cloud regions – Even the largest public clouds have outages, so DR is also relevant to customers who run applications in the cloud. With a DRaaS solution, these customers can protect their applications across different cloud regions.
Ransomware recovery – Ransomware protection needs to be company-wide, ranging from best practices for employees to having a tried and tested recovery plan should disaster occur.
Challenges of Disaster Recovery
Organisations may be aware that they need a disaster recovery strategy refresh, but implementing this is complex. Because data and applications are critical assets, organisations commit significant resources to make them highly available, including preparing for a full site failure by creating a disaster recovery plan.
Setting up a comprehensive DR solution is intricate and expensive, and the result is often unreliable. Solutions frequently require significant and time-consuming manual effort. Furthermore, as applications evolve and data grows, organisations run into challenges scaling their DR solutions and ensuring their reliability.
69% of IT decision makers lack confidence that they could reliably recover all business-critical data in the event of a cyber attack [3].
Disaster Recovery solutions
There are a variety of DR solutions on the market:
1. Data backup only
These solutions replicate organisations’ data to a second on-premises site or to the cloud.
However, they leave organisations exposed to long downtime during disasters, since applications have no infrastructure to run on. They also do not include simple DR testing, and they require extensive manual work once infrastructure is acquired.
2. Automated DR to an on-premises site or to a co-location
Solutions of this kind reduce the amount of manual effort, but still require high capital investment in physical premises, hardware, and software that are rarely used. Additionally, they are more difficult to scale.
3. Automated DR to data centres owned by DRaaS providers
These solutions provide most of the benefits of automated DR to an on-premises site and have a better cost structure that reflects the infrequent use of the DR target. Customers of these solutions should assess the reliability of the DR infrastructure and the financial stability of the DRaaS provider.
4. Automated DR to a global mega-cloud
These solutions provide most of the benefits of automated DR to an on-premises site, and have a better cost structure that reflects the infrequent use of the DR target. Customers of these solutions benefit from reduced risk thanks to the reliable infrastructure, global availability, and financial stability of the mega-cloud provider. However, some of these solutions require re-platforming of customers’ applications.
Key factors to take into account when considering a Disaster Recovery requirement
1. Identifying the RTOs required by different applications
While organisations can certainly protect all their applications, doing so can be very expensive. Instead, organisations should categorise their applications based on their Recovery Time Objectives (RTOs). An RTO is the acceptable amount of waiting time before an application comes back online. Some DRaaS solutions offer RTOs measured in minutes, others in hours, and others in days.
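The categorisation described above can be sketched in a few lines of Python. This is a minimal illustration only: the tier boundaries (one hour and 24 hours), the tier names and the example applications are hypothetical assumptions, not figures from any vendor or standard, and should be replaced with values from your own business impact analysis.

```python
from dataclasses import dataclass

# Hypothetical tier boundaries in hours (assumption for illustration only).
TIER_LIMITS = [(1, "tier 1 (minutes)"), (24, "tier 2 (hours)")]
DEFAULT_TIER = "tier 3 (days)"


@dataclass
class Application:
    name: str
    rto_hours: float  # acceptable waiting time before the application is back online


def assign_tier(app: Application) -> str:
    """Return the first tier whose upper limit covers the application's RTO."""
    for limit_hours, tier in TIER_LIMITS:
        if app.rto_hours <= limit_hours:
            return tier
    return DEFAULT_TIER


# Hypothetical application inventory.
apps = [
    Application("payments", 0.25),
    Application("intranet", 8),
    Application("archive", 72),
]
tiers = {app.name: assign_tier(app) for app in apps}
print(tiers)
```

Grouping applications this way makes it easier to match each tier to a DR solution with a suitable RTO, rather than paying for the fastest recovery everywhere.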
2. Determining if the service offers failover automation and orchestration
It is relatively easy to back up data to the cloud. However, relying on backup alone exposes organisations to significant risks in the case of a disaster. If only data is copied to the cloud, organisations are left with the task of setting up a full environment: spinning up compute instances, moving data to the right cloud storage service, and setting up networking. Many of these tasks are highly manual and require significant time to execute. For applications with an RTO of two days or more, that is not an issue. However, for revenue-generating applications, that is typically too long.
For more critical applications, organisations should choose cloud-based services that offer DR failover orchestration and automation. Such services deploy a DR environment based on a pre-defined runbook. They spin up the required nodes, power on VMs in the correct sequence according to the right dependencies, run scripts, and map IP networks automatically, all with very little human intervention. This ensures that critical applications can be powered up in time and minimises any business impact of a disaster.
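To illustrate what “powering on VMs in the correct sequence according to the right dependencies” involves, the sketch below orders VMs with a topological sort from Python's standard library (`graphlib`, Python 3.9 or later). It is not how any particular DRaaS product implements its runbooks; the VM names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter


def power_on_order(dependencies):
    """Return VM names so that every VM comes after the VMs it depends on.

    `dependencies` maps each VM to the set of VMs that must be running first.
    graphlib raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical three-tier application: web needs app, app needs the database.
runbook_dependencies = {
    "web-01": {"app-01"},
    "app-01": {"db-01"},
    "db-01": set(),
}

order = power_on_order(runbook_dependencies)
print(order)  # ['db-01', 'app-01', 'web-01']
```

A real runbook would attach further steps to each VM, such as scripts and IP mapping, but the ordering problem underneath is the same.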
3. Evaluating the complexity of re-platforming applications
Modern microservices-based applications are typically agnostic when it comes to which public cloud they run on. However, traditional applications, which are still very dominant in many organisations, are typically deployed as VMs. Different hypervisors have different VM formats, and many public clouds do not use the same VM formats as organisations’ on-premises VMs. For applications written and deployed on one hypervisor to run on another, the VMs have to be re-platformed. This is typically a long and complicated process that can take many months. More importantly, during the re-platforming process, an organisation’s applications are not protected in the case of a disaster.
4. The need to run non-disruptive DR tests
Creating a DR plan is not a one-time activity. Data centres are not static – existing applications get updated or replaced, and more applications are added over time. This results in drift between an organisation’s original DR plan and the plan it would actually need to protect its current applications. To make sure this drift does not go unnoticed, organisations need to test their DR plan often, with best practices suggesting at least once per quarter. Since these tests are not real disasters, they should not affect an organisation’s running applications. In other words, these tests need to be non-disruptive.
Furthermore, some organisations are required by law to perform DR tests and to present the results in an audit. A good DRaaS solution should offer customers extensive non-disruptive testing and provide detailed reports generated by these tests.
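The drift between a DR plan and the live environment can be made concrete with a simple comparison. The sketch below is a minimal example, assuming you can export two lists of VM names: the current inventory and the VMs covered by the DR plan. The function name and the sample data are hypothetical and are not tied to any product's API.

```python
def plan_drift(inventory, protected):
    """Compare the live inventory with the VMs covered by the DR plan."""
    inventory, protected = set(inventory), set(protected)
    return {
        # Running in production but missing from the DR plan.
        "unprotected": sorted(inventory - protected),
        # Still in the DR plan but no longer in production.
        "stale": sorted(protected - inventory),
    }


# Hypothetical exports taken before a quarterly DR test.
current_inventory = ["crm-01", "erp-01", "erp-02", "web-01"]
dr_plan_members = ["crm-01", "erp-01", "legacy-01", "web-01"]

drift = plan_drift(current_inventory, dr_plan_members)
print(drift)  # {'unprotected': ['erp-02'], 'stale': ['legacy-01']}
```

A check like this only detects missing or obsolete entries. It does not replace a full non-disruptive failover test, which also validates boot order, networking and application health.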
5. Ensuring the reliability of the DR site infrastructure
Finally, many vendors offer Disaster Recovery as a Service, but their scale and sophistication vary. Many lack the scale, reliability, financial stability, and global availability of the major cloud providers. Since organisations need to rely on DR solutions at critical times, when their main data centres are down, the reliability of the DR infrastructure is a key factor to consider.
VMware Cloud Disaster Recovery is VMware’s DRaaS solution. It spins up VMware Cloud on AWS infrastructure only during DR testing or a failover event, and it uses a highly efficient cloud storage layer for storing backups, which lowers DR costs. Failbacks result in minimal AWS egress charges because only changed data (deltas) is transferred.
Conclusion
As organisations turn to public clouds for Disaster Recovery as a Service, they should consider the various factors surrounding their DR strategy. A robust DR offering should be able to provide the RTOs needed for business-critical applications. It should also offer orchestration and automation of the failover process, and non-disruptive testing. Ideally, it should do all this without the need to re-platform applications, and it should run on top of a reliable public cloud.
FAQs
Which VMware solution is best for my business’s disaster recovery needs?
VMware provides several solutions to meet your DR needs. You can pick the one that best matches the criticality, RTO, and RPO requirements of each of your applications and workloads, and that meets your organisational policies.
- Site Recovery Manager (SRM) is a powerful solution for organisations who want to utilise a secondary data centre as their DR site.
- VMware Cloud Disaster Recovery offers on-demand DRaaS to protect a broad set of IT services in a cost-efficient manner, with fast recovery capabilities.
- VMware Site Recovery delivers hot DRaaS for mission critical IT services that require very low RPO and RTO and all the benefits of vSphere Replication and SRM.
What is VMware Cloud Disaster Recovery?
What is Site Recovery Manager?
Site Recovery Manager (SRM) is the industry-leading disaster recovery (DR) management solution, designed to minimise downtime in case of a disaster. It provides policy-based management, automated orchestration, and non-disruptive testing of centralised recovery plans. It is designed for virtual machines and scalable to manage all applications in a vSphere environment.
What is the difference between VMware Cloud Disaster Recovery, VMware Site Recovery, and VMware Site Recovery Manager?
VMware Cloud Disaster Recovery is a Disaster Recovery as a Service (DRaaS) solution that can be used to cost-effectively protect a broad set of your virtualised applications, with fast recovery capabilities. VMware Site Recovery is also a DRaaS solution, used to protect mission-critical applications that require a very low RPO and RTO. VMware Site Recovery Manager is an enterprise software solution, deployed and managed by you in your data centre, that facilitates DR protection to a secondary DR data centre that you manage yourself.
Can I use multiple VMware Disaster Recovery solutions concurrently?
Yes, you can use multiple disaster recovery solutions to protect all of your workloads. Each can be used to protect a different set of your IT applications.
Why are organisations increasingly choosing Disaster Recovery as a Service (DRaaS)?
DRaaS addresses the most common challenges of DR – complexity, high cost, and poor reliability – without the need to manage and maintain your own DR site. VMware Cloud Disaster Recovery is VMware’s newest DR solution for this. It uses cost-efficient cloud storage, offers easy-to-use SaaS-based management, includes continuous DR health checks and audit reports, and gives customers a pay-as-you-need capacity model.
[1] Gartner, Inc. “Survey Analysis: IT Disaster Recovery Trends and Benchmarks.” Jerry Rozeman, Ron Blair. April 30, 2020
[2] IBM. “IBM Report: Cost of a Data Breach Hits Record High During Pandemic.” July 28, 2021. https://newsroom.ibm.com/2021-07-28-IBM-Report-Cost-of-a-Data-Breach-Hits-Record-High-During-Pandemic
[3] Global Data Protection Index Survey, 2020 Snapshot