Tag Archives: Recovery

Disaster Recovery in the Public Cloud


Find out about the options for building highly available environments using public cloud providers, along with the benefits and tradeoffs.

I’ve had the opportunity to speak with many users about their plans for public cloud adoption; these discussions frequently revolve around how to avoid being impacted by potential cloud outages. Questions come up because public cloud outages do occur, even though they happen less frequently now than they may have in the past, and customers are concerned about mitigating the risk of disruption.

Thankfully, every major public cloud vendor offers options for building highly available environments that can survive some type of outage. AWS, for example, suggests four options that leverage multiple geographic regions. These options, which are also available with the other public cloud vendors, come with different price points and deliver different recovery point objectives (RPO) and different recovery time objectives (RTO).

 

Companies can choose the option that best meets their RPO/RTO requirements and budget. The key takeaway is that public cloud providers enable customers to build highly available solutions on their global infrastructure.

Let’s take a brief look at these options and review some basic principles for building highly available environments using the public cloud. I’ll use AWS for my examples, but the principles apply across all public cloud providers.

First, understand the RPO and RTO for each of your applications so you can design the right solution for each use case. Second, there’s no one-size-fits-all solution for leveraging multiple geographic regions. There are different approaches you can take depending on your RPO, RTO, budget, and the tradeoffs you are willing to make. Some of these approaches, using AWS as the example, include:

  • Recovering to another region from backups – Back up your environment to S3, including EBS snapshots, RDS snapshots, AMIs, and regular file backups. Since S3 only replicates data to availability zones within a single region by default, you’ll need to enable cross-region replication to your DR region. You’ll incur the cost of transferring and storing data in a second region, but won’t incur compute, EBS, or database costs until you need to go live in your DR region. The tradeoff is the time required to launch your applications (see the sketch after this list).
  • Warm standby in another region – Replicate data to a second region where you’ll run a scaled-down version of your production environment. The scaled-down environment is always live and sized to run the minimal capacity needed to resume business. Use Route 53 to switch over to your DR region as needed. Scale up the environment to full capacity as needed. With this option, you get faster recovery, but incur higher costs.
  • Hot standby in another region – Replicate data to a second region where you run a full version of your production environment. The environment is always live, and invoking full DR involves switching traffic over using Route 53. You get even faster recovery, but also incur even higher costs.
  • Multi-region active/active solution – Data is synchronized between both regions and both regions are used to service requests. This is the most complex to set up and the most expensive. However, little or no downtime is suffered even when an entire region fails. While the approaches above are really DR solutions, this one is about building a true highly available solution.
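
To make the first option concrete, here is a minimal sketch using Python and boto3 that enables S3 cross-region replication and copies EBS and RDS snapshots into a DR region. The bucket names, IAM role ARN, snapshot identifiers, and regions are all hypothetical placeholders; a real pipeline would add error handling, scheduling, and tagging.

import boto3

PRIMARY_REGION = "us-east-1"
DR_REGION = "us-west-2"

# 1. Enable S3 cross-region replication so backups land in the DR region.
#    (The source bucket must have versioning enabled.)
s3 = boto3.client("s3", region_name=PRIMARY_REGION)
s3.put_bucket_replication(
    Bucket="example-backups",  # hypothetical source bucket
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/example-s3-replication",  # placeholder IAM role
        "Rules": [{
            "Status": "Enabled",
            "Prefix": "",  # replicate every object
            "Destination": {"Bucket": "arn:aws:s3:::example-backups-dr"},  # hypothetical DR bucket
        }],
    },
)

# 2. Copy an EBS snapshot into the DR region so it can be restored there.
ec2_dr = boto3.client("ec2", region_name=DR_REGION)
ec2_dr.copy_snapshot(
    SourceRegion=PRIMARY_REGION,
    SourceSnapshotId="snap-0123456789abcdef0",  # placeholder snapshot ID
    Description="Nightly DR copy",
)

# 3. Copy an RDS snapshot into the DR region as well.
rds_dr = boto3.client("rds", region_name=DR_REGION)
rds_dr.copy_db_snapshot(
    SourceDBSnapshotIdentifier="arn:aws:rds:us-east-1:123456789012:snapshot:example-nightly",
    TargetDBSnapshotIdentifier="example-nightly-dr",
    SourceRegion=PRIMARY_REGION,
)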

One of the keys to a successful multi-region setup and DR process is to automate as much as possible. This includes backups, replication, and launching your applications. Leverage automation tools such as Ansible and Terraform to capture the state of your environment and to automate the launching of resources. Also, test repeatedly to ensure that you’re able to successfully recover from an availability zone or region failure. Test not only your tools, but also your processes.
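
As an illustration of the switchover step, the sketch below uses boto3 to repoint a DNS record at the DR region through Route 53. The hosted zone ID, record name, and DR endpoint are hypothetical; in production you would more likely rely on Route 53 failover routing policies backed by health checks so the switch happens automatically.

import boto3

route53 = boto3.client("route53")

# Repoint the application's DNS record at the DR region's endpoint.
route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",  # placeholder hosted zone ID
    ChangeBatch={
        "Comment": "Fail over to DR region",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "app.example.com",  # hypothetical record
                "Type": "CNAME",
                "TTL": 60,  # short TTL so the change propagates quickly
                "ResourceRecords": [{"Value": "dr-alb.us-west-2.elb.amazonaws.com"}],  # placeholder DR endpoint
            },
        }],
    },
)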

Obviously, much more can be said on this topic. If you are interested in learning more about disaster recovery in the cloud, you can see me in person at the upcoming Interop ITX 2018 in Las Vegas, where I will present, “Saving Your Bacon with the Cloud When Your Data Center Is on Fire.” 


 





5 Hot Enterprise Backup and Recovery Vendors


The backup and recovery market has become a crowded space, with hundreds of vendors vying for market share. At the higher end of the market, the enterprise data center segment, the bar to entry is steeper, and just a handful of software vendors command most of the sales.

With most tape drive vendors exiting the market, support for other backup media has become essential to maintaining a vendor’s business. Most initially pushed for hard disk-based backup, but the latest trend is to offer cloud storage solutions as well.

In what had become a somewhat stale and undifferentiated market, both HDD/SSD and cloud opened up new opportunities and something of a “space race” has occurred in the industry over the last few years. Backup and recovery vendors have added compression and deduplication, which can radically reduce the size of a typical backup image. This is important when data is moved to a remote storage site via WAN links, since these have lagged well behind compute horsepower and LAN bandwidth.
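
As a rough illustration of why deduplication shrinks backup images, the toy sketch below splits files into fixed-size chunks and stores each unique chunk only once. Commercial products use variable-size chunking, persistent indexes, and compression on top; the chunk size and function names here are purely illustrative.

import hashlib

CHUNK_SIZE = 4096  # arbitrary fixed chunk size, for illustration only

def dedup_backup(paths):
    """Split files into chunks and store each unique chunk once."""
    store = {}    # chunk hash -> chunk bytes, stored a single time
    recipes = {}  # file path -> ordered list of chunk hashes to rebuild the file
    for path in paths:
        hashes = []
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK_SIZE):
                digest = hashlib.sha256(chunk).hexdigest()
                store.setdefault(digest, chunk)  # duplicate chunks are not stored again
                hashes.append(digest)
        recipes[path] = hashes
    return store, recipes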

Many backup and recovery packages create a backup gateway that stores the backup at LAN speeds and then sends it off across the WAN at a more leisurely pace. The benefit is a reduced backup window, though with some risk of data loss if the backup is corrupted before the move to the remote site completes.

Today, the target of choice for backup data is the cloud. It’s secure, highly scalable, and new low-traffic storage services cost very little to rent. The backup gateway encrypts all data, so backups are well protected against tampering, though not necessarily against deletion; guarding against deletion requires the cloud service provider to offer storage types with only a well-protected manual deletion path.

Continuous data protection (CDP) is one of the hot backup services today; it manifests as either server-side snapshots or high-frequency polling by backup software for changed objects. Using these approaches reduces the data loss window, though it can hurt performance. SSDs help solve most of the performance issues, but daytime WAN traffic will increase.
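
The polling flavor of CDP can be pictured with a small sketch like the one below, which rescans a directory tree on an interval and reports files whose modification time changed. This is only a conceptual illustration; real CDP products hook filesystem change journals or storage snapshots rather than walking the tree, and the function name here is hypothetical.

import os
import time

def poll_for_changes(root, interval=5):
    """Yield file paths whose modification time changed since the last scan."""
    last_seen = {}
    while True:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    mtime = os.path.getmtime(path)
                except OSError:
                    continue  # file vanished between listing and stat
                if last_seen.get(path) != mtime:
                    last_seen[path] = mtime
                    yield path  # hand the changed file to the backup engine
        time.sleep(interval)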

Noting that access to backup storage tends to occur within just a few hours of the backup itself, some of the newcomers to the space offer a caching function, where data already moved to the remote site is held in the backup gateway for a couple of days. This speeds recovery of cached files.

With applications such as Salesforce, MS Office, and Exchange common in the enterprise, optimization capabilities that enable backup without disrupting operations are common features among the main players in data center backup. Many vendors also now offer backup for virtual machines and their contents, and container backup will no doubt become common as well.

There is a school of thought that says that continuous snapshots, with replicas stored in the cloud, solve both backup and disaster recovery requirements, but there are issues with this concept of perpetual storage, not least of which is that a hacker could delete both primary data and the backups. Not paying your cloud invoice on time can do that, too! The idea is attractive, however, since license fees for software mostly disappear.

Readers are likely familiar with “old-guard” established backup and recovery vendors such as Veritas, Commvault, Dell EMC, and IBM. In this slideshow, we look at five up-and-coming vendors, in alphabetical order, that are driving innovation in enterprise backup and recovery.





5 Disaster Recovery Tips: Learning from Hurricanes


Hurricanes Irma and Harvey highlight the need for DR planning to ensure business continuity.

 

This has been an awful year for natural disasters, and yet we’re not even midway through a hurricane season that’s been particularly devastating. Hurricanes Irma and Harvey, and the flooding that ensued, have resulted in loss of life, extensive property damage, and crippled infrastructure.

Naturally, businesses have also been impacted. When it comes to applications, data and data centers, this is a wake-up call. At the same time, these are situations that motivate companies and individuals to introduce much-needed change. With this in mind, I’ll offer five tips any IT organization can use to become more resilient against natural disaster, no matter the characteristics of their systems and data centers. This can lead to better availability of critical data and tools when disaster strikes, continuity in serving customers, as well as peace of mind knowing preparations have been made and work can continue as expected.

1. Keep your people safe

When a natural disaster is anticipated (if there is notice), IT staffers need to focus on personal and family safety issues. Having to work late to take one more backup off-site shouldn’t be part of the last-minute process. Simply put, no data is worth putting lives at risk. If the rest of these tips are followed, IT staff won’t have to scramble in the heavy push of preparation to tie up loose ends of what already should be a resilient IT strategy.

2. Follow the 3-2-1 rule

In my role, I’ve long advocated the 3-2-1 rule, and we need to keep reiterating it: Have three different copies of important data saved, on two different media, one of these being off-site. Embrace this rule if you haven’t already. There are two additional key benefits of the 3-2-1 rule: It doesn’t require any specific technology and can address nearly any failure scenario.

3. 10 miles may not be enough

My third tip pertains to the off-site recommendation above. Many organizations believe the off-site copy or disaster recovery facility should be at least 10 miles away. This may no longer be sufficient; the path and fallout of a hurricane can be wide-reaching. Moreover, you want to avoid having personnel spend unnecessary time in a car traveling to complete the IT work. Cloud technologies can provide a more efficient and safer solution. This can involve using disaster recovery as a service (DRaaS) from a service provider or simply putting backups in the cloud.

4. Test your DR plan

Ensure that when a disaster plan is created there is particular focus on anticipating and eliminating surprises. This should involve regular testing of backups to be certain they are completely recoverable, that the plan will function as expected, and that all data is where it needs to be (off-site, for example). The last thing you want during a disaster is to find that the plan hasn’t been completely implemented or run in months, or worse, to discover there are workloads that are not recoverable.

5. Communications planning

My final recommendation is to work backwards through all required systems and with providers of all types to ensure you don’t have risks you can’t fix. Pay close attention to geography in relation to your own facilities, as well as country locations for data sovereignty considerations. This can apply to telecommunications providers, too. A critical component of any disaster response is the ability to communicate. Given what has happened in some locations in the path of Hurricane Irma, even cellular communication can be unreliable. Consider developing a plan to ensure interim communications if key business systems are down.

The recent flood and hurricane damage has been significant. The truth is, when it comes to data, IT services, and more, there is a real risk a business may never recover if it’s not adequately prepared. We live in a digitally transformed world, and many businesses can’t operate without the availability of systems and data. These simple tips can bring about the resiliency companies need to effectively handle disasters, and prove their reliability to the customers they serve.

Rick Vanover is director of technical product marketing for Veeam Software.



